Chapter 3: Functions

Definition & Notation

A function is a relation between two sets, where each element of the first set (called the domain) is assigned to exactly one element of the second set (called the codomain). As illustrated below, a function can be thought of as an input/output device $f$ : for any given input $x$ , the output $y = f (x)$ is uniquely determined.

*A conceptual illustration of a function as a mapping from input to output: each input $x$ is processed by a function to produce a unique output $y = f (x)$ .*

We now provide a more formal definition of a function and introduce several related concepts.

Definition: A Function

*Illustration of a function as a mapping from elements in an input set (domain) to elements in an output set (codomain).*

A function $f$ is a rule that assigns to each input $x \in A$ exactly one output $y \in B$ . This relationship is often written as:

$f : A \to B$

In particular:

The set $A$ is called the domain of the function. It contains all possible valid inputs.
The set $B$ is called the codomain. It is the set into which all outputs are mapped.
The range (also called the image) of the function is the set of actual outputs the function produces based on its domain. It is a subset of the codomain:

$\operatorname{Range} (f) \subseteq B$

Terminology: Independent & Dependent Variable

When we use $x$ to denote the input and $y$ to denote the output associated with $x$ , $x$ is also referred to as the independent variable and $y$ as the dependent variable, because its value "depends on $x$ ".

A function always has a domain, which is the set of all inputs for which the function is defined. If no specific domain is stated for a function given by an equation, the default is typically the set of all real numbers that yield valid (usually real) outputs.

Functions are powerful tools for describing relationships between two quantities. Many real-world scenarios can be modeled using functions, where one variable depends on another. In this context, it is also important to understand a function’s domain, codomain, and range, as these concepts help clarify what kinds of inputs are valid, what types of outputs are expected, and what outputs actually occur.

Example 1: Area of a Square

The area of a square depends on the length of its side. If the side length is $s$ , the area $A$ is given by

$A (s) = s^{2}$

Domain: $s \in [0, \infty)$ , because side lengths cannot be negative.
Codomain: $R$ , since the function produces real-number outputs (areas measured in real units).
Range: $[0, \infty)$ , because squaring any non-negative number gives a non-negative result. The smallest value occurs at $s = 0$ , where $A (0) = 0$ , and as $s$ increases, $A (s)$ grows without bound.

Example 2: Temperature Over Time

The temperature at a given time of day can be expressed as a function of time. Suppose the temperature (in °C) follows the rule

$T (t) = 10 + 8 \sin (\frac{\pi t}{12})$

Domain: $t \in [0, 24]$ , because the model describes temperature over a single day (in hours).
Codomain: $R$ , since temperature values are real numbers.
Range: $[2, 18]$ , since the sine term $\sin (\frac{\pi t}{12})$ varies between $- 1$ and $1$ . This means $8 \sin (\cdot)$ varies between $- 8$ and $8$ , and adding $10$ shifts the range to $[2, 18]$ .
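The range stated in Example 2 can be checked numerically. The following is a minimal sketch that samples the model on its domain; the step size of a quarter hour and the function name `temperature` are illustrative choices, not part of the original example.

```python
import math

def temperature(t):
    """Temperature model T(t) = 10 + 8 sin(pi t / 12) from Example 2 (t in hours)."""
    return 10 + 8 * math.sin(math.pi * t / 12)

# Sample the domain [0, 24] every 0.25 h (an arbitrary resolution for illustration).
samples = [temperature(k / 4) for k in range(0, 24 * 4 + 1)]

lowest, highest = min(samples), max(samples)
print(lowest, highest)  # numerically close to 2 and 18
```

Sampling only suggests the range; the argument based on the bounds of the sine function is what establishes it.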

Example 3: Distance Traveled at Constant Speed

If a car travels at a constant speed of 60 km/h, the distance traveled after $t$ hours is given by

$D (t) = 60 t$

Domain: $t \in [0, \infty)$ , because time cannot be negative.
Codomain: $R$ , since distances are expressed as real numbers.
Range: $[0, \infty)$ , because multiplying a non-negative $t$ by 60 produces a non-negative result. The distance is $D (0) = 0$ at the start, and increases without bound as time increases.

Note: Choosing the Codomain

The codomain defines the set of values a function is declared to produce, while the range consists of the values that actually occur.

In many cases, the codomain is chosen to be the set of real numbers, $R$ , even when the outputs are only non-negative (as in Example 1). This convention keeps real-valued functions compatible with one another, i.e., it allows us to compare, combine, and apply the same rules (for example, addition, composition, or differentiation) without worrying about mismatched output sets.

In general, the codomain specifies the structure of the output space: it tells us what kind of values a function is expected to produce, while the range shows which of those values actually occur.

Representation Methods

Functions can be represented in several different ways, each offering different insights into the relationship they describe. Depending on the context, one representation may be more useful or informative than another.

To illustrate these representations, we will use a simplified example based on (synthetically generated) agriculture data. Let $f (x)$ denote the crop yield (in t/ha) as a function of fertilizer amount $x$ (in kg/ha). That is, we define:

$f (x) = \text{yield corresponding to fertilizer amount } x$

This example models a common real-world scenario where one quantity (crop yield) depends on another (fertilizer amount).

Tables

A table is one of the most straightforward ways to represent a function. This form is especially useful when working with data collected through observation or measurement. Essentially, a table just lists specific input values and their corresponding output values.

Fertilizer $x$ ( $kg/ha$ )	Crop Yield $f (x)$ ( $t/ha$ )
0	3.4942
1	3.5038
2	3.5133
3	3.5228
4	3.5322
$⋮$	$⋮$
197	4.1589
198	4.1559
199	4.1530
200	4.1500
201	4.1469
$⋮$	$⋮$
396	2.3319
397	2.3163
398	2.3008
399	2.2851
400	2.2694

In this table, each row shows a specific input value $x$ and the corresponding output value $f (x)$ . Tables are useful for answering discrete queries, such as: "What is the crop yield if given 200 kg/ha fertilizer?".

They can also help identify general trends in the data, which leads us to the following definitions.

Definition: Increasing on an Interval

We say that a function $f$ is increasing on an interval $I$ if for all $x_{1}, x_{2} \in I$ it holds that

$f (x_{1}) \leq f (x_{2}) \quad \text{whenever } x_{1} < x_{2}$

The function $f$ is said to be strictly increasing (note the inequality) when

$f (x_{1}) < f (x_{2}) \quad \text{whenever } x_{1} < x_{2}$

Definition: Decreasing on an Interval

We say that a function $f$ is decreasing on an interval $I$ if for all $x_{1}, x_{2} \in I$ it holds that

$f (x_{1}) \geq f (x_{2}) \quad \text{whenever } x_{1} < x_{2}$

The function $f$ is said to be strictly decreasing (note the inequality) when

$f (x_{1}) > f (x_{2}) \quad \text{whenever } x_{1} < x_{2}$

By applying these definitions and inspecting the table, we can observe that the crop yield $f (x)$ increases as the fertilizer amount $x$ increases - up to a certain point - and then decreases. However, beyond this general behavior, it is difficult to tell much more. The table alone does not reveal whether the relationship is simply linear, or follows a more complex curve. In particular, it does not clearly convey the rate at which the crop yield increases or whether this rate changes over the domain. For such insights, a graphical or algebraic representation is usually more informative.
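The definitions of strictly increasing and strictly decreasing can be applied directly to tabulated values. The sketch below uses the rows shown in the table for $x = 0, \dots, 4$ and $x = 197, \dots, 201$ ; the helper names are illustrative. Because the inputs are listed in increasing order, comparing neighbouring outputs is enough.

```python
def is_strictly_increasing(values):
    """True if each value is smaller than the next one."""
    return all(a < b for a, b in zip(values, values[1:]))

def is_strictly_decreasing(values):
    """True if each value is larger than the next one."""
    return all(a > b for a, b in zip(values, values[1:]))

# Rows copied from the table (inputs in increasing order).
yield_0_to_4 = [3.4942, 3.5038, 3.5133, 3.5228, 3.5322]
yield_197_to_201 = [4.1589, 4.1559, 4.1530, 4.1500, 4.1469]

print(is_strictly_increasing(yield_0_to_4))      # True
print(is_strictly_decreasing(yield_197_to_201))  # True
```

A check like this only covers the tabulated inputs; it says nothing about the behavior between them.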

Graphs

A visual picture of a function can be provided in the form of a graph. The graph of a function is the set of points $(x, y)$ plotted in a coordinate plane, where $y = f (x)$ for all $x$ in the domain of $f$ . Plotting data points from a table helps reveal the overall shape and behavior of the function, which may not be immediately apparent from a list of values alone.

From this graph, we can observe that the function increases with fertilizer ( $x$ ), to a point, but not linearly. The curve appears to flatten and then decrease more sharply, suggesting that the relationship between fertilizer ( $x$ ) and yield ( $f (x)$ ) is non-linear, possibly polynomial.

Algebraic Formulas

Often, we want more than just individual data points: we want a general rule that allows us to compute the output for any valid input. An algebraic formula provides a compact, symbolic way to describe the relationship between inputs and outputs.

The crop-yield table and graph shown above were in fact generated from the quadratic polynomial:

$f (x) = - 3.17042 \cdot 10^{- 5} \cdot x^{2} + 0.00961968 x + 3.49418$

This expression is not arbitrary; it was obtained from observed data using a method that fits a mathematical function to the measurements. In this case, the polynomial gives an approximation of the relationship between fertilizer amount and crop yield, smoothing out random variations while preserving the overall pattern seen in the data.

Having an algebraic representation allows us to carry out several useful analyses:

Interpolation: Estimate values between known data points.
Extrapolation: Predict behavior beyond the observed range, for instance, for very small or large fertilizer amounts ( $x$ ).
Equation solving: Find input values corresponding to specific outputs, for example solving $f (x) = 4.0$ to determine the fertilizer amounts that yield 4.0 t/ha. (A target above the maximum of $f$ , which is roughly 4.22 t/ha, has no solution.)
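These uses of the formula can be tried out directly. The sketch below evaluates the fitted quadratic and solves $f (x) = \text{target}$ with the quadratic formula; the names `crop_yield` and `fertilizer_for` are hypothetical, and the coefficients are the rounded ones printed above, so results agree with the table only to about three or four decimals.

```python
import math

# Rounded coefficients of f(x) = A x^2 + B x + C as printed in the text.
A, B, C = -3.17042e-5, 0.00961968, 3.49418

def crop_yield(x):
    """Yield (t/ha) predicted by the quadratic model for fertilizer x (kg/ha)."""
    return A * x**2 + B * x + C

def fertilizer_for(target):
    """Real solutions of f(x) = target, sorted; empty list if there are none."""
    disc = B**2 - 4 * A * (C - target)
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-B + root) / (2 * A), (-B - root) / (2 * A)])

print(crop_yield(200))       # close to the table entry 4.1500
print(crop_yield(200.5))     # interpolation between the rows x = 200 and x = 201
print(fertilizer_for(4.0))   # two fertilizer amounts, one on each side of the peak
print(fertilizer_for(4.5))   # [] because the model never reaches 4.5 t/ha
```

Because the parabola opens downward, a reachable target yield is generally attained at two fertilizer amounts; which one is meaningful depends on the application.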

More broadly, models based on algebraic formulas let us describe and explore real-world phenomena: how quantities change together, where growth slows or reverses, and how one variable influences another. Such models form the foundation of mathematical analysis, offering insight into underlying behavior.

To describe these relationships effectively, we must choose a suitable type of function and fit it to the data. In this example, the coefficients of the polynomial were determined using least squares regression, a method that finds the curve that best matches the observed data. Recognizing different classes of functions, such as linear, quadratic, cubic, or exponential, helps us select appropriate models and interpret the types of behavior they represent.

Basic Classes of Functions

Functions can be grouped into different classes based on their algebraic form. Each class has its own properties, domain and range, and characteristic graph shape. In this section, we focus on common basic function classes and describe their general forms and (graphical) behavior.

Before exploring specific types, it is useful to note three important features that appear frequently in graphs of functions:

Feature	Definition	Why It Matters
Intercepts	Points where the graph crosses the coordinate axes. $x$ -intercepts occur when $f (x) = 0$ , and the $y$ -intercept occurs when $x = 0$ .	Represent starting values, equilibrium states, or solutions to problems.
Turning Points	Points where the graph changes direction from increasing to decreasing, or vice versa.	Indicate local maxima or minima; used to identify peaks, troughs, or optimal conditions.
Asymptotes	Lines that the graph approaches but does not cross (or only crosses at infinity): horizontal, vertical, or oblique.	Describe long-term trends or limits in growth/behavior; mark boundaries.

Polynomial Functions

Polynomial functions are smooth, continuous curves with no sharp corners or breaks. Their general behavior depends on the degree and the leading coefficient.

Definition: Polynomial Function

*Examples of polynomial functions of different degrees, showing how the degree affects the shape and number of turning points of the graph.*

Polynomials belong to a broad class of functions that can be written in the general form:

$f (x) = a_{n} x^{n} + a_{n - 1} x^{n - 1} + \dots + a_{1} x + a_{0}$

where:

$n$ is a non-negative integer (the degree of the polynomial)
$a_{n}, a_{n - 1}, \dots, a_{0}$ are real constants
$a_{n} \neq 0$ if $n > 0$

Key characteristics:

Graph: smooth, continuous curve.
Intercepts: Up to $n$ real $x$ -intercepts; always one $y$ -intercept at $(0, a_{0})$ .
Domain: All real numbers ( $R$ )
Range: Depends on the degree and coefficients.
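The general form translates directly into a short evaluation routine. The sketch below uses Horner's scheme, a standard way to evaluate a polynomial with few multiplications; the function name and the convention of listing coefficients from $a_{n}$ down to $a_{0}$ are assumptions made for this illustration.

```python
def evaluate_polynomial(coefficients, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 with Horner's scheme.

    `coefficients` lists a_n first and a_0 last.
    """
    result = 0
    for a in coefficients:
        result = result * x + a
    return result

# x^2 - 4x + 4 at x = 3: 9 - 12 + 4
print(evaluate_polynomial([1, -4, 4], 3))  # 1

# At x = 0 only the constant term remains, giving the y-intercept (0, a_0).
print(evaluate_polynomial([1, -4, 4], 0))  # 4
```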

Terminology: Classification of Polynomials

Polynomials are commonly classified based on two characteristics:

The number of terms
The degree of the expression

The following tables summarize these classifications with corresponding examples.

Table 1. Classification of polynomials by the number of terms.

Number of Terms	Name	Example
$1$	Monomial	$5 x^{3}$
$2$	Binomial	$3 x^{2} + 1$
$3$	Trinomial	$x^{2} - 4 x + 4$
$\geq 4$	Polynomial	$x^{4} + x^{3} - 2 x + 7$

Table 2. Classification of polynomials by degree.

Degree	Name	Example
$0$	Constant	$7$
$1$	Linear	$2 x + 3$
$2$	Quadratic	$x^{2} - 4 x + 4$
$3$	Cubic	$x^{3} - x$
$4$	Quartic	$x^{4} + 2 x^{2} + 1$
$5$	Quintic	$x^{5} - x^{3} + 1$
$n \geq 6$	$n$ th-degree polynomial	$x^{6} + \dots$

Linear Functions

A linear function is a polynomial of degree $1$ and its graph is a straight line.

Definition: Linear Function

*Graphs of two linear functions. The first shows an increasing line ( $a > 0$ ), while the second shows a decreasing function ( $a < 0$ ).*

A linear function can be written in the general (slope-intercept) form:

$f (x) = a x + b$

where $a$ and $b$ are constants. If $a \neq 0$ , it is a polynomial of degree $1$ ; if $a = 0$ , it simplifies to $f (x) = b$ , which is a constant function (a polynomial of degree 0).

Key characteristics:

Graph: A straight line with slope $a$ .
- If $a > 0$ the function is increasing
- If $a < 0$ the function is decreasing
Intercepts:
- $y$ -intercept at point $(0, b)$
- $x$ -intercept at point $(- \frac{b}{a}, 0)$ , provided $a \neq 0$
Domain: All real numbers ( $R$ )
Range: All real numbers ( $R$ ) if $a \neq 0$ ; the single value $b$ if $a = 0$

Examples: Linear Functions

Determine which of the following functions are linear functions:

$f (x) = 3 x - 5$
$g (x) = 7$
$h (x) = 2 x^{2} + 1$

Answer: $f (x)$ is linear (degree 1), while $g (x)$ is a constant function (degree 0), and $h$ is quadratic (degree 2).

One of the defining characteristics of a line is its slope. The slope describes how a line rises or falls as we move along the $x$ -axis; in other words, it represents the rate of change in $y$ for each unit change in $x$ .

The slope measures both the steepness and the direction of a line:

If the slope is positive, the line points upward when moving from left to right
If the slope is negative, the line points downward when moving from left to right
If the slope is zero, the line is horizontal

To determine the slope numerically, we compare how much $y$ changes relative to $x$ . This comparison gives us the ratio of the change in $y$ to the change in $x$ , leading to the more formal definition below.

Definition: Slope of a Linear Function

Consider a line passing through points $(x_{1}, y_{1})$ and $(x_{2}, y_{2})$ . Let $Δ y = y_{2} - y_{1}$ and $Δ x = x_{2} - x_{1}$ denote the changes in $y$ and $x$ , respectively. The slope of the line is:

$m = \frac{y _{2} - y _{1}}{x _{2} - x _{1}} = \frac{Δ y}{Δ x}$

Now, let us explore how this definition relates to the formula of a linear function. Consider the function:

$f (x) = a x + b$

We already know that the graph of a linear function is a straight line. To find its slope, we can apply the definition above using any two points, i.e., $(x_{1}, y_{1})$ and $(x_{2}, y_{2})$ , on the line. In particular, let us evaluate the function at two convenient points:

When $x_{1} = 0$ , we have $y_{1} = f (0) = a \cdot 0 + b = b$ . This gives us the point: $(x_{1}, y_{1}) = (0, b)$
When $x_{2} = 1$ , we have $y_{2} = f (1) = a \cdot 1 + b = a + b$ . This gives us the point: $(x_{2}, y_{2}) = (1, a + b)$

Therefore, substituting the points into the formula for the slope, the slope of this line is:

$m = \frac{y _{2} - y _{1}}{x _{2} - x _{1}} = \frac{( a + b ) - b}{1 - 0} = \frac{a}{1} = a$

This shows that the coefficient $a$ in the function $f (x) = a x + b$ represents the slope of the line. Hence, every linear function of the form $f (x) = a x + b$ describes a line with slope $a$ and $y$ -intercept $b$ .
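The derivation can be checked numerically for a concrete line. The sketch below applies the slope definition to $f (x) = 3 x - 5$ (taken from the earlier examples) using two different pairs of points; the helper name `slope` is an illustrative choice.

```python
def slope(p1, p2):
    """Slope of the line through p1 = (x1, y1) and p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("slope is undefined when x1 == x2 (vertical line)")
    return (y2 - y1) / (x2 - x1)

def f(x):
    return 3 * x - 5

print(slope((0, f(0)), (1, f(1))))     # 3.0
print(slope((-2, f(-2)), (4, f(4))))   # 3.0
```

Both pairs of points give the coefficient $a = 3$ , which reflects that the slope of a line does not depend on which two points are chosen.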

This relationship will be revisited in Chapter 8, where the concept of slope forms the basis for defining differentiation.

Quadratic Functions

A quadratic function is a polynomial of degree $2$ ; its graph is a parabola.

Definition: Quadratic Function

*Graphs of three quadratic functions. The first two parabolas open upward ( $a > 0$ ), while the last opens downward $(a < 0)$ .*

A quadratic function can be written in the general form:

$f (x) = a x^{2} + b x + c$

where $a \neq 0$ .

Key characteristics:

Graph: A parabola.
- If $a > 0$ the parabola opens upward
- If $a < 0$ the parabola opens downward
Intercepts: Up to two $x$ -intercepts, and exactly one $y$ -intercept
Turning Point: The vertex of the parabola.
- If $a > 0$ it is the lowest point $(x_{\min}, y_{\min})$
- If $a < 0$ it is the highest point $(x_{\max}, y_{\max})$
Domain: All real numbers ( $R$ )
Range:
- If $a > 0$ it is $[y_{\min}, \infty)$
- If $a < 0$ it is $(- \infty, y_{\max}]$
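The turning point and range can be computed from the coefficients. The sketch below relies on the standard vertex formula $x = - \frac{b}{2 a}$ , which is not derived in this section and is used here as an assumption; the function names are illustrative.

```python
import math

def vertex(a, b, c):
    """Turning point of f(x) = a x^2 + b x + c, using x = -b / (2a)."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic function")
    x_v = -b / (2 * a)
    y_v = a * x_v**2 + b * x_v + c
    return x_v, y_v

def value_range(a, b, c):
    """Range of the quadratic as a (low, high) pair; the vertex value is attained."""
    _, y_v = vertex(a, b, c)
    return (y_v, math.inf) if a > 0 else (-math.inf, y_v)

print(vertex(1, -4, 4))        # (2.0, 0.0)
print(value_range(1, -4, 4))   # opens upward: from the minimum to infinity
print(value_range(-1, 0, 5))   # opens downward: from minus infinity to the maximum
```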

Examples: Quadratic Functions

Determine which of the following functions are quadratic functions:

$f (x) = 4 x^{2} - x + 7$
$g (x) = x (x - 5)$
$h (x) = 3 x^{3} - 2 x + 1$

Answer: $f (x)$ and $g (x)$ are quadratic. $h (x)$ is cubic.

Exponential Functions

Exponential functions have a constant base raised to a variable exponent.

Definition: Exponential Function

*Examples of exponential functions. The first two illustrate exponential growth ( $b > 1$ ), while the last shows exponential decay ( $0 < b < 1$ ).*

An exponential function can be written in the general form:

$f (x) = a b^{x}$

where $a \neq 0$ , $b > 0$ , and $b \neq 1$ .

Key characteristics:

Graph:
- If $b > 1$ the graph is increasing (growth), assuming $a > 0$
- If $0 < b < 1$ the graph is decreasing (decay), assuming $a > 0$
Asymptote: Horizontal at $y = 0$ .
Domain: All real numbers ( $R$ )
Range: $(0, \infty)$ if $a > 0$ , and $(- \infty, 0)$ if $a < 0$
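A defining feature of the form $f (x) = a b^{x}$ is that each unit step in $x$ multiplies the output by the base $b$ . The sketch below illustrates this for one growth and one decay function; the factory function `exponential` is a hypothetical helper.

```python
def exponential(a, b):
    """Return the function x -> a * b**x (assumes a != 0, b > 0, b != 1)."""
    return lambda x: a * b**x

growth = exponential(1, 2)     # f(x) = 2^x
decay = exponential(5, 0.5)    # h(x) = 5 * 0.5^x

for x in range(4):
    print(x, growth(x), decay(x))

# Ratio of consecutive outputs equals the base.
print(growth(3) / growth(2))   # 2.0
print(decay(3) / decay(2))     # 0.5
```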

Examples: Exponential Functions

Determine which of the following functions are exponential functions:

$f (x) = 2^{x}$
$g (x) = x^{2}$
$h (x) = 5 \cdot (0.5)^{x}$

Answer: $f (x)$ and $h (x)$ are exponential. Here $g (x)$ is a power function (a special case of a polynomial).

Logarithmic Functions

Logarithmic functions are the inverses of exponential functions.

Definition: Logarithmic Function

*Three examples of logarithmic functions. All are increasing; they differ in scaling or base.*

A logarithmic function can be written in the general form:

$f (x) = a \cdot \log_{b} (x)$

where $a \neq 0$ , $b > 0$ , and $b \neq 1$ .

Key characteristics:

Graph:
- Passes through $(1, 0)$ , since $\log_{b} (1) = 0$ for every valid base $b$
- Slow, unbounded growth for large $x$ (when $a > 0$ and $b > 1$ )
Asymptote: Vertical at $x = 0$ .
Domain: $(0, \infty)$
Range: All real numbers ( $R$ )
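The inverse relationship between logarithms and exponentials can be verified numerically. The sketch below computes $\log_{b} (x)$ through the change-of-base rule $\log_{b} (x) = \frac{\ln x}{\ln b}$ , a standard identity assumed here rather than derived in the text; the name `log_base` is illustrative.

```python
import math

def log_base(b, x):
    """log_b(x) for b > 0, b != 1 and x > 0, via the change-of-base rule."""
    if b <= 0 or b == 1:
        raise ValueError("base must be positive and different from 1")
    if x <= 0:
        raise ValueError("logarithms are only defined for x > 0")
    return math.log(x) / math.log(b)

print(log_base(2, 1))                          # 0.0, the point (1, 0)
print(math.isclose(log_base(2, 2**5), 5))      # True: log undoes the exponential
print(math.isclose(2 ** log_base(2, 10), 10))  # True: the exponential undoes log
```

The explicit check on $x$ mirrors the domain $(0, \infty)$ listed above.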

Examples: Logarithmic Functions

Determine which of the following functions are logarithmic functions:

$f (x) = \log_{2} (x)$
$g (x) = \ln (x)$
$h (x) = 5^{x}$

Answer: $f (x)$ and $g (x)$ are logarithmic. $h (x)$ is exponential.

Piecewise-Defined Functions

Not all functions can be described by a single formula. In some cases, different rules apply to different parts of the domain. Such functions are called piecewise-defined functions.

Definition: Piecewise Function

A piecewise-defined function is a function whose rule is given by multiple expressions, each applying to a specific interval (or subset) of the domain. Formally, it can be written as:

$f (x) = \begin{cases} f_{1} (x), & \text{if } x \in D_{1}, \\ f_{2} (x), & \text{if } x \in D_{2}, \\ \quad \vdots \\ f_{n} (x), & \text{if } x \in D_{n}, \end{cases}$

where:

Each $f_{i} (x)$ defines the function on a subset $D_{i}$ of the domain
The subsets $D_{1}, D_{2}, \dots, D_{n}$ are non-overlapping and form the entire domain of $f$
Each input $x$ belongs to exactly one of the subsets $D_{i}$ , ensuring that the function assigns one unique output for every input

Key characteristics:

Graph: May be continuous or discontinuous at the boundary points between pieces
Domain: The union of all subsets $D_{i}$
Range: The union of the output values of all pieces

Example 1: A Piecewise Function

*Graph of a piecewise-defined function with two expressions joined at $x = 2$ .*

Consider the function defined by

$f (x) = \begin{cases} 3 x + 1, & \text{if } x \geq 2, \\ x^{2}, & \text{if } x < 2. \end{cases}$

To evaluate a piecewise function, first determine which part of the domain the input belongs to, and then apply the corresponding rule. For instance:

For $x = 5$ , since $5 \geq 2$ , use function $3 x + 1$ : $f (5) = 3 \cdot 5 + 1 = 16$
For $x = - 1$ , since $- 1 < 2$ , use function $x^{2}$ : $f (- 1) = (- 1)^{2} = 1$
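The two-step evaluation procedure (pick the branch, then apply its rule) maps naturally onto a conditional. The following is a minimal sketch of the function from this example.

```python
def f(x):
    """Piecewise function: 3x + 1 for x >= 2, x^2 for x < 2."""
    if x >= 2:
        return 3 * x + 1
    return x**2

print(f(5))    # 16
print(f(-1))   # 1
print(f(2))    # 7, the boundary point belongs to the first rule
```

Just to the left of the boundary the second rule gives values near $2^{2} = 4$ , so the two rules do not agree at $x = 2$ and the function has a jump there.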

Example 2: A Piecewise Function

*Graph of the absolute value function $f (x) = ∣ x ∣$ , showing a change in rule at $x = 0$ .*

The absolute value function, denoted by $f (x) = ∣ x ∣$ , can be expressed as a piecewise-defined function:

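$f (x) = \begin{cases} x, & \text{if } x \geq 0, \\ - x, & \text{if } x < 0. \end{cases}$

Here, non-negative inputs are unchanged, while negative inputs have their sign flipped (graphically, the part of the line $y = x$ below the $x$ -axis is reflected across it), ensuring that $f (x)$ is always non-negative.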

Example 3: A Piecewise Function

*Graph of the ReLU (Rectified Linear Unit) function, which outputs zero for negative inputs and increases linearly for positive inputs.*

The Rectified Linear Unit (ReLU) is a commonly used activation function in neural networks. It can be expressed as a piecewise-defined function:

$f (x) = \begin{cases} x, & \text{if } x > 0, \\ 0, & \text{if } x \leq 0. \end{cases}$

The ReLU function outputs the input value itself when it is positive, and zero otherwise. This simple rule introduces nonlinearity into neural networks, which is an essential property that allows them to learn complex patterns and relationships in data.
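Both the absolute value function and ReLU are short enough to write out as plain conditionals. The sketch below is for illustration only; in practice Python's built-in `abs` and the vectorized routines of numerical libraries would normally be used.

```python
def absolute_value(x):
    """|x| written as a piecewise rule."""
    return x if x >= 0 else -x

def relu(x):
    """Rectified Linear Unit: x for positive inputs, 0 otherwise."""
    return x if x > 0 else 0

for x in [-3, -0.5, 0, 2.5]:
    print(x, absolute_value(x), relu(x))
```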

Injective, Surjective, and Bijective Functions

Functions can also be classified based on how they relate elements of their domain to elements of their codomain. While algebraic form determines a function’s shape or formula, mapping properties determine whether the function is one-to-one, onto, or both.

Definition: Injective Function

A function $f : A \to B$ is injective (or one-to-one) if it never assigns the same output value to two different inputs. In other words, each output in $B$ comes from at most one input in $A$ .

More formally, in predicate logic, we can write:

$\forall x_{1}, x_{2} \in A, f (x_{1}) = f (x_{2}) \Rightarrow x_{1} = x_{2}$

Or in plain words: If two inputs of a function give the same output, then those inputs must be equal.

Example: Injective Function

*Left: An injective function. Right: A non-injective function.*

Let $f : R \to R$ be defined by:

$f (x) = 2 x + 3$

This function is injective because different $x$ -values always produce different $y$ -values. However, $f (x) = x^{2}$ is not injective on $R$ since $f (2) = f (- 2) = 4$ .

Definition: Surjective Function

A function $f : A \to B$ is surjective (or onto) if every element of the codomain $B$ appears as an output of the function. That means the function covers all of $B$ , i.e., its range is equal to its codomain.

More formally, in predicate logic, we can write:

$\forall y \in B, \ \exists x \in A \text{ such that } f (x) = y$

Or in plain words: For every possible output value in the codomain, there exists at least one input value in the domain that produces it.

Example: Surjective Function

*Left: A surjective function that covers all possible $y$ -values in the codomain. Right: A non-surjective function which leaves gaps.*

Let $f : R \to R$ be defined by:

$f (x) = 2 x + 1$

For any $y \in R$ , there exists $x = \frac{y - 1}{2}$ , so $f$ is surjective. However, $f (x) = x^{2}$ from $R \to R$ is not surjective because negative $y$ -values are never reached.

Definition: Bijective Function

A function $f : A \to B$ is bijective if it is both injective and surjective. This means that each element of $A$ is mapped to a unique element of $B$ (injectivity), and every element of $B$ is the image of some element of $A$ (surjectivity).

More formally, we can write: $f \text{ is bijective} \iff (f \text{ is injective}) \text{ and } (f \text{ is surjective})$

Or in plain words: A bijective function establishes a one-to-one correspondence between the sets $A$ and $B$ , so that nothing is repeated and nothing is left out.

Example: Bijective Function

*A bijective function: each element of the domain maps to exactly one unique element of the codomain.*

Let $f : R \to R$ be defined by:

$f (x) = x + 1$

The function is bijective because each input produces a unique output (injective) and every real number occurs exactly once as an output (surjective).
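For functions on finite sets, the three definitions can be tested by brute force. The sketch below does this for a small, hypothetical domain; it does not prove anything about functions on all of $R$ , where an algebraic argument like the ones in the examples is needed.

```python
def is_injective(f, domain):
    """No two distinct inputs share an output (domain elements must be distinct)."""
    outputs = [f(x) for x in domain]
    return len(set(outputs)) == len(outputs)

def is_surjective(f, domain, codomain):
    """Every element of the codomain is produced by at least one input."""
    return set(codomain) <= {f(x) for x in domain}

def is_bijective(f, domain, codomain):
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

domain = [-2, -1, 0, 1, 2]

def square(x):
    return x**2

def shift(x):
    return x + 1

print(is_injective(square, domain))                    # False: square(2) == square(-2)
print(is_surjective(square, domain, [0, 1, 4]))        # True
print(is_surjective(square, domain, [0, 1, 2, 3, 4]))  # False: 2 and 3 are never reached
print(is_bijective(shift, domain, [-1, 0, 1, 2, 3]))   # True
```

The second and third calls show that surjectivity depends on the chosen codomain, not only on the rule.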

Combining Functions

Up to this point, we have explored the basic characteristics of individual functions. We now turn to what happens when functions are combined using standard mathematical operations to create new ones. Just as numbers can be added, subtracted, multiplied, or divided, functions can also be combined in similar ways to form new functions with related behaviors.

Example 1: Combining Functions

In machine learning, the loss function used to train a model often combines several components that measure different aspects of performance.

Suppose we define:

$f (x)$ : The prediction error
$g (x)$ : A regularization term that penalizes overly complex models

Note that $x$ may represent several model parameters, but the idea of combining functions, i.e., adding terms that capture different effects, follows the same principle as in the single-variable case.

The resulting loss function balances accuracy (how well predictions match the observed data) with simplicity (how small the model parameters are):

$L (x) = f (x) + λ \cdot g (x),$

where $λ > 0$ controls how strongly the regularization term influences the model.

Example 2: Combining Functions

In many real-world models, new relationships are created by combining existing quantities using arithmetic operations.

Suppose we define:

$f (x)$ : the temperature (in °C)
$g (x)$ : the humidity (in %)

A new function $h (x)$ can be defined to estimate a heat index (a perceived temperature) as follows:

$h (x) = f (x) + 0.1 \cdot g (x)$

Here, $h (x)$ is obtained by adding a weighted contribution from humidity to the temperature. Such combinations describe how different quantities together determine a result. In this case, both temperature and humidity contribute to the perceived heat.

Suppose $f$ and $g$ are functions defined on the same domain. The following operations define new functions as shown below:

Operation	Notation	Definition
Sum	$(f + g) (x)$	$f (x) + g (x)$
Difference	$(f - g) (x)$	$f (x) - g (x)$
Product	$(f \cdot g) (x)$	$f (x) \cdot g (x)$
Quotient	$(\frac{f}{g}) (x)$	$\frac{f (x)}{g (x)}, \quad g (x) \neq 0$

Ultimately, these operations let us construct more complex relationships from simpler ones.
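In code, the four operations in the table correspond to functions that take two functions and return a new one. The sketch below is one way to express this; the helper names and the sample functions $f (x) = 2 x + 3$ and $g (x) = x^{2}$ are hypothetical and chosen only for illustration.

```python
def add(f, g):
    return lambda x: f(x) + g(x)

def subtract(f, g):
    return lambda x: f(x) - g(x)

def multiply(f, g):
    return lambda x: f(x) * g(x)

def divide(f, g):
    def quotient(x):
        # The quotient is only defined where the denominator is non-zero.
        if g(x) == 0:
            raise ZeroDivisionError("(f/g)(x) is undefined where g(x) = 0")
        return f(x) / g(x)
    return quotient

# Hypothetical sample functions.
def f(x):
    return 2 * x + 3

def g(x):
    return x**2

print(add(f, g)(2), subtract(f, g)(2), multiply(f, g)(2), divide(f, g)(2))
# 11 3 28 1.75
```

Each helper returns a function rather than a number, mirroring the notation $(f + g) (x)$ : the sum is itself a function that can be evaluated at any $x$ in the common domain.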

Example 3: Combining Functions

In this example, we explore how subtraction and division affect the relationship between two functions. For this purpose, let

$f (x) = x - 1 \quad \text{and} \quad g (x) = x^{2} - 1.$

We will now find and simplify both $(g - f) (x)$ and $(\frac{g}{f}) (x)$ to see how these operations transform the expressions.

First, subtract $f (x)$ from $g (x)$ :

$(g - f) (x) = g (x) - f (x) = (x^{2} - 1) - (x - 1) = x^{2} - x = x (x - 1)$

Then, divide $g (x)$ by $f (x)$ :

$(\frac{g}{f}) (x) = \frac{g (x)}{f (x)} = \frac{x^{2} - 1}{x - 1} = \frac{(x + 1) (x - 1)}{x - 1} = x + 1, \quad x \neq 1$

We can see that subtraction and division lead to very different results: $(g - f) (x)$ is a quadratic expression, while $\frac{g}{f}$ simplifies to a linear one. Note that the quotient is only defined for $x \neq 1$ , since $f (1) = 0$ .

Even though both start from the same $f$ and $g$ , the way we combine them changes the type of function we obtain.

Example 4: Combining Functions

In this example, we explore how multiplication and subtraction affect the relationship between two functions. For this purpose, let

$f (x) = x - 1 \quad \text{and} \quad g (x) = x^{2} - 1$

We will now find and simplify both $(f \cdot g) (x)$ and $(f - g) (x)$ to see how these operations transform the expressions.

First, multiply $f (x)$ and $g (x)$ :

$(f \cdot g) (x) = f (x) \cdot g (x) = (x - 1) (x^{2} - 1) = x^{3} - x^{2} - x + 1$

Then, subtract $g (x)$ from $f (x)$ :

$(f - g) (x) = f (x) - g (x) = (x - 1) - (x^{2} - 1) = x - x^{2}$

Again, the two resulting functions are very different, i.e., $(f \cdot g) (x)$ is cubic, while $(f - g) (x)$ is quadratic.
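The simplifications in Examples 3 and 4 can be spot-checked by comparing both sides at a few sample inputs. The sketch below does this; the sample points and the helper name `identities_hold` are arbitrary choices, and a numerical check of this kind supports the algebra but does not replace it.

```python
import math

def f(x):
    return x - 1

def g(x):
    return x**2 - 1

def identities_hold(x):
    """Compare each combined function with its simplified form at the point x."""
    checks = [
        (g(x) - f(x), x * (x - 1)),
        (f(x) * g(x), x**3 - x**2 - x + 1),
        (f(x) - g(x), x - x**2),
    ]
    if f(x) != 0:  # g/f is only defined where f(x) is non-zero
        checks.append((g(x) / f(x), x + 1))
    return all(math.isclose(left, right, abs_tol=1e-12) for left, right in checks)

print(all(identities_hold(x) for x in [-3, -1, 0, 0.5, 2, 10]))  # True
print(f(1))  # 0, so (g/f)(1) is undefined even though x + 1 equals 2 there
```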

Function Composition

In the previous examples, we combined functions using arithmetic operations such as addition and multiplication. We now explore a different kind of combination, function composition, where the output of one function becomes the input of another.

Function composition allows us to describe multi-step relationships between quantities that depend on one another.

In many real-world situations, one variable influences a second, which in turn affects a third. By composing functions, we can express an entire chain of dependencies as a single mathematical expression.

Example 1: Function Composition

Suppose we want to calculate how much electricity is used to cool a house on a particular day of the year. The electricity usage depends on the average indoor-outdoor temperature difference, which in turn depends on the average daily temperature outside.

Thus, we have two relationships:

$E (T)$ : Describes the electricity (in kWh) required to maintain a desired indoor temperature for a given outdoor temperature $T$ (°C)
$T (d)$ : Describes the average outdoor temperature (°C) on day $d$ of the year

For any given day $d$ , the electricity use depends on the temperature, which itself depends on the day. We can therefore evaluate $E$ at the temperature given by $T (d)$ :

$E (T (d))$

This expression represents the electricity used on day $d$ . For example, to find the electricity usage on the 10th day of the year, we would first compute $T (10)$ , the average temperature on day 10, and then use that value in function $E$ , i.e., $E (T (10))$ gives the electricity required to cool the house on the 10th day of the year.

In this case, the function describing temperature is said to be composed with the function describing electricity usage.

The relationship illustrated above can be generalized by defining a new function that represents applying one function after another. For this to be well-defined, the range of the inner function must lie within the domain of the outer function.

This brings us to the formal definition of the composition of functions.

Definition: Function Composition

Let $f : B \to C$ and $g : A \to B$ be functions, where the codomain of $g$ (the set of its possible outputs) is contained in the domain of $f$ . The composition of $f$ and $g$ , denoted $f \circ g$ , is the function

$f \circ g : A \to C$

defined by

$(f \circ g) (x) = f (g (x)) \quad \text{for all } x \in A .$

Warning: Misconceptions About Composition

Composition is not multiplication:

The composition of two functions is denoted by $f \circ g$ and defined as

$(f \circ g) (x) = f (g (x)) .$

In contrast, the product of two functions is denoted by $f \cdot g$ and defined as

$(f \cdot g) (x) = f (x) \cdot g (x) .$

The first applies one function inside another, while the second multiplies their outputs.
Composition is not commutative:

In general it is the case that

$f \circ g \neq g \circ f,$

since

$f (g (x)) \neq g (f (x))$

for most functions $f$ and $g$ . In other words, the order matters because the output of one function becomes the input of the other, and reversing that order usually produces a different intermediate value.

Example 2: Function Composition

Using the following functions, find both $f (g (x))$ and $g (f (x))$ to determine whether composition is commutative.

$f (x) = 2 x + 1, g (x) = 3 - x$

First, substitute $g (x)$ into $f (x)$ :

$f (g (x)) = 2 (3 - x) + 1 = 6 - 2 x + 1 = 7 - 2 x$

Next, substitute $f (x)$ into $g (x)$ :

$g (f (x)) = 3 - (2 x + 1) = 3 - 2 x - 1 = 2 - 2 x$

Because $f (g (x)) \neq g (f (x))$ , we see that function composition is not commutative.
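Composition can be written as a small helper that builds $f \circ g$ from $f$ and $g$ . The sketch below applies it to the two functions of this example; the name `compose` is an illustrative choice.

```python
def compose(f, g):
    """Return f o g, i.e. the function x -> f(g(x))."""
    return lambda x: f(g(x))

def f(x):
    return 2 * x + 1

def g(x):
    return 3 - x

f_after_g = compose(f, g)   # simplifies to 7 - 2x
g_after_f = compose(g, f)   # simplifies to 2 - 2x

print(f_after_g(4), g_after_f(4))  # -1 -6
```

The two orders give different values at the same input, in line with the algebraic result.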

Decomposing Functions

The idea of composition naturally leads to its reverse process, i.e., decomposition. While composition builds complex relationships by applying one function after another, decomposition involves expressing a single, complicated function in terms of simpler ones:

$h (x) = (f \circ g) (x) = f (g (x))$

This approach makes functions easier to understand and, more importantly, easier to work with. It will play an important role later, particularly in Chapter 8, where recognizing how a function is composed of simpler parts becomes essential for applying the chain rule of differentiation.

Note that a single function may have more than one possible decomposition. In practice, we choose the one that makes the problem easier.

Example 1: Decomposing Functions

Express $h (x) = \sqrt{5 - x^{2}}$ as the composition of two simpler functions.

We are looking for functions $f$ and $g$ such that

$h (x) = (f \circ g) (x) = f (g (x))$

To identify these functions, notice that $5 - x^{2}$ appears inside the square root. This suggests that the inner function produces $5 - x^{2}$ , and the outer function takes the square root of its input. Thus, we can define

$g (x) = 5 - x^{2} \quad \text{and} \quad f (x) = \sqrt{x}$

We can verify our decomposition by recomposing the functions:

$(f \circ g) (x) = f (g (x)) = f (5 - x^{2}) = \sqrt{5 - x^{2}}$

Therefore, $h (x) = (f \circ g) (x)$ with

$g (x) = 5 - x^{2} \quad \text{and} \quad f (x) = \sqrt{x}$
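A decomposition can be checked by recomposing it in code. The sketch below assumes $h (x) = \sqrt{5 - x^{2}}$ , as the reference to the square root in this example indicates, and compares it with two different decompositions; the second pair ( `g_alt` , `f_alt` ) is a hypothetical alternative illustrating that a decomposition need not be unique.

```python
import math

def h(x):
    return math.sqrt(5 - x**2)

# Decomposition used in the example: inner g, outer f.
def g(x):
    return 5 - x**2

def f(u):
    return math.sqrt(u)

# A different, equally valid decomposition.
def g_alt(x):
    return x**2

def f_alt(u):
    return math.sqrt(5 - u)

# Sample points inside the domain of h, where 5 - x^2 >= 0.
points = [-2, -1, 0, 0.5, 2]

print(all(math.isclose(f(g(x)), h(x)) for x in points))          # True
print(all(math.isclose(f_alt(g_alt(x)), h(x)) for x in points))  # True
```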

Example 2: Decomposing Functions

Express $h (x) = \frac{4}{3 - \sqrt{4 + x^{2}}}$ as the composition of two simpler functions.

We are looking for functions $f$ and $g$ such that

$h (x) = (f \circ g) (x) = f (g (x))$

Here, the expression $4 + x^{2}$ appears under the square root in the denominator. We can treat that as the output of the inner function $g$ , and then let the outer function $f$ operate on that result. Thus, we can define

$g (x) = 4 + x^{2} \quad \text{and} \quad f (x) = \frac{4}{3 - \sqrt{x}}$

We can verify our decomposition by recomposing the functions:

$(f \circ g) (x) = f (g (x)) = f (4 + x^{2}) = \frac{4}{3 - \sqrt{4 + x^{2}}}$

Therefore, $h (x) = (f \circ g) (x)$ with

$g (x) = 4 + x^{2} \quad \text{and} \quad f (x) = \frac{4}{3 - \sqrt{x}}$

Mathematics Brush-up for Data Science
