15 Transcendental Functions
Highlights of this Chapter: we introduce the idea of defining functions by a Functional Equation specifying how a function should behave instead of specifying how to compute it. Following this approach, we give rigorous definitions for exponentials, logarithms, and trigonometric functions, and investigate some of their consequences. With these definitions in hand, we are able to define the field of Elementary Functions, familiar from calculus and the sciences.
At the heart of real analysis is the study of functions: not only the study of their properties (continuity being a prime example) but also their very definition. Exponentials, trigonometric functions and logarithms are all examples of transcendental functions, things that transcend algebra: they are not built from a finite composition of the field operations, and instead are calculated as the result of infinite processes.
In this chapter we will not focus on how to compute such functions, but rather on the more pressing question of how to even define them: if all we have available to us are the axioms of a complete ordered field, how do we rigorously capture aspects of circles in the plane (trigonometry) or continuous growth (exponentials)? The key is the idea of a functional equation: something that will let us define a function by how it behaves, instead of by directly specifying a formula to compute it.
15.1 Functional Equations
Recall the great shift in our collective conception of a function that occurred around the time of Euler, where mathematicians stopped insisting that functions were given by formulas and instead began to welcome rather arbitrary rules, so long as they assigned a unique output to each input. This is accompanied by a conceptual leap, removing the focus from how to compute a function and turning instead to the question: what is the function doing?
This is perhaps easiest to illustrate by example, so we give two below for functions that we already know from algebra: roots and linear functions.
15.1.1 Roots
How should one define the square root, to someone who has never seen it before? Perhaps as “the square root is a number that when multiplied by itself, gives the number you started with”. Such a description does a good job of telling us exactly what the square root does, and is worth trying to translate into formal mathematics.
In symbols, this means if \(r(x)\) is the square root, we need for each allowable value of \(x\) that \(r(x)^2=x\). In Example 16.2 we will show that exactly two such functions exist, and there is a unique one with \(r(x)>0\). Thus this approach is fully rigorous, and we call this function the square root and write \(r(x)=\sqrt{x}\), consistent with Definition 4.6.
In general, we make the same definition, justified by the uniqueness result in Theorem 16.3.
Definition 15.1 Let \(r\colon[0,\infty)\to [0,\infty)\) satisfy the functional equation \[r(x)^n=x\] Then \(r\) is called the \(n^{th}\) root function and denoted \(r(x)=\sqrt[n]{x}\).
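Although the functional equation says nothing about how to compute roots, completeness makes the root effectively computable. Here is a minimal Python sketch (the name `nth_root` is my own, not from the text) that locates the unique nonnegative solution of \(r^n=x\) by bisection, using only the functional equation to test candidates:

```python
def nth_root(x, n, tol=1e-12):
    """Find r >= 0 with r**n == x, assuming x >= 0 and n >= 1."""
    lo, hi = 0.0, max(1.0, x)   # the root lies in [0, max(1, x)]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** n < x:        # test the candidate against r^n = x
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(nth_root(2.0, 2))  # ≈ 1.41421356...
```

The intermediate value theorem, a consequence of completeness, is what guarantees this search converges to an actual solution.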
The utility of functional equations is that if we can take them as the definition of a particular function we are interested in, we know for sure that this function has the property we want: that’s all the definition specifies! The hard work then comes in figuring out how to actually compute the values of functions which are defined functionally.
15.1.2 Linear Functions
We know how to express linear functions already using the field axioms, as maps \(f(x)=kx\) for some real number \(k\). To speak of linear functions functionally however, we should not give a definition telling us how to compute their values (take the input, and multiply by a fixed constant \(k\)) but rather by what they’re for: by the defining property of linearity.
This more abstract functional approach was first taken by Cauchy during the development of analysis, and so the resulting equation is called the Cauchy Functional Equation:
Definition 15.2 (Cauchy’s Functional Equation for Linearity) A function \(f\colon\RR\to\RR\) satisfies Cauchy’s functional equation if for all \(x,y\in\RR\), \[f(x+y)=f(x)+f(y)\]
Such an abstract characterization has had a tremendous influence in mathematics: for example, think of the definition of a linear map in linear algebra.
15.1.3 Difficulties
Moving away from defining a function computationally, there are several potential issues that need to be confronted. First, how do we know that there even is a function satisfying our functional equation?
Example 15.1 (An impossible functional equation) Consider the functional equation \[f(x)^2=-1\] There is no real-valued function \(f\) satisfying this equation, since the square of a real number is never negative.
The second worry is to make sure the functional equation really is strict enough to capture what you want it to capture. One example is already presented by linearity: it’s easy to see that any linear function must be zero at \(x=0\). Thus we could propose the functional equation \(L(0)=0\) to enforce this property. But this is far from actually capturing the notion of a linear function we had in mind: this functional equation has all sorts of solutions, like \(f(x)=x^3\), which do exactly what was asked (are zero at zero) but are not what we had in mind.
But it’s even worse than this: while it seems that Cauchy’s equation captures exactly what we want from the idea of linearity (the ability to distribute over addition), it also has pathological solutions beyond \(x\mapsto kx\) that we did not intend:
Example 15.2 (Pathological Solutions to Cauchy’s Functional Equation) Using a Hamel basis (a basis of \(\RR\) viewed as a vector space over \(\QQ\), whose existence requires the axiom of choice), one may choose the values of \(f\) freely on the basis elements and extend by \(\QQ\)-linearity. This produces solutions to Cauchy’s equation that are not of the form \(x\mapsto kx\): the graph of any such solution is in fact dense in the plane.
To avoid such pathological solutions one needs to impose extra conditions, and a hint at which conditions may help comes from the example above: the pathological solutions are wildly discontinuous. What happens if we ask for a continuous solution to Cauchy’s equation?
Theorem 15.1 Any continuous solution to Cauchy’s functional equation is a function of the form \(f(x)=kx\) for some \(k\in\RR\).
Exercise 15.1 Prove Theorem 15.1 by following the outline below:
- Define \(k=f(1)\), and prove that \(f(x)=kx\) for all \(x\in\ZZ\), using the functional equation.
- Extend this to show that \(f(1/n)=k/n\) using the functional equation, and then that for any \(r\in\QQ\) \(f(r)=kr\).
- Use continuity to show that for any \(a\in\RR\) this implies that \(f(a)=ka\).
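A quick numerical illustration (my own sketch, not part of the exercise): any function \(f(x)=kx\) does satisfy Cauchy’s equation, and the integer step of the outline can be checked directly by treating \(f\) as a black box:

```python
import random

# A sample continuous solution of Cauchy's equation: f(x) = kx with k = 2.5.
def f(x):
    return 2.5 * x

k = f(1)
# First step of the outline: f(n) = f(1 + 1 + ... + 1) = n * f(1) for integers.
for n in range(1, 6):
    assert abs(f(n) - k * n) < 1e-12

# Cauchy's equation f(x + y) = f(x) + f(y) holds on random samples.
for _ in range(100):
    x, y = random.uniform(-10, 10), random.uniform(-10, 10)
    assert abs(f(x + y) - (f(x) + f(y))) < 1e-9

print("f(x) = kx satisfies Cauchy's equation, with k =", k)
```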
This is one critical way that analysis enters into the very definition of functions: if we specify only what we want a function to do, that often leaves room for pathological, discontinuous behavior, and to get what we really want, we need to ask for our function to behave continuously. We see this time and again below, where we define exponentials, logarithms, and trigonometric functions all as the continuous solutions to various functional equations.
15.2 Exponentials
Definition 15.3 (The Law of Exponents) A function \(E\colon\RR\to\RR\) satisfies the law of exponents if for every \(x,y\in\RR\) \[E(x+y)=E(x)E(y)\]
We use this to give a functional definition of exponential functions.
Definition 15.4 An exponential function is a continuous nonzero function \(E\) that satisfies the law of exponents.
Now that we have a formal definition, we can start seeing what properties exponential functions must have.
Example 15.3 If \(E\) satisfies the law of exponents and evaluates to zero at any point, then \(E\) is the zero function.
Proof. Let \(E\) satisfy the law of exponents and assume there is some \(z\in\RR\) such that \(E(z)=0\). Then for any \(x\in\RR\) we may write \(x=(x-z)+z=y+z\) for \(y=x-z\in\RR\). Evaluating \(E(x)\) using the law of exponents, \[E(x)=E(y+z)=E(y)E(z)=E(y)\cdot 0 =0\] Thus \(E\) is identically zero.
Exercise 15.2 Prove that if \(E\) is any exponential function, then \(E(0)=1\), and that \(E(-x)=1/E(x)\).
Exercise 15.3 (Convexity of exponentials) Prove that exponential functions are convex (Definition 13.8): their secant lines lie above their graphs.
Proposition 15.1 If \(E\) is an exponential function, \(x\in\RR\) and \(r\in\QQ\), then \[E\left(xr\right)=E(x)^{r}\]
Proof. We deal separately with two cases, for positive integers \(p,q\). First we see that \(E(px)=E(x)^p\) by inductively applying the law of exponents to \(px=x+x+\cdots+x\): \[E(px)=E(x+x+\cdots +x )=E(x)E(x)\cdots E(x)=E(x)^p\]
Next, we see that \(E(x/q)=\sqrt[q]{E(x)}\), again by the law of exponents: Since \(x=q(x/q)=(x/q)+(x/q)+\cdots+(x/q)\), we can use the above to see \[E(x)=E\left(q\frac{x}{q}\right)=E\left(\frac{x}{q}\right)^q\] Thus, \(E(x/q)\) is a number such that when raised to the \(q^{th}\) power gives \(E(x)\). This is the definition of the \(q^{th}\) root, so \[E\left(\frac{x}{q}\right)=\sqrt[q]{E(x)}=E(x)^{\frac{1}{q}}\]
Putting these two cases together handles any positive rational \(r=p/q\): \[E(rx)=E\left(\frac{p}{q}x\right)=E\left(\frac{x}{q}\right)^p=\left(E(x)^{\frac{1}{q}}\right)^p=E(x)^{\frac{p}{q}}=E(x)^r\] Finally, for negative \(r\) the result follows from the positive case together with \(E(-x)=1/E(x)\) (Exercise 15.2).
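Proposition 15.1 is easy to test numerically. A sketch using \(E=\exp\) as a stand-in for an abstract exponential function (the text has not yet constructed one):

```python
import math
from fractions import Fraction

E = math.exp  # a continuous solution of the law of exponents
x = 1.7

# Check E(r*x) == E(x)**r at several rational values of r.
for r in [Fraction(3, 4), Fraction(-5, 2), Fraction(7, 3)]:
    lhs = E(float(r) * x)
    rhs = E(x) ** float(r)
    assert abs(lhs - rhs) < 1e-9 * abs(lhs)

print("E(rx) = E(x)^r verified at sample rationals")
```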
This has a rather strong consequence for the values of an exponential function at the rational numbers, in terms of its value at a single point:
Definition 15.5 (The Base of an Exponential) If \(E\) is any exponential function, its value at \(1\) is called its base.
Corollary 15.1 Let \(E(x)\) be an exponential function with base \(a\). Then for every \(r\in\QQ\) we have \[E(r)=a^r\]
Proof. Let \(r\in\QQ\) and apply Proposition 15.1 to \(r=r\cdot 1\): \[E(r)=E(r\cdot 1)=E(1)^r=a^r\]
This is a pretty strong property: any two exponential functions that agree at 1 actually agree on the entire real line, since they agree at a dense set. In fact, this is true not just of \(1\), but of any point.
Exercise 15.4 (Exponentials that agree at a point) Prove that if \(E,F\) are two exponential functions which take the same value at any nonzero \(x\in\RR\), then they are equal.
Hint: prove that \(x\QQ=\{xr\mid r\in\QQ\}\) is dense in \(\RR\), and use Proposition 15.1 and Exercise 14.12.
This work all tells us that if an exponential function exists at all then it is fully determined by its value at any point: phrased in terms of the value at \(1\), the base uniquely determines the exponential function (if it exists). We will have to do some more work before we can prove that these functions actually exist however!
Exercise 15.5 Prove that if \(E\) is an exponential function with base \(a\), then \(a>0\). Hint: if \(a<0\) what is \(E(1/2)\)?
15.3 Logarithms
Just like we defined an exponential function by what we want it to do, we will define a logarithm based on its desired properties, giving a functional equation. Logarithms were originally invented to speed up computation, by turning multiplication into addition.
Definition 15.6 (The Law of Logarithms) We say a function \(L\colon \RR_+\to\RR\) satisfies the law of logarithms if for every \(x,y\in\RR_+\) \[L(xy)=L(x)+L(y)\]
Like in the case of exponentials, we are right to worry that there may be many pathological, everywhere discontinuous solutions to this functional equation. To avoid these, we define logarithms to be the continuous solutions.
Definition 15.7 (Logarithm) A function \(L\) is a logarithm if it is a continuous solution to the law of logarithms (Definition 15.6).
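As a sanity check (a sketch; at this point the text has not yet shown such a function exists), `math.log` behaves exactly as a continuous solution of the law of logarithms should:

```python
import math

L = math.log  # a continuous solution of the law of logarithms

# Check L(xy) == L(x) + L(y) at several positive sample points.
for x, y in [(2.0, 3.0), (0.5, 8.0), (10.0, 0.1)]:
    assert abs(L(x * y) - (L(x) + L(y))) < 1e-12

print("L(xy) = L(x) + L(y) verified at sample points")
```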
Because of the similarity of the law of logarithms to the law of exponents, it’s perhaps no surprise that with some induction we can fully understand how these functions behave on rational powers of their argument:
Proposition 15.2 Let \(r\in\QQ\). Then if \(L\) is any logarithm, for every \(x\in\RR_+\) we have \[L(x^r)=rL(x)\]
Exercise 15.6 Prove Proposition 15.2 via the following steps:
- Prove that for any \(n\in\NN\) we have \(L(x^n)=nL(x)\) inductively.
- Prove that \(L(x^{-1})=-L(x)\). Use this to conclude that \(L(x^{-n})=-L(x^n)\) for all \(n\in\NN\). Thus \(L(x^p)=pL(x)\) for all \(p\in\ZZ\).
- Prove that \(L(x^{1/q})=\frac{1}{q}L(x)\) for \(q\in \NN\).
- Put these together to see that for \(r=p/q\), \(L(x^r)=rL(x)\).
This gives an equality between two functions of the exponent at every rational value. Both sides are continuous in the exponent (\(L\) is continuous, \(y\mapsto x^y\) is continuous, and multiplication by the constant \(L(x)\) is continuous), so since the rationals are dense, the two sides agree at every real value:
Corollary 15.2 Let \(L\) be a logarithmic function, \(x>0\), and \(y\in\RR\). Then \[L(x^y)=yL(x)\]
This has a pretty incredible consequence:
Theorem 15.2 The inverse of an exponential function is a logarithm function!
Proof. Let \(E\) be an exponential with base \(a=E(1)\); by Corollary 15.1 and continuity, \(E(x)=E(1)^x\) for every real \(x\). Then for any logarithm \(L\), Corollary 15.2 gives \[L(E(x))=L(E(1)^x)=xL(E(1))\]
Thus, for the exponential \(E\) of base \(a=E(1)\), if we choose a logarithm function \(L\) where \(L(a)=1\), we see
\[L(E(x))=x\] so they are inverses!
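The proof above can be seen in action numerically (a sketch with concrete stand-ins): taking \(E=\exp\) (base \(e\)) and \(L=\log_{10}\), the composition is linear with slope \(L(E(1))\), and swapping in the natural logarithm, for which \(L(E(1))=1\), gives a true inverse:

```python
import math

E, L = math.exp, math.log10
c = L(E(1))  # the scaling constant L(E(1)) = log10(e)

# L(E(x)) = x * L(E(1)) for any exponential/logarithm pair...
for x in [0.3, 1.0, 4.2]:
    assert abs(L(E(x)) - x * c) < 1e-12

# ...and choosing L so that L(E(1)) = 1 (the natural log) inverts E.
for x in [0.3, 1.0, 4.2]:
    assert abs(math.log(E(x)) - x) < 1e-12

print("L(E(x)) = x*L(E(1)); with matching base, L inverts E")
```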
This makes it natural to try and define the base of a logarithm:
Definition 15.8 (Base of a Logarithm) If \(L\) is a logarithm, its base is the real number \(a\) such that \(L(a)=1\).
Unlike for the exponential where the base was a value of the function (which then existed by definition), we do not know a priori that every logarithm takes the value \(1\) at some point, or even that it does so uniquely! So, we will have some work to do to show this actually makes sense.
15.4 Trigonometric Functions
The trigonometric functions were originally defined geometrically, but like the exponentials above, we will specify them by a functional equation: describing how the functions behave instead of what they measure.
Trigonometric functions satisfy many functional equations - these are what we call trigonometric identities! And, as one is perhaps too familiar with from a trigonometry class, there are many many trigonometric identities! Here our goal is to pick some small set of identities to impose as the axioms for trigonometry, from which all other functional properties can be derived.
The natural candidates are the angle sum or difference identities:
Definition 15.9 (Angle Sum Identities) Two functions \(S,C\) satisfy the angle-sum identities if for any \(x,y\in\RR\): \[S(x+y)=S(x)C(y)+C(x)S(y)\] \[C(x+y)=C(x)C(y)-S(x)S(y)\]
Definition 15.10 (Angle Difference Identities) Two functions \(S,C\) satisfy the angle-difference identities if for any \(x,y\in\RR\): \[S(x-y)=S(x)C(y)-C(x)S(y)\] \[C(x-y)=C(x)C(y)+S(x)S(y)\]
In fact, either of these serves just fine, but for technical reasons (shortening some proofs a little bit) it’s easier to take the angle difference identities as our functional equations.
Definition 15.11 A pair of functions \(S,C\) are called trigonometric if they are continuous solutions to the angle difference identities.
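A numerical check (a sketch, using the familiar \(S=\sin\), \(C=\cos\) as candidate solutions) that the defining identities hold, and that rescaled pairs \(S(kx),C(kx)\) solve them too:

```python
import math

S, C = math.sin, math.cos

# The angle-difference identities at several sample points.
for x, y in [(0.7, 0.2), (3.0, -1.5), (-0.4, 2.2)]:
    assert abs(S(x - y) - (S(x) * C(y) - C(x) * S(y))) < 1e-12
    assert abs(C(x - y) - (C(x) * C(y) + S(x) * S(y))) < 1e-12

# The scaled pair S(kx), C(kx) is also a solution: replace x, y by kx, ky.
k = 2.5
for x, y in [(0.7, 0.2), (3.0, -1.5)]:
    assert abs(S(k * (x - y)) - (S(k * x) * C(k * y) - C(k * x) * S(k * y))) < 1e-12

print("angle-difference identities verified")
```

The second loop hints at why any uniqueness statement for trigonometric pairs can only hold up to a rescaling of the argument.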
This seems perhaps surprisingly non-restrictive: nowhere have we built in that these functions are periodic, or differentiable, or anything else! Can all of trigonometry really be reduced to this simple rule and the imposition of continuity? Indeed it can! And this development will be the subject of the final project in this course.
15.5 \(\bigstar\) Elementary Functions
The functions you are used to seeing in a calculus course, and in the sciences are called elementary functions, and include all the functions we have discussed so far in this course, as well as messy combinations like
\[\frac{\sqrt[3]{\sin\left(e^{x}\right)+2}}{\log\left(x^{2}+1\right)}\]
While in the sciences people often non-rigorously think of the elementary functions as simply “those functions which have a formula”, we should be more precise as mathematicians. After all, what is to stop us from giving a fancy name like \(\mathrm{char}(x)\) to the characteristic function of the rationals? Then we could say something like \(\sin\left(e^{\mathrm{char}(x)}\right)\) ‘has a formula we can write down’!
Definition 15.12 The elementary functions \(\mathcal{E}\) are the set of real valued functions produced using the field operations and composition from the following basic building blocks:
- Constants
- Powers \(x^n\) and their inverses \(\sqrt[n]{x}\)
- Exponentials \(E(x)\) and their inverses \(L(x)\)
- Trigonometric functions \(S(x),C(x)\) and their inverses \(S^{-1}(x),C^{-1}(x)\)
This list includes all the familiar functions; from the tangent \(\tan x=\frac{\sin x}{\cos x}\) to the hyperbolic cosine \(\cosh(x)=\frac{e^x+e^{-x}}{2}\).
But this list, as written is not fully ‘minimal’: we can remove some functions from it without changing the class \(\mathcal{E}\)!
Exercise 15.7 The function \(x^n\) is generated from \(x\) by repeated multiplication, and so is not required to be part of the list of basic building blocks for elementary functions so long as \(x\) is included.
Show the same is true for the roots \(\sqrt[n]{x}\): given an exponential \(E\) and its inverse \(L\), can you find a formula for the \(n^{th}\) root? (Recall the definition: \(r\) is the \(n^{th}\) root of \(x\) if \(r^n=x\))
Thus, we can reduce without loss of generality the line “powers \(x^n\) and their inverses \(\sqrt[n]{x}\)” to just requiring the identity function \(f(x)=x\) is elementary!
But even further simplification is possible. As we continue to study the transcendental functions, we will learn a lot more about them from their functional definitions. Indeed, we will see that each of these picks out an essentially unique function:
- There is a unique exponential function up to scaling: if \(E_1\) and \(E_2\) are any exponentials, then there exists a constant \(k\in\RR\) such that \(E_2(x)=E_1(kx)\).
- There is a unique logarithm function up to scaling: if \(L_1\) and \(L_2\) are any logarithms, then there exists a constant \(k\in\RR\) such that \(L_2(x)=L_1(kx)\).
- There exists a unique pair of trigonometric functions up to scaling: if \((S_1,C_1)\) and \((S_2,C_2)\) are two pairs of trigonometric functions, then there exists a constant \(k\in\RR\) such that \((S_2(x),C_2(x))=(S_1(kx),C_1(kx))\).
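Uniqueness up to scaling is concrete for functions we already know informally (a sketch, not a proof): the exponential of base \(2\) is a rescaling of the exponential of base \(e\), with \(k=\ln 2\):

```python
import math

E1 = math.exp             # the exponential of base e
E2 = lambda x: 2.0 ** x   # the exponential of base 2
k = math.log(2)           # the scaling constant, since 2^x = e^{(ln 2) x}

for x in [-3.0, 0.5, 4.0]:
    assert abs(E2(x) - E1(k * x)) < 1e-9 * abs(E2(x))

print("2^x = e^{(ln 2) x}: two exponentials differ by scaling the argument")
```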
Thus, since constant scaling of the argument is part of the ‘construction kit’ for elementary functions (\(k\) is an elementary function, \(x\) is an elementary function, the product \(kx\) uses a field operation, and \(f(x)\mapsto f(kx)\) is composition), we will be able to use these results to further simplify our definition by choosing one particular function of each type. We will see that analysis provides a natural choice: the exponential \(e^x\) and its inverse \(\log(x)\), and the trigonometric functions \(\sin(x),\cos(x)\) (from which we will derive the notion of ‘radians’). This abbreviated definition is then
Definition 15.13 The elementary functions \(\mathcal{E}\) are the set of real valued functions produced using the field operations and composition from the following basic building blocks:
- Constants
- The identity \(x\)
- The exponential \(e^x\) and its inverse \(\log(x)\),
- The trigonometric functions \(\sin(x),\cos(x)\) and their inverses \(\arcsin(x),\arccos(x)\).
If we further allow ourselves to work with complex valued functions of a real argument (which is both mathematically convenient, and relevant to the sciences) even further simplification becomes possible: Euler’s formula relates the exponential to the sine and cosine
\[e^{ix}=\cos(x)+i\sin(x)\]
So we may derive formulas for \(\sin\) and \(\cos\) in terms of the exponential itself (and formulas for their inverses in terms of the logarithm)
\[\sin(x)=\frac{e^{ix}-e^{-ix}}{2i}\hspace{1cm}\cos(x)=\frac{e^{ix}+e^{-ix}}{2}\]
Thus, if complex constants are allowed instead of just real constants, the definition of elementary functions reduces even further:
Definition 15.14 The elementary functions \(\mathcal{E}\) are the set of real valued functions produced using the field operations and composition from the following basic building blocks:
- Constants \(\in\mathbb{C}\)
- The identity \(x\)
- The exponential \(e^x\) and its inverse \(\log(x)\),
This shows, more than anything else (in my opinion) how the exponential function \(e^x\) truly is fundamental: the large class of elementary functions we’ve known since childhood is simply what you get from using field operations and composition starting from two functions: the identity and the exponential.