16 Theory
Highlights of this Chapter: we prove two foundational results about continuous functions whose proofs have several steps in common:
- The Extreme Value Theorem: a continuous function achieves a max and min on any closed interval.
- The Intermediate Value Theorem: a continuous function must take every value between \(f(a)\) and \(f(b)\) on the interval \([a,b]\).
We will call the proof style introduced with these theorems “proof by continuity”. Finally, we investigate one further topic - uniform continuity - where this proof strategy also helps, and prove that a continuous function on a closed interval is uniformly continuous.
Here we cover some of the important theorems about continuous functions that will prove useful during the development of calculus. Just as we have seen various ‘proof styles’ for sequences (recurrent themes in proofs, like ‘an \(\epsilon/2\) argument’), one of the biggest takeaways of this section is a proof technique for working with continuous functions. It has three steps, summarized below:
- Use whatever information you have to start, to construct a sequence of points.
- Use Bolzano Weierstrass to find a convergent subsequence.
- Apply \(f\) to that sequence and use continuity to know the result is also convergent.
This is too vague on its own to be useful, but in reading the proofs of the boundedness theorem, the extreme value theorem, and the intermediate value theorem below, look out for these three recurrent steps.
16.1 Extreme Values
Proposition 16.1 Let \(f\) be a continuous function on a closed interval \([a,b]\). Then the image \(f([a,b])\) is bounded.
Proof. Assume for the sake of contradiction that \(f\) is not bounded. Then for each \(n\in\NN\) there must be some \(x_n\in[a,b]\) where \(|f(x_n)|>n\). This sequence \(\{x_n\}\) need not be convergent, but it lies in the interval \([a,b]\) so it is bounded, and thus contains a convergent subsequence \(x_{n_k}\) by Bolzano Weierstrass. Say \(x_{n_k}\to x\). Then since \(a\leq x_{n_k}\leq b\) for all \(k\), by the inequalities of limits we see \(a\leq x\leq b\) so the limit \(x\) lies in the interval \([a,b]\) as well.
But what is the value \(f(x)\)? Since \(f\) is continuous and \(x_{n_k}\to x\) we know that \[f(x_{n_k})\to f(x)\] But for each \(k\), the point \(x_{n_k}\) has the property that \(|f(x_{n_k})|>n_k\) by construction. Since \(n_k\to\infty\), the sequence \(f(x_{n_k})\) is unbounded, and so cannot be convergent (since all convergent sequences are bounded). This contradicts the fact that \(f(x_{n_k})\to f(x)\).
Thus, no such sequence \(x_n\) is possible, and so there must be some \(n\) where \(|f(x)|\leq n\) for all \(x\in [a,b]\). That is, \(f\) must be bounded on \([a,b]\).
Building off this result, one can prove that a continuous function actually achieves its upper and lower bounds on any closed interval. This result will play a role several times across the theory of functions and derivatives, so we give it a memorable name: the extreme value theorem (as maxima and minima taken collectively are called extrema).
Theorem 16.1 (The Extreme Value Theorem) Let \(f\) be a continuous function on a closed interval \([a,b]\). Then \(f\) achieves a maximum and minimum value: that is, there exists a point \(p\) where \(f(p)\geq f(x)\) for all \(x\in[a,b]\), and a \(q\) where \(f(q)\leq f(x)\) for all \(x\in [a,b].\)
Proof. Let \(f\) be continuous on \([a,b]\) and let \(R=\{f(x)\mid x\in[a,b]\}\) be the set of outputs, or the range of \(f\). Since \(f\) is bounded (Proposition 16.1), \(R\) is a nonempty bounded subset of \(\RR\), and so by completeness \[m=\inf R\hspace{1cm} M=\sup R\] must exist. Our goal is to find values \(x_m, x_M\in [a,b]\) for which the infimum and supremum are realized: \[f(x_m)=m\hspace{1cm}f(x_M)=M\]
Here we show this holds for the supremum; the infimum is left as an exercise below. Since \(M\) is the supremum, for any \(\epsilon>0\) we know that \(M-\epsilon\) is not an upper bound for \(R=\{f(x)\mid x\in[a,b]\}\): thus there must be some \(x\) where \(f(x)>M-\epsilon\). So, letting \(\epsilon=1/n\) for each \(n\), let \(x_n\) be a point where \(M-\frac{1}{n}<f(x_n)\leq M\). As \(n\to\infty\) we know \(M-\frac{1}{n}\to M\) and so by the squeeze theorem we see that \(f(x_n)\to M\) as well.
We don’t know that the points \(x_n\) themselves converge, but we do know that this entire sequence lies inside the closed interval \([a,b]\), so it is bounded and Bolzano Weierstrass lets us extract a convergent subsequence \(x_{n_k}\to x\). And as \(a\leq x_{n_k}\leq b\) it follows that the limit \(x\in[a,b]\) as well. Because subsequences of a convergent sequence converge to the same limit, we know that \(f(x_{n_k})\) is convergent, and still has limit \(M\). But now we can finally use continuity!
Since \(f\) is continuous, we know \(\lim f(x_{n_k})=f(\lim x_{n_k})\), and so \(M=f(x)\). Thus we managed to find a point \(x\in[a,b]\) where \(f(x)\) is the supremum: \(f(x)\) is an upper bound for all possible values of \(f\) on \([a,b]\), which by definition means it is the maximum value! So \(f\) achieves a maximum on \([a,b]\).
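The proof is non-constructive (Bolzano Weierstrass only asserts that a convergent subsequence exists), but the maximum it guarantees is easy to approximate numerically by dense sampling. Here is a minimal Python sketch; the helper name `approx_max`, the test function, and the grid size are illustrative choices, not part of the theorem.

```python
import math

def approx_max(f, a, b, n=100_000):
    """Approximate the maximum of a continuous f on [a, b] by dense sampling.

    The Extreme Value Theorem guarantees the true maximum exists; this
    returns only the largest sampled value, an approximation from below.
    """
    best_x, best_val = a, f(a)
    for k in range(1, n + 1):
        x = a + (b - a) * k / n
        val = f(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Illustrative test: f(x) = x*sin(x) on [0, 10] has its maximum near x ≈ 7.98
print(approx_max(lambda x: x * math.sin(x), 0.0, 10.0))  # ≈ (7.979, 7.917)
```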
Exercise 16.1 Complete the proof of the extreme value theorem by showing that the infimum of a function on a closed interval is also realized as its value at a point.
16.2 Intermediate Values
The intermediate value theorem is the rigorous version of “you can draw the graph of a continuous function without picking up your pencil”.
One note: in the statement below we use the phrase \(y\) is between \(f(a)\) and \(f(b)\) as a shorthand to mean that either \(f(a)<y<f(b)\) or \(f(b)<y<f(a)\) (as we don’t know if \(f(a)\) or \(f(b)\) is larger).
Theorem 16.2 (The Intermediate Value Theorem) Let \(f\) be a continuous function on the interval \([a,b]\), and let \(y\) be any number between \(f(a)\) and \(f(b)\). Then there exists an \(x\) between \(a\) and \(b\) such that \(y=f(x)\).
Proof. Without loss of generality we will assume that \(f(a)<f(b)\) so that \(y\) lies in the interval \([f(a),f(b)]\) (the other case is analogous, we just instead must write the interval \([f(b),f(a)]\)). We wish to find a point \(x\in[a,b]\) where \(f(x)=y\), so we start by defining the set of points where \(f(x)\) is less than or equal to \(y\):
\[S=\{x\in[a,b]\mid f(x)\leq y\}\]
This set is nonempty: \(a\in S\) as \(f(a)<y\) by assumption. And it is bounded above by \(b\): if \(x\in S\) then \(x\in[a,b]\) so \(x\leq b\) by definition. Thus, the supremum \(\sigma = \sup S\) exists, and \(\sigma\in[a,b]\). We will show that \(f(\sigma)=y\), by showing both inequalities \(f(\sigma)\leq y\) and \(f(\sigma)\geq y\).
First, we show \(\leq\). Since \(\sigma\) is the supremum, for each \(n\) we know that \(\sigma-\frac{1}{n}\) is not an upper bound, and so there must be a point \(x_n\in S\) with \(x_n\in(\sigma-1/n,\sigma]\), meaning \(f(x_n)\leq y\). The squeeze theorem assures that \(x_n\to\sigma\), and the continuity of \(f\) assures that \(f(x_n)\to f(\sigma)\). But for all \(n\) we know \(f(x_n)\leq y\), so by the inequalities of limits we also know \(\lim f(x_n)=f(\sigma)\leq y\).
Next, we show \(\geq\). First note that \(\sigma\neq b\) as \(f(\sigma)\leq y\) but \(f(b)>y\). So, \(\sigma< b\) and so after truncating finitely many terms, the sequence \(x_n=\sigma+1/n\) lies strictly between \(\sigma\) and \(b\). Since this sequence is greater than the upper bound \(\sigma\), we know that none of the \(x_n\) are in \(S\) and so \(f(x_n)>y\) by definition, for all \(n\). But as \(n\to\infty\) the sequence of \(x_n\)’s is squeezed to converge to \(\sigma\), and so by continuity we know \[f(\sigma)=f(\lim x_n)=\lim f(x_n)\] Applying the inequalities of limits this time yields the reverse: since for all \(n\) we know \(f(x_n)>y\), it follows that \(\lim f(x_n)\geq y\) so \(f(\sigma)\geq y\).
Putting these together we know that \(f(\sigma)\) is some number which must simultaneously be \(\geq y\) and \(\leq y\). The only number satisfying both of these inequalities is \(y\) itself, so \[f(\sigma)=y\]
Historically, the intermediate value theorem was one of the reasons for developing much of analysis: mathematicians knew that whatever the correct formal definition of continuity was, it should certainly imply this! So, our proof of the intermediate value theorem (which embodies the intuitive notion of continuity) may be seen as evidence that we have chosen good definitions of continuity and convergence: they work as we expect!
Remark 16.1. It may seem at first that this is EQUIVALENT to continuity: if a function satisfies the intermediate value property, then it is continuous. Try to prove it! Where do you get stuck?
Example 16.1 Consider the following function \[f(x)=\begin{cases} \sin\left(\frac{1}{x}\right)& x\neq 0\\ 0 & x=0 \end{cases}\]
Then \(f\) satisfies the conclusion of the intermediate value theorem on every closed interval, but \(f\) is not continuous at \(0\).
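The reason is that on any interval \((0,\delta)\), however small, the quantity \(1/x\) sweeps through infinitely many full periods, so \(\sin(1/x)\) already attains every value in \([-1,1]\) there. A quick numerical sketch (the target value, \(\delta\), and the integer \(n\) below are arbitrary illustrative choices):

```python
import math

# On any tiny interval (0, delta), sin(1/x) attains every value in [-1, 1]:
# for a target y, the points x = 1/(asin(y) + 2*pi*n) satisfy sin(1/x) = y,
# and they march down toward 0 as the integer n grows.
delta, y = 0.001, 0.5
n = 1000  # large enough that the solution lands inside (0, delta)
x = 1 / (math.asin(y) + 2 * math.pi * n)
print(x < delta, math.sin(1 / x))  # True, approximately 0.5
```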
16.2.1 Applications of the IVT
The intermediate value theorem has many applications, as it is often the case that we know information about a function at several points, and want to infer information about its value at others. One immediate application is a way of finding roots:
Corollary 16.1 If \(f\) is a continuous function on an interval and it is positive at one endpoint and negative at the other, then \(f\) has a zero in-between.
This suggests a means of finding the zeros of a function which narrows in on them exponentially fast, called bisection: find any two points where the function changes sign, evaluate the function at the midpoint of the interval between them, keep whichever half still has endpoints of opposite sign, and repeat. Each step halves the length of an interval known to contain a zero. A sketch of this procedure appears below.
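Here is a minimal Python sketch of bisection as just described; the function name `bisect`, the tolerance, and the sample polynomial are illustrative choices.

```python
def bisect(f, a, b, tol=1e-12):
    """Locate a zero of a continuous f on [a, b] by repeated bisection.

    Assumes f(a) and f(b) have opposite signs, so Corollary 16.1
    guarantees a zero in between; each step halves the interval, so the
    uncertainty shrinks exponentially fast.
    """
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:     # sign change in [a, m]: keep the left half
            b = m
        else:                # otherwise the sign change is in [m, b]
            a, fa = m, fm
    return (a + b) / 2

# Illustrative use: a zero of x^3 - 2x - 5 between 2 and 3
print(bisect(lambda x: x**3 - 2*x - 5, 2.0, 3.0))  # ≈ 2.0945514815...
```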
Second, this lets us understand something about the range of continuous functions:
Corollary 16.2 If \(f\) is a continuous function and \(I\subset\RR\) is a closed interval, then \(f(I)\) is an interval.
Here we allow the degenerate case \([a,a]=\{a\}\) to count as an interval, if \(f\) is constant.
Another application is to prove the existence of certain inverse functions - we will look here at the example of roots. Of course, we already have a rigorous argument for the existence of \(\sqrt{x}\) for any nonnegative \(x\), but this argument was quite low-level: working directly with the definition of supremum and the Archimedean property! Now that we have built up more machinery, we can re-prove the same result in a much cleaner way:
Example 16.2 For every \(v> 0\) there exists a positive \(u\) with \(u^2=v\): we call this the square root \(u=\sqrt{v}\).
Proof. Let \(v>0\) and consider the function \(f(x)=x^2-v\). This function is continuous, and at \(x=0\) this function is negative, so all we need to do is find a point where the function is positive to be able to apply the IVT. Note \(f(v+1)=(v+1)^2-v=v^2+v+1\) is positive: thus there must be some point \(u\in[0,v+1]\) such that \(u^2=v\), as required.
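In practice this argument is also effectively constructive: running the `bisect` sketch from the previous subsection on \(f(x)=x^2-v\) over \([0,v+1]\) approximates \(\sqrt{v}\). (This snippet assumes that earlier illustrative `bisect` function is in scope; it is not a standard library routine.)

```python
# Approximating sqrt(2): f(x) = x^2 - 2 is negative at 0 and positive at 3,
# so bisection on [0, 3] closes in on the root guaranteed by the IVT.
v = 2.0
print(bisect(lambda x: x**2 - v, 0.0, v + 1.0))  # ≈ 1.4142135623...
```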
Exercise 16.2 For any \(x\in\RR\) there exists a unique \(a\) with \(a^3=x\): we call this \(a\) the cube root.
Theorem 16.3 For every \(x\geq 0\) and every \(n\in\NN\) there exists a unique nonnegative number \(y\) such that \(y^n=x\).
16.2.2 Fixed Points
Another application of the intermediate value theorem beyond finding roots is to prove various fixed point theorems, which guarantee that, under certain conditions on a function \(f\), there is a point \(x\) with \(f(x)=x\).
Example 16.3 (A Fixed Point Theorem) Let \(f\colon [0,1]\to [0,1]\) be any continuous function. Then there exists a fixed point: an \(x\in[0,1]\) where \(f(x)=x\).
Example 16.4 There is a solution to the equation \(\cos(x)=x\).
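A quick numerical illustration: iterating \(x\mapsto\cos(x)\) homes in on the solution. (This sketch uses fixed-point iteration, whose convergence is a bonus fact about cosine not proved in this chapter; what the IVT guarantees is that the solution exists, since \(g(x)=\cos(x)-x\) is positive at \(0\) and negative at \(1\).)

```python
import math

# Iterating x -> cos(x) converges to the unique solution of cos(x) = x.
# The IVT applied to g(x) = cos(x) - x on [0, 1] guarantees this solution
# exists; the iteration below merely locates it numerically.
x = 0.0
for _ in range(100):
    x = math.cos(x)
print(x)  # ≈ 0.7390851332..., the fixed point of cosine
```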
Exercise 16.3 Prove that every cubic polynomial has a real root. Hint: show it is enough to consider monic cubics \(p(x)=x^3+ax^2+bx+c\). Can you prove there is some number \(M\) where \(p(M)\) is positive but \(p(-M)\) is negative?
16.3 Uniform Continuity
There is one final tool that often proves useful when working with continuous functions, related to the \(\epsilon\)-\(\delta\) definition. That definition specifies that given an \(\epsilon\), at each \(x\) where \(f\) is continuous you can find a \(\delta\), but it does not give any information about how to do so, meaning we cannot say anything about whether the \(\delta\)’s at nearby points are related.
Thus, just looking at the definition, one may be tempted to also write down a stronger alternative, which says that you can use the same \(\delta\) at every point:
Definition 16.1 (Uniform Continuity) A function \(f\) is uniformly continuous on a domain \(D\) if for every \(\epsilon>0\) there exists a *single* \(\delta>0\) that can be used at every point in the continuity definition: \[\forall a\in D,\,\forall x\in D,\quad |x-a|<\delta\implies |f(x)-f(a)|<\epsilon\]
Sometimes when proving continuity using \(\epsilon\)-\(\delta\), it is easy to see directly that a function is in fact uniformly continuous: when playing the \(\epsilon\)-\(\delta\) game, it is easy to pick an expression for \(\delta\) that doesn’t depend on \(a\):
Example 16.5 \(f(x)=x^2\) is uniformly continuous on the interval \([1,3]\).
Here’s some scratch work: let \(\epsilon>0\). Then at any \(a\) we see that \(|f(x)-f(a)|=|x^2-a^2|=|x+a||x-a|\). If \(|x-a|<\delta\) and we want \(|f(x)-f(a)|<\epsilon\), this tells us that we want \[|x+a|\delta<\epsilon\] We don’t know what \(x\) and \(a\) are, but we do know they are points in the interval \([1,3]\)! So, the smallest \(x+a\) could be is \(1+1=2\), and the biggest is \(3+3=6\). This means that \[|x+a|\delta \leq 6\delta\] So, if we can make \(6\delta\leq\epsilon\), we are good! This is totally possible: just set \(\delta=\epsilon/6\). Below is the rigorous proof.
Proof. Let \(\epsilon>0\), and set \(\delta=\epsilon/6\). Note that for any \(a\in [1,3]\) and any \(x\in[1,3]\) within \(\delta\) of \(a\), we know \(a\leq 3\) and \(x\leq 3\) so \(x+a\leq 6\). But this implies that \[|x^2-a^2|=|x+a||x-a|\leq 6|x-a|<6\delta=6\frac{\epsilon}{6}=\epsilon\] And so \(f\) is uniformly continuous, as this single choice of \(\delta\) works for every point \(a\in [1,3]\).
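As a sanity check, one can confirm numerically that this single \(\delta=\epsilon/6\) works across the whole interval; here is a minimal sketch (the choice of \(\epsilon\) and the grid of sample points are arbitrary):

```python
import itertools

eps = 0.1
delta = eps / 6
# Sample many pairs (a, x) in [1, 3] with |x - a| < delta and check that
# the single delta = eps/6 keeps |x^2 - a^2| below eps for every pair.
grid = [1 + 2 * k / 500 for k in range(501)]
violations = [
    (a, x)
    for a, x in itertools.product(grid, repeat=2)
    if abs(x - a) < delta and abs(x**2 - a**2) >= eps
]
print(len(violations))  # 0: no sampled pair violates the bound
```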
Intuitively, what this means is that our function \(f\) cannot vary too much over any fixed interval: we can use the same error bar at any point to control the total change in \(y\) values! But of course, this is not true for every continuous function - functions that change arbitrarily quickly (say, by having a vertical asymptote, or an accumulation of oscillations) require smaller and smaller choices of \(\delta\) as one approaches the area where the function is ‘behaving badly’.
Example 16.6 \(f(x)=\frac{1}{x}\) is not uniformly continuous on the interval \((0,\infty)\).
Again, let’s start with some scratch work. We can easily see (via the sequence definition of continuity) that \(f\) is continuous on \((0,\infty)\). But let’s run through the \(\epsilon\)-\(\delta\) argument anyway, to see where a single choice of \(\delta\) breaks down:
Looking at \(|f(x)-f(a)|\) when \(|x-a|<\delta\), we can do some algebra to see \[\left|\frac{1}{x}-\frac{1}{a}\right|=\left|\frac{a-x}{xa}\right|<\frac{\delta}{xa}\]
We want to make this less than \(\epsilon\), and we know that \(x\) is within \(\delta\) of \(a\), so (assuming \(\delta<a\)) the smallest \(x\) could be is \(a-\delta\). Thus it suffices to arrange \(\frac{\delta}{a(a-\delta)}\leq\epsilon\), and we can solve this for \(\delta\):
\[\delta=\frac{a^2\epsilon}{1+a\epsilon}\]
This gives us, for each \(a\), a different \(\delta\). What we might like to do is pick the minimum of these \(\delta\)’s - that would work for all \(a\) at once! But here we have a problem - there is no positive minimum: as \(a\to 0\), the \(\delta\) we need to choose goes to zero as well, so no single positive \(\delta\) works on all of \((0,\infty)\).
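To make the failure concrete, here is a short sketch (the sequences chosen are just one convenient witness): the points \(a_n=1/n\) and \(x_n=1/(2n)\) get arbitrarily close together, yet \(|f(x_n)-f(a_n)|=n\) grows without bound, so no single \(\delta\) can work even for \(\epsilon=1\).

```python
def f(x):
    return 1 / x

# Witnessing the failure of uniform continuity on (0, infinity): the pairs
# a_n = 1/n and x_n = 1/(2n) get arbitrarily close together, while
# |f(x_n) - f(a_n)| = 2n - n = n grows without bound.
for n in [10, 100, 1000, 10000]:
    a, x = 1 / n, 1 / (2 * n)
    print(f"|x - a| = {abs(x - a):.1e},   |f(x) - f(a)| = {abs(f(x) - f(a)):.0f}")
# No single delta can keep the outputs within (say) epsilon = 1 of each other.
```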
Example 16.7 The function \(f(x)=\sin\left(\frac{1}{x}\right)\) is continuous, but is not uniformly continuous on \((0,\infty)\), or on any interval of the form \((0,L)\). As \(a\) gets closer to \(0\), one must choose smaller and smaller \(\delta\)s to keep the oscillation of \(\sin(1/x)\) less than \(\epsilon\).
Both of these functions have problems stemming from a function misbehaving at the boundary of an open interval, as when approaching this endpoint our allowed choices of \(\delta\) go to zero. A natural question is - is this the only problem that can occur? That is, if we have a function defined on a closed interval, can we always pick a uniform \(\delta\)?
Indeed we can! This tells us that on closed intervals, continuous functions are even better behaved than we originally knew: they must be uniformly continuous.
Theorem 16.4 If \(f\) is continuous on a closed interval \(I\), then \(f\) is uniformly continuous on \(I\).
Proof. Assume for the sake of contradiction that \(f\) is not uniformly continuous. Then there is some \(\epsilon>0\) for which no fixed \(\delta\) works: for any proposed \(\delta\), there must be some \(a\) where it fails.
We can use this to produce a sequence: for \(\delta=1/n\) let \(a_n\in I\) be a point where this \(\delta\) fails: there is some \(x_n\in I\) within \(1/n\) of \(a_n\) but \(|f(x_n)-f(a_n)|>\epsilon\).
Thus, in fact we have two sequences \(x_n\) and \(a_n\)! We know very little about either except that they are in a closed interval \(I\), so we can apply Bolzano Weierstrass to get convergent subsequences (we have to be a bit careful here, see the exercise below).
We will call the subsequences \(X_n\) and \(A_n\) (with capital letters). Now that we know they both converge, we can see that they also have the same limit, as by construction \(|X_n-A_n|<\frac{1}{n}\to 0\). Call that limit \(L\); since \(I\) is closed, \(L\in I\).
Then since \(f\) is continuous at \(L\), we know that \[\lim f(X_n)=f(\lim X_n)=f(L)=f(\lim A_n)=\lim f(A_n)\] Thus, \(\lim \left(f(X_n)-f(A_n)\right)=0\). However this is impossible, since for all values of \(n\) we know \(|f(X_n)-f(A_n)|>\epsilon\)! This is a contradiction, and thus there must have been some uniform \(\delta\) that worked all along.
Exercise 16.4 Let \(x_n\) and \(y_n\) be two bounded sequences. Show that it is possible to choose some subsequence of the indices \(n_k\) such that the subsequences \(x_{n_k}\) and \(y_{n_k}\) both converge.
(Note we can’t just apply Bolzano Weierstrass individually to \(x_n\) and \(y_n\): what if doing so gives a convergent subsequence of \(x_n\) along the even indices, but a convergent subsequence of \(y_n\) along the odd indices?)
Exercise 16.5 (Uniform Continuity and \(\RR\):) We know that if \(f\) is continuous on any closed interval it is uniformly continuous, but it is also possible that functions on open or infinite intervals are uniformly continuous. Show this by example, confirming that \[f(x)=\frac{1}{1+x^2}\] is uniformly continuous on the entire real line.
Hint: try to simplify and overestimate the quantity \(f(x)-f(a)\): remember that \(1+x^2\) and \(1+a^2\) are always \(\geq 1\)!