16  Theory

Highlights of this Chapter: we prove two foundational results about continuous functions whose proofs have several steps in common:

  • The Extreme Value Theorem: a continuous function achieves a max and min on any closed interval.
  • The Intermediate Value Theorem: a continuous function must take every value between f(a) and f(b) on the interval [a,b].

We will call the proof style introduced with these theorems “proof by continuity”. Finally, we investigate one further topic - uniform continuity - where this proof strategy also helps, and prove that a continuous function on a closed interval is uniformly continuous.

Here we cover some of the important theorems about continuous functions that will prove useful during the development of calculus. Just as we have seen various ‘proof styles’ for sequences (recurrent themes in proofs, like ‘an ϵ/2 argument’), one of the biggest takeaways of this section is a proof technique for working with continuous functions. It has three steps, summarized below:

  • Use the hypotheses to construct a sequence of points in the closed interval.
  • Use Bolzano-Weierstrass to extract a convergent subsequence, whose limit again lies in the closed interval.
  • Apply f to the subsequence and use continuity to conclude that its images converge to the value of f at the limit.

This is too vague on its own to be useful, but in reading the proofs of the boundedness theorem, the extreme value theorem, and the intermediate value theorem below, look out for these three recurrent steps.

16.1 Extreme Values

Proposition 16.1 Let f be a continuous function on a closed interval [a,b]. Then the image f([a,b]) is bounded.

Proof. Assume for the sake of contradiction that f is not bounded. Then for each n∈ℕ there must be some x_n∈[a,b] where |f(x_n)|>n. This sequence {x_n} need not be convergent, but it lies in the interval [a,b] so it is bounded, and thus contains a convergent subsequence x_{n_k} by Bolzano-Weierstrass. Say x_{n_k}→x. Then since a≤x_{n_k}≤b for all k, by the inequalities of limits we see a≤x≤b, so the limit x lies in the interval [a,b] as well.

But what is the value f(x)? Since f is continuous and x_{n_k}→x, we know that f(x_{n_k})→f(x). But for each k, x_{n_k} has the property that |f(x_{n_k})|>n_k≥k by definition. Thus the sequence f(x_{n_k}) is not bounded, and cannot be convergent (since all convergent sequences are bounded). This is a contradiction: continuity forces f(x_{n_k}) to converge to f(x), which is defined because f is defined on the entire interval [a,b].

Thus, no such sequence x_n is possible, and so there must be some n where |f(x)|≤n for all x∈[a,b]. That is, f must be bounded on [a,b].

Building off this result, one can prove that a continuous function actually achieves its upper and lower bounds on any closed interval. This result will play a role several times across the theory of functions and derivatives, so we give it a memorable name: the extreme value theorem (as maxima and minima taken collectively are called extrema).

Theorem 16.1 (The Extreme Value Theorem) Let f be a continuous function on a closed interval [a,b]. Then f achieves a maximum and minimum value: that is, there exists a point p∈[a,b] where f(p)≥f(x) for all x∈[a,b], and a point q∈[a,b] where f(q)≤f(x) for all x∈[a,b].

Proof. Let f be continuous on [a,b] and let R={f(x) ∣ x∈[a,b]} be the set of outputs, or the range of f. By Proposition 16.1 f is bounded, so R is a nonempty bounded subset of ℝ, and so by completeness m=inf R and M=sup R must exist. Our goal is to find values x_m,x_M∈[a,b] for which the infimum and supremum are realized: f(x_m)=m and f(x_M)=M.

Here we show this holds for the supremum; the infimum is left as an exercise below. Since M is the supremum, for any ϵ>0 we know that M−ϵ is not an upper bound for R: thus there must be some x∈[a,b] where f(x)>M−ϵ. So letting ϵ=1/n for each n, let x_n be a point where M−1/n<f(x_n)≤M. As n→∞ we know M−1/n→M, and so by the squeeze theorem we see that f(x_n)→M as well.

We don’t know that the points x_n themselves converge, but we do know that this entire sequence lies inside the closed interval [a,b], so it’s bounded and Bolzano-Weierstrass lets us extract a convergent subsequence x_{n_k}→x. And as a≤x_{n_k}≤b it follows that the limit x∈[a,b] as well. Because subsequences of a convergent sequence converge to the same limit, we know that f(x_{n_k}) is convergent, and still has limit M. But now we can finally use continuity!

Since f is continuous, we know lim f(x_{n_k})=f(lim x_{n_k}), and so M=f(x). Thus we managed to find a point x∈[a,b] where f(x) is the supremum: f(x) is an upper bound for all possible values of f on [a,b], which by definition means it’s the max value! So f achieves a maximum on [a,b].

Exercise 16.1 Complete the proof of the extreme value theorem by showing that the infimum of a function on a closed interval is also realized as its value at a point.

16.2 Intermediate Values

The intermediate value theorem is the rigorous version of “you can draw the graph of a continuous function without picking up your pencil”.

One note: in the statement below we use the phrase y is between f(a) and f(b) as a shorthand to mean that either f(a)<y<f(b) or f(b)<y<f(a) (as we don’t know if f(a) or f(b) is larger).

Theorem 16.2 (The Intermediate Value Theorem) Let f be a continuous function on the interval [a,b], and let y be any number between f(a) and f(b). Then there exists an x between a and b such that y=f(x).

Proof. Without loss of generality we will assume that f(a)<f(b), so that f(a)<y<f(b) (the other case is analogous, with the inequalities reversed). We wish to find a point x∈[a,b] where f(x)=y, so we start by defining the set of points where f(x) is less than or equal to y:

S={x∈[a,b] ∣ f(x)≤y}

This set is nonempty: a∈S as f(a)<y by assumption. And it’s bounded above by b: if x∈S then x∈[a,b], so x≤b by definition. Thus the supremum σ=sup S exists, and σ∈[a,b]. We will show that f(σ)=y, by showing both inequalities f(σ)≤y and f(σ)≥y.

First, we show f(σ)≤y. Since σ is the supremum, for each n we know that σ−1/n is not an upper bound for S, and so there must be a point x_n∈S with σ−1/n<x_n≤σ; in particular f(x_n)≤y. The squeeze theorem assures that x_n→σ, and the continuity of f assures that f(x_n)→f(σ). But for all n we know f(x_n)≤y, so by the inequalities of limits we also know f(σ)=lim f(x_n)≤y.

Next, we show f(σ)≥y. First note that σ≠b, as f(σ)≤y but f(b)>y. So σ<b, and after truncating finitely many terms, the sequence x_n=σ+1/n lies strictly between σ and b. Since each x_n is greater than the upper bound σ, none of the x_n are in S, and so f(x_n)>y for all n. But as n→∞ the sequence x_n converges to σ, and so by continuity we know f(σ)=f(lim x_n)=lim f(x_n). Applying the inequalities of limits this time yields the reverse: since for all n we know f(x_n)>y, it follows that lim f(x_n)≥y, so f(σ)≥y.

Putting these together, we know that f(σ) is some number which must simultaneously be ≤y and ≥y. The only number satisfying both of these inequalities is y itself, so f(σ)=y. Finally, σ lies strictly between a and b, since f(a)<y<f(b) rules out σ=a and σ=b.

Historically, the intermediate value theorem was one of the reasons for developing much of analysis: mathematicians knew that whatever the correct formal definition of continuity was, it should certainly imply this! So, our proof of the intermediate value theorem (which embodies the intuitive notion of continuity) may be seen as evidence that we have chosen good definitions of continuity and convergence: they work as we expect!

Remark 16.1. It may seem at first that this is equivalent to continuity: that if a function satisfies the intermediate value property, then it’s continuous. Try to prove it! Where do you get stuck?

Example 16.1 Consider the following function: f(x)=sin(1/x) if x≠0, and f(x)=0 if x=0.

Then f satisfies the conclusion of the intermediate value theorem on every closed interval, but f is not continuous at 0.
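
To see concretely why f fails to be continuous at 0, one can exhibit a sequence converging to 0 along which f stays at 1. The particular sequence x_n below is one standard choice, picked here for illustration rather than taken from the text:

```latex
\[
x_n = \frac{1}{\frac{\pi}{2} + 2\pi n} \to 0,
\qquad
f(x_n) = \sin\!\left(\frac{\pi}{2} + 2\pi n\right) = 1
\quad\text{for all } n,
\]
\[
\text{so}\quad \lim_{n\to\infty} f(x_n) = 1 \neq 0 = f(0).
\]
```

By the sequence definition of continuity, f is therefore not continuous at 0.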

16.2.1 Applications of the IVT

The intermediate value theorem has many applications, as it is often the case that we know information about a function at several points, and want to infer information about its value at others. One immediate application is a way of finding roots:

Corollary 16.1 If f is a continuous function on a closed interval [a,b] and it is positive at one endpoint and negative at the other, then f has a zero in between.

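This suggests a means of finding the zeros of a function, called “bisection”, which narrows in on them exponentially fast. Find any two points where the function has opposite signs, and evaluate the function at the midpoint of the interval between them. Keep whichever half-interval still has endpoints of opposite sign (or stop, if the midpoint is itself a zero), and repeat: each step halves the width of the interval known to contain a zero.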
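
The bisection procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a robust root finder: the function name `bisect`, the tolerance `tol`, and the test function x²−2 are all choices made here for the example, and the code assumes f is continuous with opposite signs at the two endpoints.

```python
from typing import Callable


def bisect(f: Callable[[float], float], a: float, b: float, tol: float = 1e-12) -> float:
    """Approximate a zero of a continuous f on [a, b].

    Assumes f(a) and f(b) have opposite signs, so Corollary 16.1 applies.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa < 0) == (fb < 0):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fmid = f(mid)
        if fmid == 0:
            return mid
        # Keep the half whose endpoints still have opposite signs.
        if (fa < 0) != (fmid < 0):
            b, fb = mid, fmid
        else:
            a, fa = mid, fmid
    return (a + b) / 2


# Example: the positive zero of x^2 - 2 on [0, 2], i.e. the square root of 2.
root = bisect(lambda x: x * x - 2, 0.0, 2.0)
```

After k steps the bracketing interval has width (b−a)/2^k, which is the sense in which the method converges exponentially fast.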

Second, this lets us understand something about the range of continuous functions:

Corollary 16.2 If f is a continuous function and I⊂ℝ is a closed interval, then f(I) is an interval.

Here we allow the degenerate case [a,a]={a} to count as an interval, if f is constant.

Another application is to prove the existence of certain inverse functions - we will look here at the example of roots. Of course, we already have a rigorous argument for the existence of √x for any nonnegative x, but this argument was quite low-level: working directly with the definition of supremum and the Archimedean property! Now that we have built up more machinery, we can re-prove the same result in a much cleaner way:

Example 16.2 For every v>0 there exists a positive u with u²=v: we call this the square root u=√v.

Proof. Let v>0 and consider the function f(x)=x²−v. This function is continuous, and at x=0 it is negative, so all we need to do is find a point where the function is positive to be able to apply the IVT. Note f(v+1)=(v+1)²−v=v²+v+1 is positive: thus there must be some point u∈[0,v+1] with f(u)=0, that is, u²=v. Since f(0)≠0 this u is positive, as required.

Exercise 16.2 For any x∈ℝ there exists a unique a with a³=x: we call this a the cube root.

Theorem 16.3 For every x≥0 and every positive integer n there exists a unique nonnegative number y such that y^n=x.

16.2.2 Fixed Points

Another application of the intermediate value theorem beyond finding roots is to prove various fixed point theorems which guarantee that, under certain conditions on a function f there is a point x with f(x)=x.

Example 16.3 (A Fixed Point Theorem) Let f:[0,1]→[0,1] be any continuous function. Then there exists a fixed point: an x∈[0,1] where f(x)=x.
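
A sketch of why this holds, using an auxiliary function g (a name introduced here for illustration) together with Corollary 16.1:

```latex
\begin{align*}
g(x) &= f(x) - x \quad \text{is continuous on } [0,1],\\
g(0) &= f(0) - 0 \ge 0, \qquad g(1) = f(1) - 1 \le 0.
\end{align*}
```

If g(0)=0 or g(1)=0, then that endpoint is already a fixed point. Otherwise g(0)>0>g(1), so g has a zero x in between, and g(x)=0 says exactly that f(x)=x.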

Example 16.4 There is a solution to the equation cos(x)=x.
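
As a numerical illustration, one can look for the solution by bisecting on g(x)=cos(x)−x over [0,1], where g(0)=1>0 and g(1)=cos(1)−1<0. The helper name `g`, the bracket [0,1], and the fixed count of 60 halvings are choices made for this sketch, not part of the text above.

```python
import math


def g(x: float) -> float:
    """g(x) = cos(x) - x; a zero of g is a solution of cos(x) = x."""
    return math.cos(x) - x


lo, hi = 0.0, 1.0  # g(lo) > 0 and g(hi) < 0, so a zero lies in between
for _ in range(60):
    mid = (lo + hi) / 2
    # g is positive to the left of the zero on this bracket, negative to the right.
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid

fixed_point = (lo + hi) / 2  # approximately 0.739
```

The existence of the solution is what the intermediate value theorem guarantees; the loop only narrows down where it is.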

Exercise 16.3 Prove that every cubic polynomial has a real root. Hint: show it’s enough to consider monic cubics p(x)=x³+ax²+bx+c. Can you prove there is some number M where p(M) is positive but p(−M) is negative?

16.3 Uniform Continuity

There is one final tool that often proves useful when working with continuous functions, related to the ϵ-δ definition. That definition specifies that given an ϵ, at each x where f is continuous you can find a δ, but does not give any information about how to do so, meaning we cannot say anything about if the δ’s at nearby points are related.

Thus, just looking at the definition, one may be tempted to also write down a stronger alternative, which says that you can use the same δ at every point:

Definition 16.1 (Uniform Continuity) A function f is uniformly continuous on a domain D if for every ϵ>0 there exists a single δ>0 that can be used at every point in the continuity definition: for all a∈D and all x∈D, |x−a|<δ ⟹ |f(x)−f(a)|<ϵ.

Sometimes when proving continuity using ϵ-δ, it’s easy to see directly that a function is in fact uniformly continuous, because when playing the ϵ-δ game it’s easy to pick an expression for δ that doesn’t depend on a:

Example 16.5 f(x)=x² is uniformly continuous on the interval [1,3].

Here’s some scratch work: let ϵ>0. Then at any a we see that |f(x)−f(a)|=|x²−a²|=|x+a||x−a|. If |x−a|<δ and we want |f(x)−f(a)|<ϵ, this tells us that we want |x+a|δ≤ϵ. We don’t know what x and a are, but we do know they are points in the interval [1,3]! So the smallest x+a could be is 1+1=2, and the biggest is 3+3=6. This means that |x+a|δ≤6δ. So if we can make 6δ≤ϵ, we are good! This is totally possible: just set δ=ϵ/6. Below is the rigorous proof.

Proof. Let ϵ>0, and set δ=ϵ/6. Note that for any a∈[1,3] and any x∈[1,3] within δ of a, we know a≤3 and x≤3, so |x+a|=x+a≤6. But this implies that |x²−a²|=|x+a||x−a|≤6|x−a|<6δ=6·(ϵ/6)=ϵ. And so f is uniformly continuous, as this single choice of δ works for every point a∈[1,3].
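
A quick numerical sanity check of the choice δ=ϵ/6 is sketched below. It samples random pairs in [1,3] rather than proving anything, and the function name `delta_works`, the sample size, and the 0.99 safety factor (which keeps |x−a| strictly below δ despite rounding) are all assumptions of this illustration.

```python
import random


def f(x: float) -> float:
    return x * x


def delta_works(eps: float, trials: int = 100_000, seed: int = 0) -> bool:
    """Check on random samples that |x - a| < eps/6 forces |f(x) - f(a)| < eps on [1, 3]."""
    rng = random.Random(seed)
    delta = eps / 6
    for _ in range(trials):
        a = rng.uniform(1.0, 3.0)
        # Pick x within 0.99 * delta of a, then clamp it back into [1, 3].
        # Clamping can only move x closer to a, so |x - a| < delta still holds.
        x = a + 0.99 * delta * rng.uniform(-1.0, 1.0)
        x = min(3.0, max(1.0, x))
        if abs(f(x) - f(a)) >= eps:
            return False
    return True


all_ok = delta_works(0.5) and delta_works(1e-3)
```

The same δ is used at every sampled point a, which is exactly what distinguishes uniform continuity from pointwise continuity.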

Intuitively, what this means is that our function f cannot vary too much over any interval of a fixed width: we can use the same error bar at any point to control the total change in y values! But of course, this is not true for every continuous function - functions that change arbitrarily quickly (say, by having a vertical asymptote, or an accumulation of oscillations) require smaller and smaller choices of δ as one approaches the area where the function is ‘behaving badly’.

Example 16.6 f(x)=1/x is not uniformly continuous on the interval (0,∞).

Again, let’s start with some scratch work. First, notice that we can easily see (via the sequence definition of continuity) that f is continuous on (0,∞). But let’s actually run the ϵ-δ argument to see what happens:

Looking at |f(x)−f(a)| we can do some algebra to see |1/x−1/a|=|a−x|/(xa)<δ/(xa).

We want to make this less than ϵ, and we know that x is within δ of a (so the smallest it could be is a−δ). Thus it suffices to have δ/(a(a−δ))≤ϵ, and we can solve this for the largest such δ:

δ=a²ϵ/(1+aϵ)

This gives us, for each a, a different δ. What we might like to do is to pick the minimum δ - that would work for all a! But here we have a problem - there is no minimum: as a→0, the δ we need to choose goes to zero as well.
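
The shrinking of δ can be seen numerically, along with an explicit family of witness pairs. The sketch below is illustrative only: the helper names `delta_for` and `gap`, and the pairs x=1/n, a=1/(n+1), are choices made here rather than part of the scratch work above.

```python
def delta_for(a: float, eps: float) -> float:
    """The delta from the scratch work: a^2 * eps / (1 + a * eps)."""
    return a * a * eps / (1 + a * eps)


eps = 0.5
# The delta needed at a shrinks toward 0 as a approaches 0.
deltas = [delta_for(a, eps) for a in (1.0, 0.1, 0.01, 0.001)]


def gap(n: int) -> tuple[float, float]:
    """Return (|x - a|, |1/x - 1/a|) for the pair x = 1/n, a = 1/(n + 1)."""
    x, a = 1 / n, 1 / (n + 1)
    return abs(x - a), abs(1 / x - 1 / a)


input_gap, output_gap = gap(1000)
```

For these pairs the inputs differ by 1/(n(n+1)), which tends to 0, while the outputs always differ by 1. So for any ϵ<1 no single δ can work at every point, which is the failure of uniform continuity.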

Example 16.7 The function f(x)=sin(1/x) is continuous, but is not uniformly continuous on (0,∞), or on any interval of the form (0,L). As a gets closer to 0, one must choose smaller and smaller δs to keep the oscillation of sin(1/x) less than ϵ.

Both of these functions have problems stemming from a function misbehaving at the boundary of an open interval, as when approaching this endpoint our allowed choices of δ go to zero. A natural question is - is this the only problem that can occur? That is, if we have a function defined on a closed interval, can we always pick a uniform δ?

Indeed we can! This tells us that on closed intervals, continuous functions are even better behaved than we originally knew: they must be uniformly continuous.

Theorem 16.4 If f is continuous on a closed interval I=[a,b], then f is uniformly continuous on I.

Proof. Assume for the sake of contradiction that f is not uniformly continuous. Then there is some ϵ>0 for which no single δ works: for any proposed δ>0, there must be some point a where it fails.

We can use this to produce a sequence: for δ=1/n let a_n∈I be a point where this δ fails, so there is some x_n∈I within 1/n of a_n but with |f(x_n)−f(a_n)|≥ϵ.

Thus, in fact we have two sequences x_n and a_n! We know very little about either except that they are in a closed interval I, so we can apply Bolzano-Weierstrass to get convergent subsequences along a common set of indices (we have to be a bit careful here, see the exercise below).

We will call the subsequences X_n and A_n (with capital letters). Now that we know they both converge, we can see that they also have the same limit, as by construction |X_n−A_n|<1/n. Call that limit L; since I is closed, L∈I.

Then since f is continuous at L, we know that lim f(X_n)=f(lim X_n)=f(L)=f(lim A_n)=lim f(A_n). Thus lim (f(X_n)−f(A_n))=0. However this is impossible, since for all values of n we know |f(X_n)−f(A_n)|≥ϵ! This is a contradiction, and thus there must have been some uniform δ that worked all along.

Exercise 16.4 Let x_n and y_n be two bounded sequences. Show that it is possible to choose some subsequence of the indices n_k such that the subsequences x_{n_k} and y_{n_k} both converge.

(Note we can’t apply Bolzano-Weierstrass individually to x_n and y_n: what if that gives you that the even subsequence of x_n is convergent, and the odd subsequence of y_n is convergent!)

Exercise 16.5 (Uniform Continuity and ℝ) We know that if f is continuous on any closed interval it is uniformly continuous, but it is also possible that functions on open or infinite intervals are uniformly continuous. Show this by example, confirming that f(x)=1/(1+x²) is uniformly continuous on the entire real line.
Hint: try to simplify and overestimate the quantity f(x)−f(a): remember that 1+x² and 1+a² are always ≥1!