$$ \newcommand{\RR}{\mathbb{R}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\FF}{\mathbb{F}} % ALTERNATE VERSIONS % \newcommand{\uppersum}[1]{{\textstyle\sum^+_{#1}}} % \newcommand{\lowersum}[1]{{\textstyle\sum^-_{#1}}} % \newcommand{\upperint}[1]{{\textstyle\smallint^+_{#1}}} % \newcommand{\lowerint}[1]{{\textstyle\smallint^-_{#1}}} % \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} \newcommand{\uppersum}[1]{U_{#1}} \newcommand{\lowersum}[1]{L_{#1}} \newcommand{\upperint}[1]{U_{#1}} \newcommand{\lowerint}[1]{L_{#1}} \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} \newcommand{\partitions}[1]{\mathcal{P}_{#1}} \newcommand{\sampleset}[1]{\mathcal{S}_{#1}} \newcommand{\erf}{\operatorname{erf}} $$

24  Theory

Highlights of this Chapter: we study the relationship between the behavior of a function and its derivative, proving several foundational results in the theory of differentiable functions:

  • Fermat’s Theorem: a differentiable function has derivative zero at any local extremum.
  • Rolle’s Theorem: if a differentiable function takes equal values at two points, it must have zero derivative at some point in between.
  • The Mean Value Theorem: the average slope of a differentiable function on an interval is realized as the instantaneous slope at some point inside that interval.

The Mean Value theorem is really the star of the show, and we go on to study several of its prominent applications:

  • The Second Derivative Test for function extrema
  • The ambiguity in antidifferentiation is at most a constant
  • L’Hospital’s rule

24.1 Derivatives and Extrema

The most memorable procedure from Calculus I is likely this: to find maxes and mins, set the derivative equal to zero and solve. This is not precisely correct (nor is it exactly what is taught in Calculus I; it is just what is remembered!), so here we will give the precise story.

Definition 24.1 (Local Extrema) Let \(f\) be a real-valued function with domain \(D\subset\RR\). Then a point \(m\in D\) is a local maximum if \(f(m)\geq f(x)\) for \(x\) near \(m\), and is a local minimum if \(f(m)\leq f(x)\) for \(x\) near \(m\). A point that is either a local minimum or a local maximum is known as a local extremum.

By ‘\(x\) near \(m\)’ we mean that there is some interval \((a,b)\subset\RR\) containing \(m\), where the claimed inequality holds for all \(x\in(a,b)\).

That the derivative (rate of change) should be able to detect local extrema is an old idea, even predating the calculus of Newton and Leibniz. Though certainly realized earlier in special cases, it is Fermat who is credited with the first general theorem (so the result below is often called Fermat’s theorem).

Theorem 24.1 (Finding Local Extrema (Fermat’s Thm)) Let \(f\) be a function with a local extremum at \(m\). Then if \(f\) is differentiable at \(m\), we must have \(f^\prime(m)=0\).

Proof. Without loss of generality we will assume that \(m\) is the location of a local minimum (the same argument applies for local maxima, except the inequalities in the numerators reverse). As \(f\) is differentiable at \(m\), we know that both the right and left hand limits of the difference quotient exist, and are equal.

First, some preliminaries that apply to both one-sided limits. Since we know the limit exists, its value can be computed via any appropriate sequence \(x_n\to m\). Choosing some such sequence, we investigate the difference quotient

\[\frac{f(x_n)-f(m)}{x_n-m}\]

Because \(m\) is a local minimum, there is some interval (say, of radius \(\epsilon\)) about \(m\) where \(f(x)\geq f(m)\). As \(x_n\to m\), we know the sequence eventually enters this interval (by the definition of convergence) thus for all sufficiently large \(n\) we know \[f(x_n)-f(m)\geq 0\]

Now, we separate out the limits from above and below, starting with \(\lim_{x\to m^-}\). If \(x_n\to m\) but \(x_n<m\) then we know \(x_n-m\) is negative for all \(n\), and so

\[\frac{f(x_n)-f(m)}{x_n-m}=\frac{\mathrm{nonneg}}{\mathrm{neg}}\leq 0\]

Thus, for all sufficiently large \(n\) the difference quotient is \(\leq 0\), and so the limit must be as well! That is, \[\lim_{x\to m^-}\frac{f(x)-f(m)}{x-m}\leq 0\]

Performing the analogous investigation for the limit from above, we now have a sequence \(x_n\to m\) with \(x_n>m\). This changes the sign of the denominator, so

\[\frac{f(x_n)-f(m)}{x_n-m}=\frac{\mathrm{nonneg}}{\mathrm{pos}}\geq 0\]

Again, since the difference quotient is \(\geq 0\) for all sufficiently large \(n\), we know the same is true of the limit.

\[\lim_{x\to m^+}\frac{f(x)-f(m)}{x-m}\geq 0\]

But, by our assumption that \(f\) is differentiable at \(m\) we know both of these must be equal! And if one is \(\geq 0\) and the other \(\leq 0\) the only possibility is that \(f^\prime(m)=0\).

This provides a clear strategy for tracking down local extrema, especially for functions that are only occasionally non-differentiable (piecewise functions, for example): we only need to check the points where \(f^\prime\) is either zero or undefined. This motivates the definition below, giving a uniform term to these disparate categories:

Definition 24.2 (Critical Point) A critical point of a function \(f\) is a point where either (1) \(f\) is not differentiable, or (2) \(f\) is differentiable, and the derivative is zero.

Note that not all critical points are local extrema: Fermat’s theorem claims only that extrema are critical points, not the converse! There are many examples showing this is not an if and only if:

Example 24.1 The function \(f(x)=x^3\) has a critical point at \(x=0\) (as the derivative is zero), but does not have a local extremum there. The function \(g(x)=2x+|x|\) has a critical point at \(0\) (because it is not differentiable there) but also does not have a local extremum.

To classify exactly when a critical point is a local max/min (and crucially, which it is) will require a bit more theory, to come. But if one is only interested in the absolute max and min of the function over its entire domain, this already provides a reasonable strategy, which is one of the early highlights of Calculus I.

Theorem 24.2 (Finding The Global Max and Min) Let \(f\) be a continuous function defined on a closed interval \(I\) with finitely many critical points. Then the absolute maximum and minimum values of \(f\) can be found explicitly via the following procedure:

  • Find the value of \(f\) at the endpoints of \(I\)
  • Find the value of \(f\) at the points of non-differentiability
  • Find the value of \(f\) at the points where \(f^\prime(x)=0\).

The absolute max of \(f\) is the largest of these values, and the absolute min is the smallest.
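As a concrete illustration (an example added here, with a function of our own choosing), this procedure can be carried out numerically. Take the hypothetical example \(f(x)=x^3-3x\) on \([-2,3]\), whose only critical points \(x=\pm 1\) come from solving \(f^\prime(x)=3x^2-3=0\):

```python
# Sketch of the closed-interval procedure for f(x) = x^3 - 3x on [-2, 3].
# The critical points x = -1 and x = 1 were found by hand from
# f'(x) = 3x^2 - 3 = 0; f is differentiable everywhere, so the candidate
# list is just the endpoints together with the zeros of f'.

def f(x):
    return x**3 - 3*x

candidates = [-2, 3] + [-1, 1]        # endpoints + zeros of f'
values = {x: f(x) for x in candidates}

global_max = max(values.values())     # largest of the candidate values
global_min = min(values.values())     # smallest of the candidate values

print(global_max, global_min)         # prints: 18 -2
```

Here the max \(18\) occurs at the endpoint \(x=3\), while the min \(-2\) is attained both at the endpoint \(x=-2\) and at the critical point \(x=1\).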

Proof. Because \(I\) is a closed interval and \(f\) is continuous, we are guaranteed by the extreme value theorem that \(f\) achieves both a maximum and minimum value. Let these be \(\max,\min\) respectively, realized at points \(M,m\) with \[f(M)=\max\hspace{1cm}f(m)=\min\]

Without loss of generality, we will consider \(M\) (the same argument applies to \(m\)).

First, \(M\) could be at one of the endpoints of \(I\). If it is not, then \(M\) lies in the interior of \(I\), and there is some small interval \((a,b)\) containing \(M\) totally contained in the domain \(I\). Since \(M\) is the location of the global max, we know \(f(x)\leq f(M)\) for all \(x\in I\). In particular \(f(x)\leq f(M)\) for all \(x\in(a,b)\), so \(M\) is the location of a local max.

But if \(M\) is the location of a local maximum and \(f\) is differentiable there, then by Fermat’s theorem we know \(f^\prime(M)=0\). Thus, whether or not \(f\) is differentiable at \(M\), it must be a critical point of \(f\).

Thus, \(M\) occurs in the list of critical points and endpoints, which are the points we checked.

24.2 The Mean Value Theorem

One of the most important theorems relating \(f\) and \(f^\prime\) is the mean value theorem. This is an excellent example of a theorem that is intuitively obvious (from our experience with reasonable functions) and yet requires careful proof (as we know by now, many functions have non-intuitive behavior). Indeed, when I teach Calculus I, I often paraphrase the mean value theorem as follows:

If you drove 60 miles in one hour, then at some point you must have been driving 60 miles per hour

How can we write this mathematically? Say you drove \(D\) miles in \(T\) hours. If \(f(t)\) is your position as a function of time, and you were driving between \(t=a\) and \(t=b\) (where \(b-a=T\)), your average speed was

\[\frac{D}{T}=\frac{f(b)-f(a)}{b-a}\]

To then say *at some point you were going \(D/T\) miles per hour* means that there exists some \(t^\star\) between \(a\) and \(b\) where the instantaneous rate of change (the derivative) is equal to this value. This is exactly the Mean Value Theorem:

Theorem 24.3 (The Mean Value Theorem) If \(f\) is a function which is continuous on the closed interval \([a,b]\) and differentiable on the open interval \((a,b)\), then there exists some \(x^\star\in(a,b)\) where \[f^\prime(x^\star)=\frac{f(b)-f(a)}{b-a}\]

Note: The reason we require differentiability only on the interior of the interval is that the two-sided limit defining the derivative may not exist at the endpoints (if, for example, the domain of \(f\) is only \([a,b]\)).

In this section we will prove the mean value theorem. It’s simplest to break the proof into two steps: first the special case where \(f(a)=f(b)\) (so we are seeking a point with \(f^\prime(x^\star)=0\)), and then the general version, deduced from it. This special case is often useful in its own right and so has a name: Rolle’s Theorem.

Theorem 24.4 (Rolle’s Theorem) Let \(f\) be continuous on the closed interval \([a,b]\) and differentiable on \((a,b)\). Then if \(f(b)=f(a)\), there exists some \(x^\star\in (a,b)\) where \(f^\prime(x^\star)=0\).

Proof. Without loss of generality we may take \(f(b)=f(a)=0\) (if their common value is \(k\), consider instead the function \(f(x)-k\), and use the linearity of differentiation to see this yields the same result).

There are two cases: (1) \(f\) is constant, and (2) \(f\) is not. In the first case, \(f^\prime(x)=0\) for all \(x\in(a,b)\), so we may choose any such point. In the second case, since \(f\) is continuous, it achieves both a maximum and minimum value on \([a,b]\) by the extreme value theorem. Because \(f\) is nonconstant these values are distinct, and so at least one of them must be nonzero. Let \(c\) denote the location of either a (positive) absolute max or a (negative) absolute min: since \(f(c)\neq 0=f(a)=f(b)\), the point \(c\) is not an endpoint, so \(c\in(a,b)\).

Then, \(c\in(a,b)\) and for all \(x\in(a,b)\), \(f(x)\geq f(c)\) if \(c\) is the absolute min, and \(f(x)\leq f(c)\) if it’s the max. In both cases, \(c\) satisfies the definition of a local extremum. And, as \(f\) is differentiable on \((a,b)\), Fermat’s theorem implies \(f^\prime(c)=0\), as required.

Now, we return to the main theorem:

Proof. Let \(f\) be a function satisfying the hypotheses of the mean value theorem, and \(L\) be the secant line connecting \((a,f(a))\) to \((b,f(b))\). Computing this line, \[L(x)=f(a)+\frac{f(b)-f(a)}{b-a}(x-a)\]

Now define the auxiliary function \(g(x)=f(x)-L(x)\). Since \(L(a)=f(a)\) and \(L(b)=f(b)\), we see that \(g\) is zero at both endpoints. Further, since both \(L\) and \(f\) are continuous on \([a,b]\) and differentiable on \((a,b)\), so is \(g\). Thus, \(g\) satisfies the hypotheses of Rolle’s theorem, and so there exists some \(\star\in(a,b)\) with \[g^\prime(\star)=0\]

But differentiating \(g\) we find

\[\begin{align*}0&=f^\prime(\star)-L^\prime(\star)\\ &= f^\prime(\star)-\frac{f(b)-f(a)}{b-a} \end{align*}\]

Thus, at \(\star\) we have \(f^\prime(\star)=\frac{f(b)-f(a)}{b-a}\), as claimed.
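As a quick illustration (a worked example added here, with a function of our own choosing): for \(f(x)=x^2\) on \([0,2]\) the conclusion of the theorem can be verified directly.

```latex
% Checking the MVT for f(x) = x^2 on [0,2]:
% the average slope is
\frac{f(2)-f(0)}{2-0} = \frac{4-0}{2} = 2,
% and the derivative realizes it at an interior point:
f'(x) = 2x = 2 \iff x^\star = 1 \in (0,2).
```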

Exercise 24.1 Verify the mean value theorem holds for \(f(x)=x^2+x-1\) on the interval \([4,7]\).

24.2.1 \(\bigstar\) The Simultaneous Mean Value Theorem

One natural extension is to wonder if this can be done for two functions at once: given \(f\) and \(g\) can we find a single point \(c\) where \(f^\prime(c)\) and \(g^\prime(c)\) equal the average slopes of \(f\) and \(g\) respectively?

Exercise 24.2 Show that this is impossible in general, by considering \(f(x)=x^2\) and \(g(x)=x^3\) on the interval \([0,1]\). Show that each has a unique point \(c_f, c_g\) satisfying the Mean Value Theorem, and \(c_f\neq c_g\): thus there is no point that works for both.

A slight weakening of the question may be as follows: perhaps there is no \(c\) giving each of the average slopes individually, but could there be a \(c\) such that the ratio of the instantaneous slopes is equal to the ratio of the average slopes? That is, a \(c\in(a,b)\) such that

\[\frac{f^\prime(c)}{g^\prime(c)}=\frac{\frac{f(b)-f(a)}{b-a}}{\frac{g(b)-g(a)}{b-a}}\]

Exercise 24.3 Show that this revised notion does hold for the functions \(f(x)=x^2\) and \(g(x)=x^3\) on the interval \([0,1]\). In fact, show that this holds on any interval \([a,b]\): there is some \(c\in(a,b)\) where \[\frac{f^\prime(c)}{g^\prime(c)}=\frac{f(b)-f(a)}{g(b)-g(a)}\]

In fact, this holds for all functions \(f,g\), in a result called the generalized, or simultaneous mean value theorem.

Theorem 24.5 (The Simultaneous Mean Value Theorem) If \(f\) and \(g\) are both continuous on \([a,b]\) and differentiable on \((a,b)\) with \(g^\prime\) nonzero on \((a,b)\), then there exists some \(c\in (a,b)\) where \[\frac{f^\prime(c)}{g^\prime(c)}=\frac{f(b)-f(a)}{g(b)-g(a)}\]

Exercise 24.4 Prove Theorem 24.5.

Hint: Define some function \(h\) built from \(f\) and \(g\) to which you can apply the mean value theorem, and conclude \[\left[f(b)-f(a)\right]g^\prime(c)=\left[g(b)-g(a)\right]f^\prime(c)\]

24.3 MVT Applications

The mean value theorem is a particularly useful piece of technology, as it lets us connect information about the derivative of a function to the values of the function itself. This sort of relationship is used all the time in Calculus I: three prominent examples are below.

24.3.1 Function Behavior

Proposition 24.1 If \(f\) is continuous and differentiable on \([a,b]\), then \(f(x)\) is monotone increasing on \([a,b]\) if and only if \(f^\prime(x)\geq 0\) for all \(x\in [a,b]\).

As this is an if and only if statement, we prove the two claims separately. First, we assume that \(f^\prime\geq 0\) and show \(f\) is increasing:

Proof. Let \(x<y\) be any two points in the interval \([a,b]\): we wish to show that \(f(x)\leq f(y)\). By the Mean Value Theorem, we know there must be some point \(\star\in (x,y)\) such that \[f^\prime(\star)=\frac{f(y)-f(x)}{y-x}\]

But, we’ve assumed that \(f^\prime\geq 0\) on the entire interval, so \(f^\prime(\star)\geq 0\). Thus \(\frac{f(y)-f(x)}{y-x}\geq 0\), and since \(y-x\) is positive, this implies

\[f(y)-f(x)\geq 0\]

That is, \(f(y)\geq f(x)\). Note that we can extract even more information here than claimed: if we know that \(f^\prime\) is strictly greater than 0 then following the argument we learn that \(f(y)>f(x)\), so \(f\) is strictly monotone increasing.

Next, we assume \(f\) is increasing and show \(f^\prime\geq 0\):

Proof. Assume \(f\) is increasing on \([a,b]\), and let \(x\in(a,b)\) be arbitrary. Because we have assumed \(f\) is differentiable, we know that the right and left limits both exist and are equal, and that either of them equals the value of the derivative. So, we consider the right limit \[f^\prime(x)=\lim_{t\to x^+}\frac{f(t)-f(x)}{t-x}\]

For any \(t>x\) we know \(f(t)\geq f(x)\) by the increasing hypothesis, and we know that \(t-x>0\) by definition. Thus, for all such \(t\) this difference quotient is nonnegative, and hence remains so in the limit:

\[f^\prime(x)\geq 0\]

Exercise 24.5 Prove the analogous statement for negative derivatives: \(f^\prime(x)\leq 0\) on \([a,b]\) if and only if \(f(x)\) is monotone decreasing on \([a,b]\).

Corollary 24.1 (Distinguishing Maxes and Mins) Let \(f\) be a continuously differentiable function on \([a,b]\) and \(c\in(a,b)\) be a critical point where \(f^\prime(x)<0\) for \(x<c\) and \(f^\prime(x)>0\) for \(x>c\), for all \(x\) in some small interval about \(c\).

Then \(c\) is a local minimum of \(f\).

Proof. By the above, we know that \(f^\prime(x)<0\) for \(x<c\) implies that \(f\) is monotone decreasing for \(x<c\): that is, \(x< c\implies f(x)\geq f(c)\). Similarly, as \(f^\prime(x)>0\) for \(x>c\), we have that \(f\) is increasing there, and \(c< x\implies f(c)\leq f(x)\).

Thus, for \(x\) on either side of \(c\) we have \(f(x)\geq f(c)\), so \(c\) is the location of a local minimum.

This is even more simply phrased in terms of the second derivative, as is common in Calculus I.

Corollary 24.2 (The Second Derivative Test) Let \(f\) be a twice continuously differentiable function on \([a,b]\), and \(c\) a critical point. Then if \(f^{\prime\prime}(c)>0\), the point \(c\) is the location of a local minimum, and if \(f^{\prime\prime}(c)<0\) then \(c\) is the location of a local maximum.

Proof. We consider the case that \(f^{\prime\prime}(c)>0\), the other is analogous. Since \(f^{\prime\prime}\) is continuous and positive at \(c\), we know that there exists a small interval \((c-\delta,c+\delta)\) about \(c\) where \(f^{\prime\prime}\) is positive (by Proposition 14.1).

Thus, by Proposition 24.1, we know on this interval that \(f^\prime\) is an increasing function. Since \(f^\prime(c)=0\), this means that if \(x<c\) we have \(f^\prime(x)<0\) and if \(x>c\) we have \(f^\prime(x)>0\). That is, \(f^\prime\) changes from negative to positive at \(c\), so \(c\) is the location of a local minimum by Corollary 24.1.
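As a short illustration (an added example, with a function of our own choosing), the test classifies both critical points of \(f(x)=x^3-3x\) at once.

```latex
% Second derivative test for f(x) = x^3 - 3x:
f'(x) = 3x^2 - 3 = 0 \iff x = \pm 1,
\qquad f''(x) = 6x
% f''(1) = 6 > 0:  local minimum at x = 1.
% f''(-1) = -6 < 0: local maximum at x = -1.
```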

24.3.2 Convexity

Recall that back in the very introduction to functions we defined the property of convexity, saying that a function is convex if the secant line \(L\) connecting any two points on its graph lies above the graph of \(f\): that is, \(L(x)-f(x)\geq 0\).

It’s good to have a quick review: if \(a,b\) are two points in the domain, the secant line connecting \((a,f(a))\) to \((b,f(b))\) is familiar from our proof of the Mean Value Theorem:

\[L_{a,b}(x)=f(a)+\frac{f(b)-f(a)}{b-a}(x-a)\]

Exercise 24.6 Show that you can equivalently express this secant line as below, via algebraic manipulation: \[L_{a,b}(x)=f(a)\left(\frac{b-x}{b-a}\right)+f(b)\left(\frac{x-a}{b-a}\right)\]

Working even harder, we can come up with a rather simple-looking condition that is equivalent to \(f\) lying below its secant line \(L_{a,b}\) for all \(x\in(a,b)\). These are all still strictly algebraic manipulations, encapsulated in the lemma below.

Lemma 24.1 If \(f\) is a function defined on \([a,b]\), then \(f\) lies below its secant line \(L_{a,b}(x)\) everywhere on the interval if and only if \[\frac{f(b)-f(x)}{b-x}-\frac{f(x)-f(a)}{x-a}>0\] for all \(x\in(a,b)\).

Proof. Because \(1=\frac{b-x}{b-a}+\frac{x-a}{b-a}\), multiplying through by \(f(x)\) yields the identity

\[f(x)=f(x)\frac{b-x}{b-a}+f(x)\frac{x-a}{b-a}\]

Substituting this into the simplified form of Exercise 24.6, we can collect like terms and see

\[\begin{align*}L_{a,b}(x)-f(x)&=\left[f(b)-f(x)\right]\frac{x-a}{b-a}+\left[f(a)-f(x)\right]\frac{b-x}{b-a}\\ &= \frac{x-a}{b-a}\left[f(b)-f(x)\right]-\frac{b-x}{b-a}\left[f(x)-f(a)\right] \end{align*}\]

We are trying to set ourselves up to use the Mean Value Theorem, so there’s one more algebraic trick we can employ: we can multiply and divide the first term by \(b-x\), and multiply and divide the second term by \(x-a\): This gives

\[\begin{align*}L_{a,b}(x)-f(x)&= \frac{b-x}{b-x}\frac{x-a}{b-a}\left[f(b)-f(x)\right]-\frac{x-a}{x-a}\frac{b-x}{b-a}\left[f(x)-f(a)\right]\\ &=\frac{(b-x)(x-a)}{b-a}\frac{f(b)-f(x)}{b-x}-\frac{(b-x)(x-a)}{b-a}\frac{f(x)-f(a)}{x-a} \end{align*}\]

Note that each of these terms has the factor \(\frac{(b-x)(x-a)}{b-a}\) in common, and that this factor is positive (as \(x\in(a,b)\) implies \(b-x>0\) and \(x-a>0\)). Thus, we can factor it out and see that \(L_{a,b}(x)-f(x)\) is positive if and only if the remaining term is positive: that is, if and only if \[\frac{f(b)-f(x)}{b-x}-\frac{f(x)-f(a)}{x-a}>0\] as claimed.
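As a sanity check (a numerical experiment added here, not part of the argument), one can confirm this algebra for a sample function, say \(f(x)=x^2\) on the hypothetical interval \([0,3]\): the difference \(L_{a,b}(x)-f(x)\) should equal the common positive factor times the difference of slopes, at every interior point.

```python
# Numerical check of the factorization of L(x) - f(x) for f(x) = x^2 on [0, 3].
# We verify L(x) - f(x) = [(b-x)(x-a)/(b-a)] * (difference of slopes)
# at several interior points x.

def f(x):
    return x**2

a, b = 0.0, 3.0

def L(x):                    # secant line through (a, f(a)) and (b, f(b))
    return f(a) + (f(b) - f(a)) / (b - a) * (x - a)

for x in [0.5, 1.0, 1.7, 2.9]:
    factor = (b - x) * (x - a) / (b - a)
    slopes = (f(b) - f(x)) / (b - x) - (f(x) - f(a)) / (x - a)
    assert abs((L(x) - f(x)) - factor * slopes) < 1e-12
    assert slopes > 0        # f = x^2 indeed lies below its secant line
```

For \(f(x)=x^2\) the slope difference works out to the constant \(3\), so \(L(x)-f(x)=(3-x)x>0\) on \((0,3)\), matching the lemma.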

Now, our goal is to use the Mean Value Theorem to relate this expression (which is a property of \(f\)) to a property of one of its derivatives (here \(f^{\prime\prime}\)).

Exercise 24.7 If \(f^{\prime\prime}>0\) on the interval \([a,b]\) prove that \(f\) lies below its secant line \(L_{a,b}\).

Hint: Here’s a sketch of how to proceed

  • For \(x\in(a,b)\), start with the expression \(\frac{f(b)-f(x)}{b-x}-\frac{f(x)-f(a)}{x-a}\), which you eventually want to show is positive.
  • Apply the MVT for \(f\) to find points \(c_1\in(a,x)\) and \(c_2\in(x,b)\) where \(f^\prime(c_i)\) equals the respective average slopes.
  • Using this, show that your original expression is equivalent to \((c_2-c_1)\frac{f^\prime(c_2)-f^\prime(c_1)}{c_2-c_1}\), and argue that it is sufficient to show that \(\frac{f^\prime(c_2)-f^\prime(c_1)}{c_2-c_1}\) is positive.
  • Can you apply the MVT again (this time to \(f^\prime\)) and use our assumption on the second derivative to finish the argument?

Using this, we can quickly prove the main claimed result:

Theorem 24.6 If \(f\) is twice differentiable on an interval and \(f^{\prime\prime}>0\) on that interval, then \(f\) is convex on the interval.

Proof. Let \(I\) be the interval in question, and let \(a<b\) be any two points in \(I\). Restricting our function to the interval \([a,b]\) we have \(f^{\prime\prime}(x)>0\) for all \(x\in[a,b]\) by hypothesis; so Exercise 24.7 implies that the secant line lies strictly above the graph. Since the interval \([a,b]\) was arbitrary, this holds for any two such points, which is the definition of convexity.
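For instance (a small added example): the exponential function has everywhere-positive second derivative, so the theorem immediately gives its convexity.

```latex
% Example: convexity of the exponential via Theorem 24.6:
f(x) = e^x, \qquad f''(x) = e^x > 0 \text{ for all } x
% so e^x is convex on all of R.
```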

In fact (though we will not need it) a converse of this is true as well. I’ve stated it below for reference.

Theorem 24.7 If \(f\) is twice differentiable and convex on an interval, then \(f^{\prime\prime}\geq 0\) on that interval.

24.3.3 \(\bigstar\) Antidifferentiation

Proposition 24.2 If \(f\) is a differentiable function where \(f^\prime(x)=0\) on an interval \(I\), then \(f\) is constant on that interval.

Proof. Let \(a,b\) be any two points in the interval: we will show that \(f(a)=f(b)\), so \(f\) takes the same value at all points. If \(a<b\) we can apply the mean value theorem to this pair, which furnishes a point \(c\in(a,b)\) such that \[f^\prime(c)=\frac{f(b)-f(a)}{b-a}\] But, \(f^\prime(c)=0\) by assumption! Thus \(f(b)-f(a)=0\), so \(f(b)=f(a)\).

Corollary 24.3 If \(f,g\) are two functions which are differentiable on an interval \(I\) and \(f^\prime=g^\prime\) on \(I\), then there exists a \(C\in\RR\) with \[f(x)=g(x)+C\]

Proof. Consider the function \(h(x)=f(x)-g(x)\). Then by the differentiation laws, \[h^\prime(x)=f^\prime(x)-g^\prime(x)=0\] as we have assumed \(f^\prime=g^\prime\). But now Proposition 24.2 implies that \(h\) is constant, so \(h(x)=C\) for some \(C\). Substituting this in yields \[f(x)=g(x)+C\]

Definition 24.3 Let \(f\) be a function. If \(F\) is a differentiable function with the same domain such that \(F^\prime = f\), we say \(F\) is an antiderivative of \(f\).

Thus, another way of saying Corollary 24.3 is that any two antiderivatives of a function can only differ by a constant. This is the origin of the \(+C\) from calculus, that we will see in the Fundamental Theorem.
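A concrete instance (an added example): \(F_1(x)=\sin^2 x\) and \(F_2(x)=-\cos^2 x\) are both antiderivatives of \(\sin 2x\), and indeed they differ by a constant.

```latex
% Two antiderivatives of sin(2x):
F_1(x) = \sin^2 x, \qquad F_2(x) = -\cos^2 x
% Both satisfy F'(x) = 2 sin(x) cos(x) = sin(2x), and their difference
F_1(x) - F_2(x) = \sin^2 x + \cos^2 x = 1
% is constant, so here C = 1.
```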

24.3.4 \(\bigstar\) L’Hospital’s Rule

L’Hospital’s rule is a very convenient trick for computing tricky limits in calculus: it tells us that when we are trying to evaluate the limit of a quotient of continuous functions and ‘plugging in’ yields the undefined expression \(0/0\) we can attempt to find the limit’s value by differentiating the numerator and denominator, and trying again. Precisely:

Theorem 24.8 Let \(f\) and \(g\) be continuous functions on an interval containing \(a\), and assume that both \(f\) and \(g\) are differentiable on this interval, with the possible exception of the point \(a\).

Then if \(f(a)=g(a)=0\) and \(g^\prime(x)\neq 0\) for all \(x\neq a\), \[\lim_{x\to a}\frac{f^\prime(x)}{g^\prime(x)}=L\hspace{1cm}\textrm{implies}\hspace{1cm}\lim_{x\to a}\frac{f(x)}{g(x)}=L\]

Proof. Assume that \(\lim_{x\to a}\frac{f^\prime(x)}{g^\prime(x)}=L\). Now, we wish to compute the limit of \(f(x)/g(x)\). Recalling that \(f(a)=g(a)=0\) we have that for any \(x\neq a\)

\[\frac{f(x)}{g(x)}=\frac{f(x)-f(a)}{g(x)-g(a)} \]

Applying the simultaneous mean value theorem on the interval between \(a\) and \(x\) gives a point \(c\) (depending on \(x\)) strictly between them such that

\[\frac{f^\prime(c)}{g^\prime(c)}=\frac{f(x)-f(a)}{g(x)-g(a)}\]

Now, our goal is to calculate \(\lim_{x\to a}\frac{f(x)}{g(x)}\), so we begin by choosing an arbitrary sequence \(x_n\to a\) with \(x_n\neq a\). Applying the above result gives us a sequence \(c_n\) trapped between \(x_n\) and \(a\): so, by the squeeze theorem, we know \(c_n\to a\). But, since \(c_n\neq a\), our assumption on \(f^\prime/g^\prime\) tells us that

\[\lim \frac{f^\prime(c_n)}{g^\prime(c_n)}=\lim_{x\to a}\frac{f^\prime}{g^\prime}=L\]

But, for each \(n\) we know that

\[\frac{f^\prime(c_n)}{g^\prime(c_n)}=\frac{f(x_n)}{g(x_n)}\]

So, in fact we know \(\lim \frac{f(x_n)}{g(x_n)}=L\). Since \(x_n\) was an arbitrary sequence \(x_n\to a\) with \(x_n\neq a\), this holds for all such sequences, and so as claimed,

\[\lim_{x\to a}\frac{f(x)}{g(x)}=L\]
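As a standard illustration (a worked example added here): the theorem computes \(\lim_{x\to 0}\frac{e^x-1}{x}\).

```latex
% L'Hospital's rule with f(x) = e^x - 1, g(x) = x, a = 0:
% f(0) = g(0) = 0 and g'(x) = 1 is nonzero away from 0, so
\lim_{x\to 0}\frac{e^x - 1}{x}
= \lim_{x\to 0}\frac{e^x}{1} = 1.
```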

Exercise 24.8 Give an alternate proof of L’Hospital’s rule using just the ordinary mean value theorem (not the generalized version), following the steps below:

  • Show that for any \(x\), we have \[\frac{f(x)}{g(x)}=\frac{f(x)-f(a)}{g(x)-g(a)} =\frac{\frac{f(x)-f(a)}{x-a}}{\frac{g(x)-g(a)}{x-a}} \]
  • For any \(x\), use the MVT to get points \(c, k\) such that \(f^\prime(c)=\frac{f(x)-f(a)}{x-a}\) and \(g^\prime(k)=\frac{g(x)-g(a)}{x-a}\).
  • Choose a sequence \(x_n\to a\): for each \(x_n\), the above furnishes points \(c_n,k_n\): show these sequences converge to \(a\) by squeezing.
  • Use this to show that the sequence \(s_n = \frac{f^\prime(c_n)}{g^\prime(k_n)}\) converges to \(L\), using our assumption \(\lim_{x\to a}\frac{f^\prime}{g^\prime}=L\).
  • Conclude that the sequence \(\frac{f(x_n)}{g(x_n)}\to L\), and that \(\lim_{x\to a}\frac{f(x)}{g(x)}=L\) as claimed.

Hint: Use the \(\epsilon\)-\(\delta\) definition of a functional limit together with our assumption \(\lim_{x\to a}\frac{f^\prime(x)}{g^\prime(x)}=L\): for any \(\epsilon\), there’s a \(\delta\) where \(|x-a|<\delta\) implies this quotient is within \(\epsilon\) of \(L\). Since \(c_n,k_n\to a\), can you find an \(N\) beyond which \(f^\prime(c_n)/g^\prime(k_n)\) is always within \(\epsilon\) of \(L\)?