
Inverse Secant

December 4, 2009

The inverse sine and inverse tangent (and to a lesser extent the inverse cosine and inverse cotangent) are useful throughout your courses and in their applications. What about inverse secant or inverse cosecant? These functions are a bit problematic to define. Worse yet, once you’ve gone to the trouble of building them, they aren’t often useful. Let’s consider the inverse secant.

Here’s the graph of cosine and secant on [-\pi, 2\pi].

Like all of the trig functions, secant is not one-to-one. So we need to restrict the domain to get a one-to-one function and then invert. Which pieces should we use though? It seems reasonable, and in analogy with inverse cosine, to restrict our domain to [0, \frac{\pi}{2}) \cup (\frac{\pi}{2}, \pi]. These are the green and yellow branches. Let’s call this restricted secant \sec_1. It turns out, unfortunately, that this natural choice isn’t perhaps the nicest choice for calculus purposes. So we will also consider \sec_2, the restriction of secant to [0, \frac{\pi}{2}) \cup [\pi, \frac{3\pi}{2}). So \sec_2 consists of the green and orange branches. We will refer to these functions as \sec_* when it doesn’t matter which choices we’ve made.

In either case \sec_* is now one-to-one with range (-\infty, -1] \cup [1, \infty). Thus we have \sec_*^{-1} with domain (-\infty, -1] \cup [1, \infty). What is the derivative of \sec_*^{-1}? Well

\displaystyle{ y = \sec_*^{-1} x \implies x = \sec_* y  }

\displaystyle{ \implies 1 = \sec_* y \tan y \cdot y'  }

\displaystyle{ \implies y' = \frac{1}{\sec_* y \tan y} }

\displaystyle{ \implies y' = \frac{1}{x \cdot \pm\sqrt{x^2-1}} }

where we’ve used the identity 1 + \tan^2 y = \sec^2 y.

We now need to worry about our choice of domain. In the case of \sec_2 our y belongs to [0, \frac{\pi}{2}) \cup [\pi, \frac{3\pi}{2}) and here \tan y \geq 0, so \tan y = \sqrt{x^2-1}. Thus

\displaystyle{ \frac{d}{dx} \sec_2^{-1} x = \frac{1}{x\sqrt{x^2-1}} }

But in the case of \sec_1 our y belongs to [0, \frac{\pi}{2}) \cup (\frac{\pi}{2}, \pi]. Now \sec y > 0 and \tan y \geq 0 on [0, \frac{\pi}{2}), while \sec y < 0 and \tan y \leq 0 on (\frac{\pi}{2}, \pi]. Thus \sec y \tan y = |x| \sqrt{x^2 - 1} for these y. Therefore

\displaystyle{ \frac{d}{dx} \sec_1^{-1} x = \frac{1}{|x|\sqrt{x^2-1}} }

These formulas agree for x > 1, that is, for y in (0, \frac{\pi}{2}), of course.
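Here is a small numerical sanity check of the two derivative formulas. It is only a sketch: the helper names arcsec1, arcsec2 and central_difference are made up for this illustration, and the branches are built from the standard arccosine under the assumption that \sec_2^{-1} x = 2\pi - \cos^{-1}(1/x) when x \leq -1.

```python
import math

def arcsec1(x):
    """Inverse of secant restricted to [0, pi/2) U (pi/2, pi] (hypothetical helper)."""
    return math.acos(1.0 / x)

def arcsec2(x):
    """Inverse of secant restricted to [0, pi/2) U [pi, 3pi/2) (hypothetical helper)."""
    y = math.acos(1.0 / x)
    # For x <= -1 reflect the angle into [pi, 3pi/2); the cosine is unchanged.
    return y if x >= 1 else 2.0 * math.pi - y

def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-3.0, -1.5, 1.5, 3.0):
    root = math.sqrt(x * x - 1.0)
    # Columns: x, numeric and closed-form derivative for each branch choice.
    print(x,
          central_difference(arcsec1, x), 1.0 / (abs(x) * root),
          central_difference(arcsec2, x), 1.0 / (x * root))
```

For negative x the two branch choices give derivatives of opposite sign, which is exactly the difference between |x| and x in the formulas.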

Machin’s Formula

November 23, 2009

Starting with the power series

\displaystyle{ \frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots = \sum_{n=0}^{\infty} x^n } on (-1, 1)

we substitute -x^2 for x to get

\displaystyle{ \frac{1}{1+x^2} = 1 - x^2 + x^4 - x^6 + \cdots = \sum_{n=0}^{\infty} (-1)^n x^{2n} } on (-1, 1)

and anti-differentiate to find

\displaystyle{ \tan^{-1} x = C + x - \frac{1}{3}x^3 + \frac{1}{5}x^5 - \frac{1}{7}x^7 + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} x^{2n+1} } on (-1, 1).

Now since \tan^{-1} 0 = 0 we see that C = 0. Thus we have

\displaystyle{ \tan^{-1} x = x - \frac{1}{3}x^3 + \frac{1}{5}x^5 - \frac{1}{7}x^7 + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} x^{2n+1} } good for all x \in (-1, 1).

As an example of how to calculate with this we remember \tan(\pi/6) = 1/\sqrt{3} and so

\displaystyle{ \frac{\pi}{6} = \tan^{-1}\left( \frac{1}{\sqrt{3}} \right) = \frac{1}{\sqrt{3}} - \frac{1}{3}\left(\frac{1}{\sqrt{3}}\right)^3 + \frac{1}{5}\left(\frac{1}{\sqrt{3}}\right)^5 - \frac{1}{7}\left(\frac{1}{\sqrt{3}}\right)^7 + \cdots }

\displaystyle{ = \sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1} \left( \frac{1}{\sqrt{3}} \right)^{2n+1} =  \sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1) 3^n} \frac{1}{\sqrt{3}} }.

Thus we have

\displaystyle{ \frac{\pi\sqrt{3}}{6} =  \sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1) 3^n} }.

Now if, for whatever reason, we wanted to calculate \pi\sqrt{3}/6 we could use the well-known estimate from the world of alternating series to find

\displaystyle{ \left| \frac{\pi\sqrt{3}}{6} - \sum_{n = 0}^{N} \frac{(-1)^n}{(2n+1) 3^n} \right| \leq \frac{1}{(2N+3)3^{N+1}} }

where the sum of an alternating series of the form \sum (-1)^n b_n with b_n > 0 and b_n decreasing to 0 differs from its N-th partial sum by at most b_{N+1}.
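To see the estimate in action, here is a short sketch comparing the actual error of the N-th partial sum with the bound 1/((2N+3)3^{N+1}). The function names are chosen just for this illustration, and the "true" value is taken from floating-point \pi, so agreement is only up to rounding.

```python
import math

def partial_sum(N):
    """Sum of (-1)^n / ((2n+1) 3^n) for n = 0..N."""
    return sum((-1) ** n / ((2 * n + 1) * 3 ** n) for n in range(N + 1))

def error_bound(N):
    """Alternating series bound b_{N+1} = 1 / ((2N+3) 3^(N+1))."""
    return 1.0 / ((2 * N + 3) * 3 ** (N + 1))

target = math.pi * math.sqrt(3.0) / 6.0
for N in (0, 2, 5, 10):
    # Columns: N, actual error, guaranteed bound.
    print(N, abs(target - partial_sum(N)), error_bound(N))
```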

Now Machin’s Formula states

\displaystyle{\frac{\pi}{4} = 4\tan^{-1}\left(\frac{1}{5}\right) - \tan^{-1}\left(\frac{1}{239}\right) }.

Given our series expression for \tan^{-1}, the niceness of the numbers \frac{1}{5} and \frac{1}{239}, and the easy estimate provided because our series is alternating in the right way, Machin’s Formula gives a very efficient way to calculate \pi.
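As a rough illustration of that efficiency, the sketch below plugs the truncated arctangent series into Machin's Formula. The names arctan_series and machin_pi are invented for this example, and ordinary floating-point arithmetic limits the accuracy to about 15 digits no matter how many terms are used.

```python
import math

def arctan_series(x, terms):
    """First `terms` terms of x - x^3/3 + x^5/5 - ... (valid for |x| < 1)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))

def machin_pi(terms):
    """Approximate pi from pi/4 = 4 arctan(1/5) - arctan(1/239)."""
    return 4.0 * (4.0 * arctan_series(1.0 / 5.0, terms)
                  - arctan_series(1.0 / 239.0, terms))

for terms in (1, 2, 3, 5, 12):
    # Columns: number of terms, approximation, absolute error.
    print(terms, machin_pi(terms), abs(machin_pi(terms) - math.pi))
```

Each extra term of the 1/5 series gains roughly a factor of 25 in accuracy, so a dozen terms already exhaust double precision.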

How do we prove this formula? The proof is actually an easy, though tedious application of the tangent addition formula:

\displaystyle{ \tan(x + y) = \frac{\tan x + \tan y}{1 - \tan x \tan y} }

Here goes:

\displaystyle{ \tan(2A) = \frac{2\tan A}{1 - \tan^2 A} } and so

\displaystyle{ \tan(4A) = \frac{2\frac{2\tan A}{1 - \tan^2 A}}{1 - \left(\frac{2\tan A}{1 - \tan^2 A}\right)^2} = \frac{4\tan A(1-\tan^2 A)}{(1-\tan^2 A)^2 - (2\tan A)^2} }

Now if A = \tan^{-1}(1/5) we have \displaystyle{ \tan(4A) = \frac{4\cdot\frac{1}{5}\cdot\frac{24}{25}}{(\frac{24}{25})^2 - (\frac{2}{5})^2} = \frac{120}{119} }.

Ok, now consider

\displaystyle{ \tan(4A-B) = \frac{\tan 4A - \tan B}{1 + \tan 4A\tan B} }

and if A = \tan^{-1}(1/5) and B = \tan^{-1}(1/239) we have

\displaystyle{ \tan(4A-B) = \frac{\frac{120}{119} - \frac{1}{239}}{1+\frac{120}{119}\frac{1}{239}} = 1 }

Since 0 < B < 4A < 4/5 < \pi/2, the angle 4A - B lies in (-\pi/2, \pi/2), and therefore

\displaystyle{ \frac{\pi}{4} = \tan^{-1} 1 = 4A - B = 4\tan^{-1}\left(\frac{1}{5}\right) - \tan^{-1}\left(\frac{1}{239}\right) }

as desired.
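The tedious arithmetic can be double-checked with exact rational numbers. This is just a sketch using Python's fractions module; tan_double and tan_difference are names made up here for the double-angle and subtraction formulas.

```python
from fractions import Fraction

def tan_double(t):
    """tan(2A) in terms of t = tan A."""
    return 2 * t / (1 - t * t)

def tan_difference(s, t):
    """tan(X - Y) in terms of s = tan X and t = tan Y."""
    return (s - t) / (1 + s * t)

tan_A = Fraction(1, 5)
tan_B = Fraction(1, 239)
tan_4A = tan_double(tan_double(tan_A))

print(tan_4A)                         # 120/119
print(tan_difference(tan_4A, tan_B))  # 1
```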

Integral Example 8

November 19, 2009

Consider the integral \displaystyle{ \int\! \frac{d\theta}{\cos \theta + \sin \theta} }.

Here’s a too clever solution: Since \sqrt{2}/2 = \cos(\pi/4) = \sin(\pi/4) we have

\displaystyle{ \int\! \frac{d\theta}{\cos \theta + \sin \theta} = \int\! \frac{\frac{\sqrt{2}}{2}\, d\theta}{\cos \theta \cos(\pi/4) + \sin \theta \sin(\pi/4)} }

\displaystyle{ = \int\! \frac{\frac{\sqrt{2}}{2}\, d\theta}{\cos(\theta - \pi/4)} }

\displaystyle{ = \frac{\sqrt{2}}{2}\int\! \sec(\theta - \pi/4)\,d\theta }

\displaystyle{ = \frac{\sqrt{2}}{2}\ln| \sec(\theta - \pi/4) + \tan(\theta - \pi/4)| + C}


Using the Weierstrass substitution here would be a less clever, more general way to solve this integral. Remember we set \theta = 2\tan^{-1}(t) yielding \cos \theta = (1-t^2)/(1+t^2), \sin \theta = 2t/(1+t^2), and d\theta = 2/(1+t^2)\,dt. Applying these here we find

\displaystyle{ \int\! \frac{d\theta}{\cos \theta + \sin \theta} = \int\! \frac{\frac{2}{1+t^2}\,dt}{\frac{1-t^2}{1+t^2} + \frac{2t}{1+t^2}} }

\displaystyle{ = \int\! \frac{2\,dt}{1-t^2 + 2t} }

We complete the square: 1-t^2+2t = 2 - (1-t)^2 and go on.

\displaystyle{ = \int\! \frac{2\,dt}{2-(1-t)^2} }

We make the substitution u = 1-t, du = -dt and find

\displaystyle{ = -2\int\! \frac{du}{2-u^2} }

\displaystyle{ = -2\frac{1}{2\sqrt{2}} \ln\left| \frac{\sqrt{2}+u}{\sqrt{2}-u} \right| + C}

\displaystyle{ = -\frac{1}{\sqrt{2}} \ln\left| \frac{\sqrt{2}+1-\tan(\theta/2)}{\sqrt{2}-1+\tan(\theta/2)} \right| + C}

I’ll leave it to you to reconcile the two answers.
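Before doing the algebra, a numerical comparison is reassuring. The sketch below evaluates the first closed form on [0, 1] and compares it with a Simpson's rule approximation of the Weierstrass-transformed integral \int 2\,dt/(1-t^2+2t) over the matching t-interval. The function names and the choice of interval are assumptions made for this illustration only.

```python
import math

def first_answer(theta):
    """(sqrt(2)/2) ln|sec(theta - pi/4) + tan(theta - pi/4)|, the 'too clever' answer."""
    s = theta - math.pi / 4.0
    return (math.sqrt(2.0) / 2.0) * math.log(abs(1.0 / math.cos(s) + math.tan(s)))

def weierstrass_integrand(t):
    """Integrand after substituting theta = 2 arctan(t)."""
    return 2.0 / (1.0 - t * t + 2.0 * t)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

theta0, theta1 = 0.0, 1.0
lhs = first_answer(theta1) - first_answer(theta0)
rhs = simpson(weierstrass_integrand,
              math.tan(theta0 / 2.0), math.tan(theta1 / 2.0))
print(lhs, rhs)  # the two values should agree to many digits
```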

Integral Example 7

November 17, 2009

Consider the indefinite integral \displaystyle{ \int\! \frac{1}{a^2-x^2}\,dx } where a is a positive real number.

This can be evaluated in a number of ways. Here are two of them along with a nice consequence.


First we’ll treat this as a straight-forward partial fraction decomposition question. We have

\displaystyle{ \frac{1}{a^2-x^2} = \frac{A}{a+x} + \frac{B}{a-x} }

where A, B are real numbers to be determined. We have then

\displaystyle{ 1 = A(a-x) + B(a+x) }.

Letting x = a we find B = 1/2a; letting x = -a we find A = 1/2a. Thus

\displaystyle{ \int\! \frac{1}{a^2-x^2}\,dx = \int\! \left( \frac{1/2a}{a+x} + \frac{1/2a}{a-x} \right)\,dx }

\displaystyle{ = \frac{1}{2a} \int\! \left( \frac{1}{a+x} + \frac{1}{a-x} \right)\,dx }

\displaystyle{ = \frac{1}{2a}\left(\ln|a+x| - \ln|a-x|\right) + C}

\displaystyle{ = \frac{1}{2a} \ln\left|\frac{a+x}{a-x}\right| + C}

Done and done.


Now we’ll treat this using hyperbolic substitutions. Remember the fundamental hyperbolic identity: \cosh^2 t - \sinh^2 t = 1. From this we can derive the identities 1 - \tanh^2 t = \text{sech}^2 t and \coth^2 t - 1 = \text{csch}^2 t by dividing our fundamental identity by \cosh^2 t and \sinh^2 t respectively.

Remember the graph of y = \tanh t. We see that the domain of \tanh t is all real numbers t and the range is (-1, 1). Further \tanh is one-to-one and so we have a \tanh^{-1} with domain (-1, 1) and range all real numbers. Here’s a graph if your memory of hyperbolic tangent is a little fuzzy.

So in the case where x \in (-a, a) we can make the substitution x = a\tanh t. Then a^2 - x^2 = a^2(1-\tanh^2 t) = a^2\text{sech}^2 t and dx = a\text{sech}^2 t\, dt .

\displaystyle{ \int\! \frac{1}{a^2-x^2}\,dx =  \int\! \frac{1}{a^2\text{sech}^2 t}\,a\text{sech}^2 t\,dt }

\displaystyle{ = \int\! \frac{1}{a}\, dt }

\displaystyle{ = \frac{1}{a} t + C }

\displaystyle{ = \frac{1}{a} \tanh^{-1}\left(\frac{x}{a}\right) + C }

Combining this with our first solution (and setting a = 1) we see that

\displaystyle{ \tanh^{-1} x = \frac{1}{2}\ln\left|\frac{1+x}{1-x}\right| + C }

and since \tanh^{-1} 0 = 0 and \frac{1}{2}\ln 1 = 0 we see that C = 0. This formula for \tanh^{-1} x can be derived in other ways of course.

When | x | > a we make a substitution x = a\coth t and the reasoning is similar.

Thus we have

\displaystyle{ \tanh^{-1} x = \frac{1}{2} \ln\left|\frac{1+x}{1-x}\right| } on x \in (-1, 1)

and

\displaystyle{ \coth^{-1} x = \frac{1}{2} \ln\left|\frac{1+x}{1-x}\right| } on x \in (-\infty, -1) \cup (1, \infty)
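Both logarithmic formulas are easy to spot-check numerically. In this sketch log_form and arccoth are names invented for the example; Python's math module has atanh but no inverse hyperbolic cotangent, so the code assumes the identity \coth^{-1} x = \tanh^{-1}(1/x), which follows from \coth t = 1/\tanh t.

```python
import math

def log_form(x):
    """(1/2) ln|(1+x)/(1-x)|, defined for x != 1 and x != -1."""
    return 0.5 * math.log(abs((1.0 + x) / (1.0 - x)))

def arccoth(x):
    """Inverse hyperbolic cotangent for |x| > 1, via atanh(1/x)."""
    return math.atanh(1.0 / x)

for x in (-0.9, -0.3, 0.0, 0.5, 0.99):
    print(x, math.atanh(x), log_form(x))   # inside (-1, 1)

for x in (-5.0, -1.5, 1.2, 10.0):
    print(x, arccoth(x), log_form(x))      # outside [-1, 1]
```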

The Weierstrass Substitution

November 13, 2009

Back in an earlier post we considered a rational parameterization of the unit circle. We saw there that for \theta = 2\tan^{-1}(x) we have \displaystyle{ \cos \theta = \frac{1-x^2}{1+x^2} } and \displaystyle{ \sin \theta = \frac{2x}{1+x^2}}. A moment’s reflection reveals that this substitution would transform any rational function of \sin \theta and \cos \theta into a rational function of x. This is the Weierstrass Substitution. Its main application is to the anti-differentiation of rational functions of \sin \theta and \cos \theta. We would have

\displaystyle{ \int\! R(\sin \theta, \cos \theta)\,d\theta = \int\! R\left(\frac{2x}{1+x^2}, \frac{1-x^2}{1+x^2}\right)\,\frac{2}{1+x^2}\,dx }

where we calculated d\theta = \frac{2}{1+x^2}\,dx from \theta = 2\tan^{-1} x.

Here is a pair of examples.


Consider the integral \displaystyle{ \int\!\frac{d\theta}{1-\sin \theta + \cos \theta} }. We apply the Weierstrass substitution to find

\displaystyle{ \int\!\frac{d\theta}{1-\sin \theta + \cos \theta} = \int\!\frac{\frac{2\,dx}{1+x^2}}{1 - \frac{2x}{1+x^2} + \frac{1-x^2}{1+x^2}} }

\displaystyle{ = \int\! \frac{2\,dx}{1+x^2 - 2x + 1-x^2} }

\displaystyle{ = \int\! \frac{2\,dx}{2 - 2x} }

\displaystyle{ = \int\! \frac{dx}{1 - x} }

\displaystyle{ = -\ln|1-x| + C }

\displaystyle{ = -\ln|1-\tan(\theta/2)| + C }
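One way to gain confidence in this answer is to differentiate it numerically and compare with the integrand. The following is a sketch with made-up function names; the sample points avoid \theta = \pi/2 and \theta = \pi, where the integrand blows up.

```python
import math

def antiderivative(theta):
    """-ln|1 - tan(theta/2)|, the answer found above."""
    return -math.log(abs(1.0 - math.tan(theta / 2.0)))

def integrand(theta):
    """1 / (1 - sin(theta) + cos(theta))."""
    return 1.0 / (1.0 - math.sin(theta) + math.cos(theta))

def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for theta in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(theta, central_difference(antiderivative, theta), integrand(theta))
```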


We considered the integral \int\! \sec \theta\,d\theta in an earlier post. Let’s do it again with the help of the Weierstrass substitution.

\displaystyle{ \int\! \sec \theta\,d\theta = \int\! \frac{1+x^2}{1-x^2} \cdot \frac{2}{1+x^2}\,dx  }

\displaystyle{ = \int\! \frac{2}{1-x^2}\,dx }

\displaystyle{ = \int\! \left( \frac{1}{1+x} + \frac{1}{1-x}\right)\,dx }

\displaystyle{ = \ln|1+x| - \ln|1-x| + C }

\displaystyle{ = \ln\left|\frac{1+x}{1-x}\right| + C }

\displaystyle{ = \ln\left|\frac{1+\tan(\theta/2)}{1-\tan(\theta/2)}\right| + C }

This answer has a very different form from the ones given in our earlier post. We can reconcile this answer with the older ones through algebra and trigonometric identities.

We have

\displaystyle{ \frac{1+\tan(\theta/2)}{1-\tan(\theta/2)} = \frac{1+\tan(\theta/2)}{1-\tan(\theta/2)} \cdot \frac{\cos(\theta/2)}{\cos(\theta/2)} }

\displaystyle{ = \frac{\cos(\theta/2) + \sin(\theta/2)}{\cos(\theta/2) - \sin(\theta/2)} = \frac{\cos(\theta/2) + \sin(\theta/2)}{\cos(\theta/2) - \sin(\theta/2)} \cdot \frac{\cos(\theta/2) + \sin(\theta/2)}{\cos(\theta/2) + \sin(\theta/2)}  }

\displaystyle{ = \frac{\cos^2(\theta/2) + \sin^2(\theta/2) + 2\sin(\theta/2)\cos(\theta/2)}{\cos^2(\theta/2) - \sin^2(\theta/2)}}

\displaystyle{ = \frac{1 + \sin \theta}{\cos \theta} }

\displaystyle{ = \sec \theta + \tan \theta }

as expected. (In the penultimate equality we used the usual \cos^2 A + \sin^2 A = 1, \sin 2A = 2\sin A\cos A, and \cos 2A = \cos^2 A - \sin^2 A identities.)
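The identity just derived can also be spot-checked numerically. This sketch uses function names chosen for the example and sample angles away from the points where either side is undefined.

```python
import math

def half_angle_form(theta):
    """(1 + tan(theta/2)) / (1 - tan(theta/2))."""
    t = math.tan(theta / 2.0)
    return (1.0 + t) / (1.0 - t)

def sec_plus_tan(theta):
    """sec(theta) + tan(theta)."""
    return 1.0 / math.cos(theta) + math.tan(theta)

for theta in (-1.0, 0.3, 1.0, 2.0, 4.0):
    print(theta, half_angle_form(theta), sec_plus_tan(theta))
```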

Gabriel’s Horn

November 8, 2009

The amusing, famous, and seemingly paradoxical Gabriel’s Horn is a mathematical object which has 1) finite volume and 2) infinite surface area. These properties are sometimes expressed by saying that Gabriel’s Horn is an object that you can fill up but never paint. Behold.

We start with the curve y = 1/x on the interval [1, +\infty). Here is a part of the graph:

At x = 1 it has a height of 1 and as x \rightarrow +\infty the height goes to zero. We then take this curve and wrap it in three-dimensions about the x-axis. This gives us a surface of revolution, part of which looks like:

(The light green line represents the x-axis.) The shape is something like a horn of infinite length. At the wide end it is a circle of radius 1 and if we cut the horn perpendicular to the x-axis the exposed end is a circle of radius 1/x.

It has finite volume. Why? By the usual formula, the volume V is

\displaystyle{ V = \int_1^{\infty}\! A(x)\, dx }

where A(x) is the cross-sectional area of the shape when we cut it with a plane perpendicular to the x-axis. In this case we get circles of radius 1/x and so A(x) = \pi/x^2. Thus

\displaystyle{ V = \int_1^{\infty}\! \frac{\pi}{x^2} \, dx }

\displaystyle{ = \lim_{R \rightarrow \infty} \int_1^R\! \frac{\pi}{x^2} \, dx }

\displaystyle{ = \lim_{R \rightarrow \infty} \left. \frac{-\pi}{x}\right|_1^R }

\displaystyle{ = \lim_{R \rightarrow \infty} \left( \frac{-\pi}{R} + \pi \right) }

= \pi

It has infinite surface area. Why? The formula for surface area of a surface of revolution is

\displaystyle{ SA = \int_1^{\infty}\! 2\pi y \sqrt{1+(y')^2}\,dx }

where here y = 1/x so

\displaystyle{ SA = \int_1^{\infty}\! 2\pi \frac{1}{x} \sqrt{1+\left(\frac{-1}{x^2}\right)^2}\,dx }

\displaystyle{ = \int_1^{\infty}\! 2\pi \frac{1}{x} \sqrt{\frac{x^4 + 1}{x^4}}\,dx }

\displaystyle{ = \int_1^{\infty}\! 2\pi \frac{1}{x} \frac{\sqrt{x^4 + 1}}{x^2}\,dx }

\displaystyle{ = 2\pi \int_1^{\infty}\! \frac{\sqrt{x^4 + 1}}{x^3}\,dx }

\displaystyle{ = 2\pi \lim_{R \rightarrow \infty} \int_1^R\! \frac{\sqrt{x^4 + 1}}{x^3}\,dx }

\displaystyle{ \geq 2\pi \lim_{R \rightarrow \infty} \int_1^R\! \frac{\sqrt{x^4}}{x^3}\,dx }

(That \geq because x^4 + 1 \geq x^4 of course.)

\displaystyle{ = 2\pi \lim_{R \rightarrow \infty} \int_1^R\! \frac{x^2}{x^3}\,dx }

\displaystyle{ = 2\pi \lim_{R \rightarrow \infty} \int_1^R\! \frac{1}{x}\,dx }

\displaystyle{ = 2\pi \lim_{R \rightarrow \infty} \left. \ln x \right|_1^R }

\displaystyle{ = 2\pi \lim_{R \rightarrow \infty} \left( \ln R - \ln 1 \right) }

= \infty.
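To watch the two behaviors side by side, here is a sketch that truncates the horn at x = R. The volume uses the closed form \pi(1 - 1/R) from the computation above; the surface area is approximated with a simple midpoint rule, which is an assumption of this illustration rather than anything special to the problem.

```python
import math

def volume(R):
    """Volume of the horn from x = 1 to x = R, namely pi (1 - 1/R)."""
    return math.pi * (1.0 - 1.0 / R)

def surface_area(R, n=100000):
    """Midpoint-rule estimate of 2 pi * integral of sqrt(x^4 + 1) / x^3 on [1, R]."""
    h = (R - 1.0) / n
    total = 0.0
    for k in range(n):
        x = 1.0 + (k + 0.5) * h
        total += math.sqrt(x ** 4 + 1.0) / x ** 3
    return 2.0 * math.pi * total * h

for R in (10.0, 100.0, 1000.0):
    # Columns: R, volume (stays below pi), surface area, lower bound 2 pi ln R.
    print(R, volume(R), surface_area(R), 2.0 * math.pi * math.log(R))
```

The volume column creeps up toward \pi while the surface area keeps pace with 2\pi \ln R and so grows without bound.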

A Rational Parameterization of the Unit Circle

November 6, 2009

We’re all familiar with the usual trigonometric parameterization of the unit circle: Each point on x^2 + y^2 = 1 is given by (\cos \theta, \sin \theta) for some real \theta. Less well-known is the parameterization of the unit circle by rational functions.

The line through the point (-1, 0) with slope m is given by y = m(x+1). This line intersects the unit circle in one other point and as we vary m we strike every point on the unit circle. Here’s an illustration for a few values of m:

What are the coordinates of this point? Well, the point (x,y) satisfies both x^2 + y^2 = 1 and y = m(x+1) so we have

x^2 + (m(x+1))^2 = 1

\implies x^2 + m^2(x^2+2x+1) = 1

\implies (1+m^2)x^2 + 2m^2x + (m^2-1) = 0

\displaystyle{ \implies x = \frac{-2m^2 \pm \sqrt{4m^4 - 4(1+m^2)(m^2-1)}}{2(1+m^2)} }

\displaystyle{ \implies x = \frac{-2m^2 \pm \sqrt{4}}{2(1+m^2)}  }

\displaystyle{ \implies x = -1,\ \  \frac{1-m^2}{1+m^2} }

x = -1 corresponds to the point (-1, 0). When x = \frac{1-m^2}{1+m^2} we have

\displaystyle{ y^2 = 1 - x^2 }

\displaystyle{ y^2 = 1 - \left( \frac{1-m^2}{1+m^2} \right)^2 }

\displaystyle{ y^2 = \frac{(1+m^2)^2 - (1-m^2)^2}{(1+m^2)^2} }

\displaystyle{ y^2 = \frac{(1+2m^2 + m^4) - (1-2m^2 + m^4)}{(1+m^2)^2} }

\displaystyle{ y^2 = \frac{4m^2}{(1+m^2)^2} }

\displaystyle{ \implies y = \pm\frac{2m}{1+m^2} }

and we want the \pm to be + so that positive slopes correspond to the upper half of the circle as we illustrated above.

Therefore every point on the unit circle (other than (-1, 0)) is of the form

\displaystyle{ \left(\frac{1-m^2}{1+m^2},\ \frac{2m}{1+m^2}\right) } for some m. As m \rightarrow \pm \infty we have \left(\frac{1-m^2}{1+m^2},\ \frac{2m}{1+m^2}\right) \rightarrow (-1, 0).
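A pleasant consequence is that rational slopes give points with rational coordinates. The sketch below checks this exactly with Python's fractions module; circle_point is a name invented for this example.

```python
from fractions import Fraction

def circle_point(m):
    """Second intersection of y = m(x + 1) with the unit circle."""
    m = Fraction(m)
    d = 1 + m * m
    return (1 - m * m) / d, 2 * m / d

for m in (Fraction(1, 2), Fraction(2, 3), Fraction(-3, 4), 5):
    x, y = circle_point(m)
    # Each point lies exactly on the circle and on the line of slope m.
    print(m, (x, y), x * x + y * y == 1, y == m * (x + 1))
```

For instance m = 1/2 gives the point (3/5, 4/5), which is the familiar 3-4-5 right triangle.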


We now have two parameterizations of the unit circle. How are they connected? For every m there is a \theta such that

\displaystyle{ \cos \theta = \frac{1-m^2}{1+m^2} \text{ and } \sin \theta = \frac{2m}{1+m^2} }

which gives

\displaystyle{ \tan \theta = \frac{2m}{1-m^2} }.

This calls to mind the tangent addition formula \tan(A+B) = \frac{\tan A + \tan B}{1-\tan A\tan B} with A = B. This suggests m = \tan(\theta/2) and \theta = 2\tan^{-1}(m). The usual calculations show that this is correct: If we start with m = \tan(\theta/2) we get the desired \cos \theta and \sin \theta.
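The connection m = \tan(\theta/2) can be checked numerically as well. This is a small floating-point sketch with a made-up helper name, so agreement is only up to rounding error.

```python
import math

def rational_point(m):
    """Point ((1 - m^2)/(1 + m^2), 2m/(1 + m^2)) on the unit circle."""
    return (1.0 - m * m) / (1.0 + m * m), 2.0 * m / (1.0 + m * m)

for theta in (-2.5, -1.0, 0.0, 0.7, 3.0):
    m = math.tan(theta / 2.0)
    x, y = rational_point(m)
    # Both differences should be at the level of rounding error.
    print(theta, abs(x - math.cos(theta)), abs(y - math.sin(theta)))
```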

Real Cubic Equations

November 3, 2009

In our last post we considered complex cubic equations. We found the following.

The complex cubic equation X^3 + pX + q = 0 has roots u + v, u\zeta + v\zeta^2, and u\zeta^2 + v\zeta where

\displaystyle{ u = \sqrt[3]{\frac{-q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} }} and \displaystyle{ v = \sqrt[3]{\frac{-q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} }}

are chosen to preserve uv = -\frac{p}{3},

\displaystyle{ \zeta = \text{cis}(2\pi/3) = \cos(2\pi/3) + i\sin(2\pi/3) = \frac{-1+i\sqrt{3}}{2} }, and \displaystyle{ \zeta^2 = \overline{\zeta} = \frac{-1 - i\sqrt{3}}{2} }.

Suppose now that the coefficients of our cubic are real. The behavior of the roots is largely determined by the sign of the expression \delta = \frac{q^2}{4} + \frac{p^3}{27}. (The discriminant of a cubic X^3 + pX + q is D = -4p^3 - 27q^2. Our \delta is D/(-4\cdot27). We’ll discuss discriminants some other time.) There are two cases to consider.

First, suppose \delta = \frac{q^2}{4} + \frac{p^3}{27} > 0. Then u, v are real. It follows that there is one real root and two complex conjugate roots. Why? The roots of our cubic are u+v (real) and u\zeta + v\zeta^2 and \overline{u\zeta + v\zeta^2} = u\zeta^2 + v\zeta (a complex conjugate pair).

Second, suppose \delta = \frac{q^2}{4} + \frac{p^3}{27} \leq 0. Here there are three real roots. Why? We have

\displaystyle{ -\frac{q}{2} \pm \sqrt{\delta} = -\frac{q}{2} \pm i\sqrt{|\delta|} = r \text{cis}(\pm\theta)  }

where

\displaystyle{ r = \left|\frac{-q}{2} \pm i\sqrt{-\left(\frac{q^2}{4} + \frac{p^3}{27}\right)}\right| = \sqrt{ \frac{q^2}{4} - \frac{q^2}{4} - \frac{p^3}{27} } = \sqrt{ -\frac{p^3}{27} } = \left( -\frac{p}{3} \right)^{3/2} }

and \theta can be calculated if we really need it. Now set u = r^{1/3}\text{cis}(\theta/3) and v = r^{1/3}\text{cis}(-\theta/3) and note we have uv = r^{2/3} = -p/3 as required. Thus our roots are

\displaystyle{u + v} = r^{1/3}\text{cis}(\theta/3) + r^{1/3}\text{cis}(-\theta/3) = 2r^{1/3}\cos\left(\frac{\theta}{3}\right),

\displaystyle{u\zeta + v\zeta^2} = r^{1/3}\text{cis}(\theta/3+2\pi/3) + r^{1/3}\text{cis}(-\theta/3-2\pi/3) = 2r^{1/3}\cos\left(\frac{\theta}{3} + \frac{2\pi}{3}\right), and

\displaystyle{u\zeta^2 + v\zeta} = r^{1/3}\text{cis}(\theta/3+4\pi/3) + r^{1/3}\text{cis}(-\theta/3-4\pi/3) = 2r^{1/3}\cos\left(\frac{\theta}{3} + \frac{4\pi}{3}\right).

Note that when \delta = 0 we have \theta = 0 \text{ or } \pi and in this case (at least) two of our roots above will coincide.
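Here is a sketch of the three-real-roots case in code. The function name is invented for this example, and \theta is computed with atan2 from the real part -q/2 and imaginary part \sqrt{|\delta|}, which is one reasonable way to "calculate it if we really need it".

```python
import math

def three_real_roots(p, q):
    """Real roots of X^3 + pX + q when delta = q^2/4 + p^3/27 <= 0."""
    delta = q * q / 4.0 + p ** 3 / 27.0
    if delta > 0:
        raise ValueError("delta > 0: one real root and a complex conjugate pair")
    r = math.sqrt(-(p ** 3) / 27.0)                  # r = (-p/3)^(3/2)
    theta = math.atan2(math.sqrt(-delta), -q / 2.0)  # argument of -q/2 + i sqrt(|delta|)
    scale = 2.0 * r ** (1.0 / 3.0)
    return [scale * math.cos(theta / 3.0 + 2.0 * math.pi * k / 3.0)
            for k in range(3)]

p, q = -3.0, 1.0
for x in three_real_roots(p, q):
    print(x, x ** 3 + p * x + q)  # second column should be essentially zero
```

With p = -3 and q = 2 we have \delta = 0, and the function returns 1, -2, 1, showing the repeated root mentioned above.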

The Cubic Formula

November 2, 2009

In this post we’ll derive a formula for the roots of

aX^3 + bX^2 + cX + d = 0

where the coefficients are any complex numbers and a \neq 0. Before doing this we need to recall a couple of facts from our earlier work.

First, we saw in an earlier post that every complex number z has n n-th roots and that if w is any one n-th root of z the complete set is

w, \zeta w, \zeta^2 w, \ldots, \zeta^{n-1} w

where \zeta = \text{cis}(2\pi/n) = \cos(2\pi/n) + i\sin(2\pi/n). There isn’t a natural way to choose one of these from among the others. (When z is a positive real number we can always choose the unique positive real n-th root.) For this reason the symbol \sqrt[n]{z} is ambiguous. Despite this, when convenient, we’ll use the symbol \sqrt[n]{z} to mean some n-th root of z.
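A minimal sketch of this fact in code: take whatever n-th root the complex power operator happens to return (Python returns the principal one) and multiply by powers of \zeta. The function name nth_roots is made up for the example.

```python
import cmath

def nth_roots(z, n):
    """All n-th roots of z: one root w times the powers of zeta = cis(2 pi / n)."""
    w = complex(z) ** (1.0 / n)            # some n-th root (the principal one)
    zeta = cmath.exp(2j * cmath.pi / n)
    return [w * zeta ** k for k in range(n)]

for root in nth_roots(-8, 3):
    print(root, root ** 3)  # each cube should be close to -8
```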

Second, to solve a quadratic equation aX^2 + bX + c = 0 we only need to extract a w = \sqrt{b^2-4ac} and then the roots are (-b \pm w)/(2a). Since every complex number has a square root it follows that every complex quadratic equation has complex roots. We’ll use this observation a couple of times in what follows.

Now to the cubic equation. Let f(X) = aX^3 + bX^2 + cX + d where the coefficients are complex and a \neq 0. Since we’re interested in the roots of f(X) = 0 we may suppose a = 1. Then by substituting we find

f(X-\frac{b}{3}) = (X-\frac{b}{3})^3 + b(X-\frac{b}{3})^2 + c(X-\frac{b}{3}) + d

= X^3 - 3X^2\frac{b}{3} + 3X\frac{b^2}{9} -\frac{b^3}{27} + b(X^2 -\frac{2b}{3}X + \frac{b^2}{9}) + c(X-\frac{b}{3}) + d

= X^3 - 3X^2\frac{b}{3} + bX^2 + \text{lower degree terms}

= X^3 + pX + q

for some complex numbers p, q.
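Carrying the expansion through (with a = 1) gives p = c - \frac{b^2}{3} and q = d - \frac{bc}{3} + \frac{2b^3}{27}. The sketch below checks these two expressions exactly on a sample cubic; the function names are invented for this illustration.

```python
from fractions import Fraction

def depress(b, c, d):
    """Coefficients (p, q) with f(X - b/3) = X^3 + pX + q for f = X^3 + bX^2 + cX + d."""
    b, c, d = Fraction(b), Fraction(c), Fraction(d)
    p = c - b * b / 3
    q = d - b * c / 3 + 2 * b ** 3 / 27
    return p, q

def monic_cubic(b, c, d, x):
    """Evaluate X^3 + bX^2 + cX + d at x."""
    return x ** 3 + b * x ** 2 + c * x + d

b, c, d = 3, 0, 2
p, q = depress(b, c, d)
print(p, q)  # -3 4
for x in (Fraction(0), Fraction(1), Fraction(-7, 2)):
    # The shifted cubic and the depressed cubic agree exactly.
    print(monic_cubic(b, c, d, x - Fraction(b, 3)) == x ** 3 + p * x + q)
```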

Thus it suffices to find the roots of f(X) = X^3 +pX+q. This is called a depressed cubic, by the way.

Now f(u+v) = (u+v)^3 + p(u+v) + q

= u^3 + 3u^2v + 3uv^2 + v^3 + p(u+v) + q

= u^3 + v^3 +3uv(u+v) + p(u+v) + q

= u^3 + v^3 + (3uv+p)(u+v) + q

so if we can find complex numbers u, v such that u^3 + v^3 = -q and uv = -\frac{p}{3} then u+v will be a root of f(X) = 0.

Notice that uv = -\frac{p}{3} implies u^3v^3 = -\frac{p^3}{27}. Then we see that u^3, v^3 are roots of the complex quadratic equation

(Y - u^3)(Y - v^3) = Y^2 - (u^3+v^3)Y + u^3v^3 = Y^2 + qY -\frac{p^3}{27}

Therefore, by the quadratic formula, we have

\displaystyle{u^3, v^3 = \frac{-q \pm \sqrt{q^2 + 4\frac{p^3}{27}}}{2}}

\displaystyle{ = \frac{-q}{2} \pm \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}

Now we extract cube roots of these quantities to find

\displaystyle{ u = \sqrt[3]{\frac{-q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} }} and \displaystyle{ v = \sqrt[3]{\frac{-q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} }}

where we are careful to choose our cube roots to preserve uv = -\frac{p}{3}.

Thus u+v is one root of our cubic. What are the other two roots? There are three cube roots of u^3 (namely the already chosen u and u\zeta, u\zeta^2) and three cube roots of v^3 (again these are v, v\zeta, v\zeta^2). In order to preserve the condition uv = -\frac{p}{3} we must choose them in the pairs (u\zeta, v\zeta^2) and (u\zeta^2, v\zeta).


To summarize: The complex cubic equation X^3 + pX + q = 0 has roots u + v, u\zeta + v\zeta^2, and u\zeta^2 + v\zeta where

\displaystyle{ u = \sqrt[3]{\frac{-q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} }} and \displaystyle{ v = \sqrt[3]{\frac{-q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} }}

are chosen to preserve uv = -\frac{p}{3} and

\displaystyle{ \zeta = \text{cis}(2\pi/3) = \cos(2\pi/3) + i\sin(2\pi/3) = \frac{-1+i\sqrt{3}}{2} }.
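The summary translates fairly directly into code. The following is a sketch, not a polished solver: the function name is made up, and instead of extracting two cube roots and then matching them, it picks one cube root u and sets v = -p/(3u), which is one simple way to preserve uv = -\frac{p}{3}.

```python
import cmath

ZETA = cmath.exp(2j * cmath.pi / 3)   # primitive cube root of unity

def depressed_cubic_roots(p, q):
    """Roots of X^3 + pX + q = 0 for complex p, q."""
    s = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u_cubed = -q / 2 + s
    if u_cubed == 0:
        u_cubed = -q / 2 - s          # use the other root of the quadratic in Y
    if u_cubed == 0:
        return [0j, 0j, 0j]           # only happens when p = q = 0
    u = complex(u_cubed) ** (1 / 3)   # some cube root of u^3
    v = -p / (3 * u)                  # forces uv = -p/3
    return [u + v, u * ZETA + v * ZETA ** 2, u * ZETA ** 2 + v * ZETA]

p, q = -3, 4
for x in depressed_cubic_roots(p, q):
    print(x, abs(x ** 3 + p * x + q))  # residuals should be tiny
```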


Example: Find the roots of f(X) = X^3 + 3X^2 + 2. To depress the equation we instead consider f(X-1) = X^3 - 3X + 4. Here p = -3 and q = 4 so

\displaystyle{ \frac{p^3}{27} + \frac{q^2}{4} = \frac{-27}{27} + \frac{16}{4} = -1 + 4 = 3 }

and thus

u^3 = -2 + \sqrt{3} and v^3 = -2 - \sqrt{3}. Now these numbers are real and so have unique, unambiguous real cube roots

u = \displaystyle{ \sqrt[3]{-2 + \sqrt{3}} } and v = \displaystyle{ \sqrt[3]{-2 - \sqrt{3}} }

and these satisfy uv = -\frac{p}{3}. Thus the roots of f(X-1) = 0 are u + v, u\zeta + v\zeta^2, u\zeta^2 + v\zeta. Since each root r of f(X-1) = 0 gives a root r - 1 of f(X) = 0, the roots of f(X) = 0 are -1 + u + v, -1 + u\zeta + v\zeta^2, -1 + u\zeta^2 + v\zeta.

Thoughts on Teaching

October 30, 2009

Last night I was at a meeting between elementary school and college-level teachers. An elementary school teacher was explaining all of the different methods she uses and the many little steps required to explain some basic concept to her students. A college-level teacher was asked to comment on this and he said:

“Your goal is to reach every single student and make sure they achieve the material to the best of their individual abilities. My goal is to fail the weakest students in the first month.”

There was a great round of laughter and the conversation moved on.

His comment, though perhaps a joke, does point to a real difference between the goal and focus of teaching at various levels. Regardless of the level of instruction, we have the material to be presented and the group of learners to receive it. Now assuming the teacher is competent and the learners are willing, what are the differences between teaching in elementary school and teaching in college?

In the lower grades the teacher’s primary loyalty is to the student. Yes, there is important and fundamental material to be taught but it is the individual student receiving this material which is the focus. The good teacher will try various techniques and styles to communicate. She may even vary these from student to student. She has an entire school year and many opportunities within it to get each student to master the material to the best of their ability, readiness, and willingness. (Moreover, the same material will be presented in a slightly more sophisticated fashion the next year.)

At the college level things are very different. There is limited time to cover large amounts of material. The material must be mastered on this exposure so that the student can go on to the next section of this course and to the courses which follow. A good teacher here keeps the students in mind but his primary loyalty is to the material itself. The teacher of Calculus I, say, must teach the material at a certain pace and level for the class to cover what is required and for the course to be at the appropriate level. The teacher can try to give additional help to individual students — office hours! — but there can be little flexibility for individual needs.

The material taught in elementary school is crucial for the basic education of everyone. So here any failure robs the student of something essential for his later life. We give the students a half-dozen years or so to learn to read, to calculate, and to understand a little of the workings of the world. College is different; college is entirely optional. A student who fails Calculus may take it again. And if the student is ultimately unable to pass, the consequences are minimal.
