Math Tutorials and More
by George

Appendix: Kepler's Laws Derived from Newton's Laws

In this appendix we will derive Kepler's laws from Newton's laws of motion and his law of universal gravitation. Below are the three laws that were derived empirically by Kepler.

  • Kepler's First Law: A planet moves in a plane along an elliptical orbit with the sun at one focus.
  • Kepler's Second Law: The position vector from the sun to a planet sweeps out area at a constant rate.
  • Kepler's Third Law: The square of the period of a planet around the sun is proportional to the cube of the semi-major axis length.
Mathematical Preliminaries

Consider a Cartesian coordinate system with the sun at the origin. Let $(x,y,z)$ denote the position of a planet. Clearly $x$, $y$, and $z$ are functions of the time $t$. We define the position vector $\boldsymbol{r}$, the velocity vector $\boldsymbol{v}$, and the acceleration vector $\boldsymbol{a}$ by

$$\boldsymbol{r}=x\boldsymbol{i}+y\boldsymbol{j}+z\boldsymbol{k},$$ $$\boldsymbol{v}=\dot{x}\boldsymbol{i}+\dot{y}\boldsymbol{j}+\dot{z}\boldsymbol{k},$$ $$\boldsymbol{a}=\ddot{x}\boldsymbol{i}+\ddot{y}\boldsymbol{j}+\ddot{z}\boldsymbol{k}$$

Here the dots represent differentiation with respect to time and $\boldsymbol{i}$, $\boldsymbol{j}$, $\boldsymbol{k}$ are the unit vectors in the $x$, $y$, $z$ directions respectively. Newton's law of motion can be written

\begin{equation} \boldsymbol{F}=m\boldsymbol{a} \tag{1} \end{equation}

where $m$ is the mass of the planet and $\boldsymbol{F}$ is the force on the planet. Let $\hat{\boldsymbol{r}}$ be a unit vector in the $\boldsymbol{r}$ direction. Then Newton's law of gravitation applied to the planet and the sun is given by

\begin{equation} \boldsymbol{F}=-\frac{GMm}{r^2}\,\hat{\boldsymbol{r}}=-\frac{GMm}{r^3}\,\boldsymbol{r} \tag{2} \end{equation}

where $G$ is a constant, $M$ is the mass of the sun, and $r$ is the magnitude of $\boldsymbol{r}$. Combining equations (1) and (2), we get

\begin{equation} \boldsymbol{a}=\ddot{\boldsymbol{r}}=-\frac{GM}{r^3}\,\boldsymbol{r} \tag{3} \end{equation}
Planet moves in a plane

By the product rule for differentiation

\begin{equation*} \frac{d}{dt}\,(\boldsymbol{r}\times\boldsymbol{v})=\boldsymbol{v}\times\boldsymbol{v}+\boldsymbol{r}\times\boldsymbol{a}=0 \end{equation*}

since $\boldsymbol{v}\times\boldsymbol{v}=\boldsymbol{0}$ and, by equation (3), $\boldsymbol{a}$ is parallel to $\boldsymbol{r}$ (it points in the opposite direction). Here the symbol $\times$ represents the vector cross-product. Thus, the vector

\begin{equation*} \boldsymbol{h}=\boldsymbol{r}\times\boldsymbol{v} \end{equation*}

is a constant. It follows that $\boldsymbol{r}$ and $\boldsymbol{v}$ lie in the plane orthogonal to $\boldsymbol{h}$. We will choose our coordinate system so that $\boldsymbol{k}$ is in the direction $\boldsymbol{h}$. Thus,

\begin{equation} \boldsymbol{h}=\boldsymbol{r}\times\boldsymbol{v}=h\boldsymbol{k}\qquad \text{where $h\gt 0$}. \tag{4} \end{equation}
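
The conservation of $\boldsymbol{h}$ can also be checked numerically. The following is a minimal sketch, not part of the derivation: it integrates equation (3) with a classical Runge-Kutta step and monitors $\boldsymbol{r}\times\boldsymbol{v}$. The units ($GM=1$), the initial conditions, the step size, and all function names are assumptions chosen only for illustration.

```python
import math

GM = 1.0  # assumption: units chosen so that G*M = 1

def accel(r):
    """Right-hand side of equation (3): a = -GM r / |r|^3."""
    d = math.sqrt(sum(c * c for c in r))
    return tuple(-GM * c / d**3 for c in r)

def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def shifted(a, b, s):
    """Return a + s*b componentwise."""
    return tuple(x + s * y for x, y in zip(a, b))

def rk4_step(r, v, dt):
    """One classical Runge-Kutta step for r' = v, v' = accel(r)."""
    k1r, k1v = v, accel(r)
    k2r, k2v = shifted(v, k1v, dt / 2), accel(shifted(r, k1r, dt / 2))
    k3r, k3v = shifted(v, k2v, dt / 2), accel(shifted(r, k2r, dt / 2))
    k4r, k4v = shifted(v, k3v, dt), accel(shifted(r, k3r, dt))
    r_new = tuple(r[i] + dt / 6 * (k1r[i] + 2 * k2r[i] + 2 * k3r[i] + k4r[i])
                  for i in range(3))
    v_new = tuple(v[i] + dt / 6 * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i])
                  for i in range(3))
    return r_new, v_new

# Hypothetical initial conditions, deliberately not confined to the xy plane.
r, v = (1.0, 0.0, 0.0), (0.0, 0.8, 0.3)
h0 = cross(r, v)

max_drift = 0.0
max_out_of_plane = 0.0
for _ in range(5000):
    r, v = rk4_step(r, v, 1e-3)
    h = cross(r, v)
    max_drift = max(max_drift, max(abs(h[i] - h0[i]) for i in range(3)))
    # r . h0 vanishes when r lies in the plane orthogonal to h0.
    max_out_of_plane = max(max_out_of_plane,
                           abs(sum(r[i] * h0[i] for i in range(3))))

print("h at t=0:", h0)
print("largest change in any component of h:", max_drift)
print("largest |r . h|:", max_out_of_plane)
```

Both printed deviations should be at the level of the integrator's truncation error, which is consistent with the motion staying in the plane orthogonal to $\boldsymbol{h}$.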
Kepler's Second Law

Figure 10 shows the area swept out by the position vector in a small increment of time. $\Delta\theta$ is the small change of angle.

Figure 10: Area swept out during small time increment

The area OAB is approximately equal to the area of the right triangle OAC for small $\Delta\theta$. Since the length of the line AC is approximately $r\Delta\theta$ and the length of the line OC is approximately $r$, we have

\begin{equation*} \Delta A\doteq \tfrac{1}{2}\,r^2\Delta\theta. \end{equation*}

Letting the time increment approach zero, we see that

\begin{equation} \dot{A}=\tfrac{1}{2}\,r^2\dot{\theta}. \tag{5} \end{equation}

Since the planet moves in the $xy$ plane, we have

\begin{equation} \boldsymbol{r}=x\boldsymbol{i}+y\boldsymbol{j}=(r\cos\theta)\boldsymbol{i}+(r\sin\theta)\boldsymbol{j} \tag{6} \end{equation}

where the polar coordinates $r$ and $\theta$ are functions of $t$. The time derivative of $\boldsymbol{r}$ is given by

\begin{equation} \begin{split} \boldsymbol{v}=(\dot{r}\cos\theta-&r\sin\theta\;\dot{\theta})\boldsymbol{i}+\\ &(\dot{r}\sin\theta+r\cos\theta\;\dot{\theta})\boldsymbol{j}. \end{split}\tag{7} \end{equation}

Substituting equations (6) and (7) into equation (4), we obtain

\begin{align} h&=r\cos\theta(\dot{r}\sin\theta+r\cos\theta\,\dot{\theta})-\\ &\qquad\qquad\qquad r\sin\theta(\dot{r}\cos\theta-r\sin\theta\,\dot{\theta})\\ &=r^2\dot{\theta}. \tag{8} \end{align}

Here we have used the fact that $\boldsymbol{i}\times\boldsymbol{j}=\boldsymbol{k}$ and $\boldsymbol{j}\times\boldsymbol{i}=-\boldsymbol{k}$. It follows from equations (5) and (8) that

\begin{equation*} \dot{A}=\tfrac{1}{2}\,r^2\dot{\theta}=h/2 =\text{constant}. \end{equation*} This is Kepler's second law.
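
As a numerical illustration of the equal-area property, the sketch below integrates equation (3) in the plane and adds up the areas of the thin triangles swept out in each time step, grouped into equal time windows. It is a sketch under stated assumptions: the units ($GM=1$), the initial conditions, and the velocity Verlet integrator are choices made here, and plane vectors are stored as Python complex numbers ($x+iy$) purely for brevity.

```python
GM = 1.0  # assumption: units chosen so that G*M = 1

def accel(z):
    """Equation (3) in the plane, with the vector (x, y) stored as x + iy."""
    return -GM * z / abs(z) ** 3

dt = 1e-3
steps_per_window = 1000

# Hypothetical initial conditions: position (1, 0), velocity (0, 1.2).
z, w = 1.0 + 0.0j, 0.0 + 1.2j
h = (z.conjugate() * w).imag          # x*vy - y*vx, as in equation (4)

areas = []
for window in range(10):
    area = 0.0
    for _ in range(steps_per_window):
        a0 = accel(z)
        z_new = z + w * dt + 0.5 * a0 * dt * dt      # velocity Verlet: position
        w = w + 0.5 * (a0 + accel(z_new)) * dt       # velocity Verlet: velocity
        # Signed area of the triangle with vertices O, z, z_new.
        area += 0.5 * (z.conjugate() * z_new).imag
        z = z_new
    areas.append(area)

expected = 0.5 * h * steps_per_window * dt           # (h/2) * window length
print("area per window:", areas)
print("h/2 times window length:", expected)
```

For this particular integrator each triangle has area $\tfrac{1}{2}h\,\Delta t$ up to rounding, because the acceleration is always parallel to the position vector, so every window should report the same area even though the planet's speed varies along the orbit.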
Definition and properties of an ellipse

Before we look at the derivation of Kepler's first law, we need to define what we mean by an ellipse, and look at some of its properties. One common way of drawing an ellipse is to pin the two ends of a string, place a pencil in the loop, and trace a curve while keeping the string taut. Clearly the resulting curve has the property that the sum of the distances from any point on the curve to the two fixed points is a constant (the length of the string). The resulting curve is called an ellipse and the two fixed points are called the foci of the ellipse. Figure 11 shows an ellipse in which the foci are at $(-c,0)$ and $(c,0)$, and $2a$ corresponds to the length of the string.

Figure 11: Drawing of an ellipse

The construction of the ellipse can be represented mathematically as follows

\begin{equation} \sqrt{(x+c)^2+y^2}+\sqrt{(x-c)^2+y^2}=2a \tag{9} \end{equation}

where $a\gt c\gt 0$.

This equation can be rearranged as follows

\begin{equation*} \sqrt{(x+c)^2+y^2}=2a-\sqrt{(x-c)^2+y^2}. \end{equation*}

Squaring both sides, we get

\begin{equation*} \begin{split} (x+c)^2+y^2=4a^2-4a&\sqrt{(x-c)^2+y^2}+\\&\qquad\qquad (x-c)^2+y^2. \end{split} \end{equation*}

Solving for the square root term, we obtain

\begin{align*} \sqrt{(x-c)^2+y^2}&=\frac{1}{4a}\,[4a^2+(x-c)^2-(x+c)^2]\\ &=a-\frac{c}{a}\,x. \end{align*}

Squaring again, we obtain

\begin{equation*} x^2-2cx+c^2+y^2=a^2-2cx+\frac{c^2}{a^2}\,x^2 \end{equation*}

or equivalently

\begin{align*} \left(1-\frac{c^2}{a^2}\,\right)x^2+y^2&=(a^2-c^2)\frac{x^2}{a^2}+y^2\\ &=a^2-c^2. \end{align*}

Dividing through by $a^2-c^2$, we obtain

\begin{equation} \frac{x^2}{a^2}+\frac{y^2}{a^2-c^2}=1.\tag{10} \end{equation}

We define the eccentricity $e$ of the ellipse by $e=c/a$. We also define $b$ by

\begin{equation} b=a\sqrt{1-e^2}=\sqrt{a^2-c^2}. \tag{11} \end{equation}

Thus, equation (10) can be written in the standard form

\begin{equation*} \frac{x^2}{a^2}+\frac{y^2}{b^2}=1. \end{equation*}

This is the form that is usually specified for an ellipse. It is easy to see that $a$ is one-half the length of the ellipse's major axis and $b$ is one-half the length of the ellipse's minor axis.
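
The equivalence between the string construction in equation (9) and the standard form can be spot-checked numerically. In this small sketch the values $a=5$ and $c=3$ are hypothetical, chosen so that $b=4$. Points are generated on the standard-form ellipse and the sum of their distances to the two foci is compared with $2a$.

```python
import math

a, c = 5.0, 3.0               # hypothetical values with a > c > 0
b = math.sqrt(a * a - c * c)  # equation (11); here b = 4

worst = 0.0
for i in range(360):
    t = math.radians(i)
    # This point satisfies x^2/a^2 + y^2/b^2 = 1 by construction.
    x, y = a * math.cos(t), b * math.sin(t)
    # Left-hand side of equation (9): total distance to the two foci.
    string_length = math.hypot(x + c, y) + math.hypot(x - c, y)
    worst = max(worst, abs(string_length - 2 * a))

print("b =", b, " eccentricity e =", c / a)
print("largest deviation of the distance sum from 2a:", worst)
```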

An ellipse also has a simple form in polar coordinates if we take our origin to be one of the foci. This situation is pictured in Figure 12.

Figure 12: An ellipse in polar coordinates

Using the definition of an ellipse in terms of the sum of the distances from the two foci being constant, we can write

\begin{equation} r+\sqrt{(r\cos\theta+2c)^2+r^2\sin^2\theta}=2a.\tag{12} \end{equation}

Solving for the square root term and expanding the square terms, we get

\begin{equation*} \sqrt{r^2+4rc\cos\theta+4c^2}=2a-r. \end{equation*}

Squaring this equation gives

\begin{equation*} r^2+4rc\cos\theta+4c^2=4a^2-4ar+r^2 \end{equation*}

or equivalently

\begin{equation*} (a+c\cos\theta)r=a^2-c^2. \end{equation*}

Solving for $r$, we obtain

\begin{align} r&=\frac{a^2-c^2}{a+c\cos\theta}=\frac{a^2(1-\frac{c^2}{a^2}\,)}{a(1+\frac{c}{a}\;\cos\theta)}\\ &=\frac{a(1-e^2)}{1+e\cos\theta}=\frac{k}{1+e\cos\theta}\tag{13} \end{align}

where $k=a(1-e^2)$.

Equation (13) is the desired representation of the ellipse in polar coordinates.
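
A quick numerical check of equation (13) is sketched below, with hypothetical values $a=5$ and $e=0.6$. It places one focus at the origin and the other at $(-2c,0)$, as in equation (12), and confirms that points generated from the polar form keep the sum of the two focal distances equal to $2a$. It also evaluates the nearest and farthest distances from the origin, which should be $a-c$ and $a+c$.

```python
import math

a, e = 5.0, 0.6          # hypothetical semi-major axis and eccentricity
c = a * e                # distance from the center to each focus
k = a * (1 - e * e)      # constant appearing in equation (13)

worst = 0.0
for i in range(360):
    theta = math.radians(i)
    r = k / (1 + e * math.cos(theta))             # equation (13)
    x, y = r * math.cos(theta), r * math.sin(theta)
    other_focus_distance = math.hypot(x + 2 * c, y)
    worst = max(worst, abs(r + other_focus_distance - 2 * a))

r_min = k / (1 + e)      # theta = 0
r_max = k / (1 - e)      # theta = pi
print("largest deviation of the distance sum from 2a:", worst)
print("r_min =", r_min, " r_max =", r_max)
```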

We can also derive our original definition of an ellipse from the polar form. Suppose $r$ and $\theta$ satisfy

\begin{equation} r=\frac{k}{1+e\cos\theta}\tag{14} \end{equation}

where $k\gt 0$ and $0\lt e\lt 1$.

We define $a$ and $c$ by

\begin{equation} a=\frac{k}{1-e^2}=\frac{k}{(1-e)(1+e)}\qquad c=ae.\tag{15} \end{equation}

It follows from equation (14) that $r$ has a maximum value of $\frac{k}{1-e}$ at $\theta=\pi$. Thus,

\begin{equation*} r\leq \frac{k}{1-e}=a(1+e)\lt 2a. \end{equation*}

Equation (14) can be rearranged as follows

\begin{equation*} (1+e\cos\theta)r=k=a(1-e^2). \end{equation*}

Since $e=c/a$, this equation can be written

\begin{equation*} \Bigl(1+\frac{c}{a}\,\cos\theta\Bigr)r=a\Bigl(1-\frac{c^2}{a^2}\,\Bigr)=\frac{a^2-c^2}{a}. \end{equation*}

Multiplying both sides by $a$, we obtain

\begin{equation*} (a+c\cos\theta)r=a^2-c^2. \end{equation*}

Multiplying this equation by four and adding $r^2=(\sin^2\theta+\cos^2\theta)r^2$ to both sides, we obtain

\begin{equation*} \begin{split} (\cos^2\theta+\sin^2\theta)r^2+4ar+&4cr\cos\theta\\ &=4a^2-4c^2+r^2. \end{split} \end{equation*}

This equation can be rearranged as

\begin{equation*} \begin{split} r^2\cos^2\theta+4cr\cos\theta+4c^2&+r^2\sin^2\theta\\ &=r^2-4ar+4a^2. \end{split} \end{equation*}

or equivalently

\begin{equation*} (r\cos\theta+2c)^2+r^2\sin^2\theta=(2a-r)^2. \end{equation*}

Since $r\lt 2a$, the quantity $2a-r$ is positive. Taking the positive square root of both sides and moving $r$ to the left-hand side, we obtain

\begin{equation*} r+\sqrt{(r\cos\theta+2c)^2+r^2\sin^2\theta}=2a \end{equation*}

which is the defining equation for the ellipse pictured in Figure 12 [see equation (12)]. Thus, equation (14) defines an ellipse with the origin at one focus. Let $b=a\sqrt{1-e^2}$. Then it follows from equation (15) that

\begin{equation} b=\frac{k}{\sqrt{1-e^2}} \tag{16} \end{equation}
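
As a quick worked example (the numbers are chosen here for convenience and do not describe any particular orbit), take $k=3.2$ and $e=0.6$ in equation (14). Equations (15) and (16) then give

\begin{equation*} a=\frac{3.2}{1-0.36}=5,\qquad c=ae=3,\qquad b=\frac{3.2}{\sqrt{0.64}}=4, \end{equation*}

which agrees with $b=\sqrt{a^2-c^2}=\sqrt{25-9}=4$ from equation (11).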
Hamilton's Theorem

In this section we will show that the velocity vector $\boldsymbol{v}$ moves on a circle. Since $\boldsymbol{r}=r(\cos\theta\,\boldsymbol{i}+\sin\theta\,\boldsymbol{j})$ by equation (6), equation (3) can be written

\begin{equation} \dot{\boldsymbol{v}}=\boldsymbol{a}=-\frac{GM}{r^2}\,(\cos\theta\,\boldsymbol{i}+\sin\theta\boldsymbol{j}).\tag{17} \end{equation}

Combining equations (8) and (17), we obtain

\begin{equation} \dot{\boldsymbol{v}}=-\frac{GM}{h}\,\dot{\theta}\,(\cos\theta\,\boldsymbol{i}+\sin\theta\boldsymbol{j}).\tag{18} \end{equation}

By the chain rule for differentiation

\begin{equation} \dot{\boldsymbol{v}}=\frac{d\boldsymbol{v}}{d\theta}\,\dot{\theta}.\tag{19} \end{equation}

Since $\dot{\theta}=h/r^2\gt 0$ by equation (8), it follows from equations (18) and (19) that

\begin{equation*} \frac{d\boldsymbol{v}}{d\theta}=-\frac{GM}{h}\,(\cos\theta\,\boldsymbol{i}+\sin\theta\boldsymbol{j}). \end{equation*}

Integrating this equation, we obtain

\begin{equation} \boldsymbol{v}=\frac{GM}{h}\,(-\sin\theta\,\boldsymbol{i}+\cos\theta\boldsymbol{j})+\boldsymbol{v}_0 \tag{20} \end{equation}

where $\boldsymbol{v}_0$ is a constant. It follows that $|\boldsymbol{v}-\boldsymbol{v}_0|=GM/h$, i.e., $\boldsymbol{v}$ moves on the circle centered at $\boldsymbol{v}_0$ with radius $GM/h$.
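
The circle traced by the velocity can be observed in a simulated orbit. The sketch below is illustrative only: it assumes units with $GM=1$, hypothetical initial conditions, and a Runge-Kutta integrator, and it stores plane vectors as complex numbers so that $-\sin\theta\,\boldsymbol{i}+\cos\theta\,\boldsymbol{j}$ becomes $i\,z/|z|$. At every step it solves equation (20) for $\boldsymbol{v}_0$ and records the result, which should not change along the orbit.

```python
GM = 1.0  # assumption: units chosen so that G*M = 1

def accel(z):
    """Equation (3) in the plane, with the vector (x, y) stored as x + iy."""
    return -GM * z / abs(z) ** 3

def rk4_step(z, w, dt):
    """One classical Runge-Kutta step for z' = w, w' = accel(z)."""
    k1z, k1w = w, accel(z)
    k2z, k2w = w + dt / 2 * k1w, accel(z + dt / 2 * k1z)
    k3z, k3w = w + dt / 2 * k2w, accel(z + dt / 2 * k2z)
    k4z, k4w = w + dt * k3w, accel(z + dt * k3z)
    z_new = z + dt / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
    w_new = w + dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return z_new, w_new

# Hypothetical initial conditions: position (1, 0), velocity (0, 1.2).
z, w = 1.0 + 0.0j, 0.0 + 1.2j
h = (z.conjugate() * w).imag            # x*vy - y*vx, as in equation (4)

centers = []
radii = []
for _ in range(15000):                  # roughly one full period for these values
    z, w = rk4_step(z, w, 1e-3)
    tangential = 1j * z / abs(z)        # (-sin(theta), cos(theta)) as a complex number
    v0 = w - (GM / h) * tangential      # equation (20) solved for v_0
    centers.append(v0)
    radii.append(abs(w - centers[0]))   # distance of v from the first estimate of v_0

spread = max(abs(c - centers[0]) for c in centers)
print("estimated v_0:", centers[0])
print("largest change in the estimate of v_0:", spread)
print("radius range:", min(radii), max(radii), " GM/h =", GM / h)
```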

Kepler's first law

We choose our coordinate system so that $\boldsymbol{j}$ is in the direction $\boldsymbol{v}_0$, i.e.,

\begin{equation} \boldsymbol{v}_0=v_0\boldsymbol{j}\qquad\text{where $v_0\gt 0$}.\tag{21} \end{equation}

Thus, equation (20) becomes

\begin{equation} \boldsymbol{v}=\frac{GM}{h}\,[-\sin\theta\,\boldsymbol{i}+(\cos\theta+e)\boldsymbol{j}] \tag{22} \end{equation}

where $e=v_0h/GM$. Substituting equation (22) into equation (4) and using equation (6), we get

\begin{align*} h\boldsymbol{k}&=\boldsymbol{r}\times\boldsymbol{v}\\ &=\frac{GMr}{h}\,[\sin^2\theta+(\cos^2\theta+e\cos\theta)]\boldsymbol{k}\\ &=\frac{GMr}{h}\,(1+e\cos\theta)\boldsymbol{k}. \end{align*}

and hence

\begin{equation} r=\frac{h^2}{GM}\,\frac{1}{1+e\cos\theta}=\frac{k}{1+e\cos\theta} \tag{23} \end{equation}

where $k=h^2/GM$. In order for $r$ to remain finite for all $\theta$, we must have $0\leq e\lt 1$ (the limiting case $e=0$, in which $\boldsymbol{v}_0=\boldsymbol{0}$, gives a circle of radius $k$). By equation (14), equation (23) is the equation of an ellipse in polar coordinates with the origin at one focus. This completes the proof of Kepler's first law.
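
Equation (23) can be compared with a directly integrated orbit. In the sketch below, everything numerical is an assumption made for illustration: units with $GM=1$, a Runge-Kutta integrator, and initial conditions with the planet on the positive $x$ axis moving in the $+\boldsymbol{j}$ direction, so that $\boldsymbol{v}_0$ points along $\boldsymbol{j}$ as required by equation (21). Plane vectors are stored as complex numbers.

```python
import cmath
import math

GM = 1.0  # assumption: units chosen so that G*M = 1

def accel(z):
    """Equation (3) in the plane, with the vector (x, y) stored as x + iy."""
    return -GM * z / abs(z) ** 3

def rk4_step(z, w, dt):
    """One classical Runge-Kutta step for z' = w, w' = accel(z)."""
    k1z, k1w = w, accel(z)
    k2z, k2w = w + dt / 2 * k1w, accel(z + dt / 2 * k1z)
    k3z, k3w = w + dt / 2 * k2w, accel(z + dt / 2 * k2z)
    k4z, k4w = w + dt * k3w, accel(z + dt * k3z)
    z_new = z + dt / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
    w_new = w + dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return z_new, w_new

# Hypothetical initial conditions: position (1, 0), velocity (0, 1.2).
z, w = 1.0 + 0.0j, 0.0 + 1.2j
h = (z.conjugate() * w).imag              # equation (4)
v0 = w - (GM / h) * 1j * z / abs(z)       # equation (20) solved for v_0
e = abs(v0) * h / GM                      # eccentricity, as defined after (22)
k = h * h / GM                            # constant in equation (23)

max_err = 0.0
for _ in range(15000):                    # roughly one full period for these values
    z, w = rk4_step(z, w, 1e-3)
    theta = cmath.phase(z)                # polar angle of the planet
    predicted_r = k / (1 + e * math.cos(theta))
    max_err = max(max_err, abs(abs(z) - predicted_r))

print("e =", e, " k =", k)
print("largest |r_simulated - k/(1 + e cos(theta))|:", max_err)
```

For these initial conditions the sketch gives $e=0.44$ and $k=1.44$, and the simulated distance should track the polar formula to within the integrator's truncation error.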

Kepler's third law

Since the rate that area is swept out by the position vector is the constant $h/2$, it follows that

\begin{equation} A=hT/2 \tag{24} \end{equation}

where $T$ is the period of the motion and $A$ is the area of the ellipse. Since translation doesn't change the area, we can consider the area of the ellipse

\begin{equation} \frac{x^2}{a^2}+\frac{y^2}{b^2}=1.\tag{25} \end{equation}

We will calculate the area of the first quadrant ($x\geq 0$, $y\geq 0$) and multiply by four. Solving for $y$ as a function of $x$ from equation (25), we obtain

\begin{equation*} y=b\sqrt{1-x^2/a^2},\qquad 0\leq x \leq a. \end{equation*}

Thus, the area $A$ is given by

\begin{equation} A=4b\int_0^a \sqrt{1-x^2/a^2}\,dx.\tag{26} \end{equation}

If we make the change of variables $x=a\sin\phi$ ($dx=a\cos\phi\,d\phi$) in the integral, we obtain

\begin{align} A&=4ab\int_0^{\pi/2} \cos^2\phi\,d\phi\\ &=4ab\int_0^{\pi/2}\frac{1+\cos 2\phi}{2}\,d\phi\\ &=\pi ab.\tag{27} \end{align}
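
The value $\pi ab$ can be confirmed by numerical quadrature. The sketch below uses hypothetical semi-axes $a=5$, $b=4$ and a simple midpoint rule (a choice made here, not part of the derivation), applied both to the integral in equation (26) and to the integral obtained after the substitution $x=a\sin\phi$.

```python
import math

a, b = 5.0, 4.0     # hypothetical semi-axes
n = 100000          # number of midpoint-rule subintervals

# Midpoint rule applied directly to equation (26).
dx = a / n
area_x = 4 * b * dx * sum(math.sqrt(1 - ((i + 0.5) * dx / a) ** 2)
                          for i in range(n))

# Midpoint rule applied after the substitution x = a sin(phi).
dphi = (math.pi / 2) / n
area_phi = 4 * a * b * dphi * sum(math.cos((i + 0.5) * dphi) ** 2
                                  for i in range(n))

print("from equation (26):      ", area_x)
print("after the substitution:  ", area_phi)
print("pi * a * b:              ", math.pi * a * b)
```

The substituted integrand is smooth, so its estimate converges much faster than the direct one, whose integrand has an infinite slope at $x=a$. Both should agree with $\pi ab$.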

Substituting $A=\pi ab$ into equation (24) and solving for $T$, we obtain

\begin{equation*} T=\frac{2\pi ab}{h} \end{equation*}

and hence

\begin{equation} T^2=\frac{4\pi^2a^2b^2}{h^2}.\tag{28} \end{equation}

Using equations (15) and (16) along with the relation $k=h^2/GM$, we can write the expression for $T^2$ in equation (28) as follows

\begin{align} T^2&=\frac{4\pi^2k^4}{(1-e^2)^3h^2}=\frac{4\pi^2ka^3}{h^2}\\ &=\frac{4\pi^2a^3}{GM}.\tag{29} \end{align}

Equation (29) is Kepler's third law.
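
To see equation (29) at work, the following sketch evaluates the period for two planets. The numerical constants are standard reference values assumed here and do not come from the derivation above: the sun's gravitational parameter $GM$, the astronomical unit, and an approximate semi-major axis for Mars. The function name `period_days` is likewise only illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, assumed reference value for G*M of the sun
AU = 1.495978707e11         # m, assumed length of one astronomical unit
SECONDS_PER_DAY = 86400.0

def period_days(a_metres):
    """Equation (29) solved for T, converted from seconds to days."""
    return 2 * math.pi * math.sqrt(a_metres ** 3 / GM_SUN) / SECONDS_PER_DAY

a_earth = 1.0 * AU
a_mars = 1.523679 * AU      # assumed approximate semi-major axis of Mars

t_earth = period_days(a_earth)
t_mars = period_days(a_mars)

# Kepler's third law: T^2 / a^3 is the same for both planets.
ratio_check = (t_mars / t_earth) ** 2 / (a_mars / a_earth) ** 3

print("Earth period in days:", t_earth)
print("Mars period in days: ", t_mars)
print("(T_mars/T_earth)^2 / (a_mars/a_earth)^3 =", ratio_check)
```

With these inputs the periods come out close to one year for the earth and a little under 1.9 years for Mars. Note that the derivation treats the sun as fixed and neglects the planet's own mass, so small discrepancies from observed periods are to be expected.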