The Fundamental Theorem of Morse Theory
These are notes for a presentation I had to give for my Riemannian Geometry class. They follow Chapters 11-17 of Milnor's book Morse Theory very closely.
Morse Functions
Let $M$ be a manifold and $f:M \to \R$ be a smooth function. A helpful example to keep in mind is the height function of a surface immersed in $\R^3$, the restriction of $f(x,y,z) = z$ to $M \subseteq \R^3$.
At a critical point, $f$ has a well-defined Hessian (second derivative).
We can give a nice coordinate-free expression for the Hessian as follows.
It's not obvious that this is bilinear, or even that it is well-defined (a priori, it could depend on the choice of extension $\tilde Y$). But we can show both of these facts with one neat computation. Since $p$ is a critical point, $(df)_p = 0$, so \[X \cdot (\tilde Y \cdot f)(p) - Y \cdot (\tilde X \cdot f)(p) = [\tilde X, \tilde Y] \cdot f (p) = (df)_p([\tilde X, \tilde Y]) = 0\] Therefore, $H(X,Y) = H(Y,X)$, so the Hessian is symmetric. The expression $X \cdot (\tilde Y \cdot f)(p)$ is clearly linear in $X$ and depends only on the value of $X$ at $p$. By symmetry, the same holds for $Y$, so the Hessian is well-defined and bilinear.
For example, consider the height function on the torus $T^2$.
There are 4 critical points: one minimum, two saddle points, and one maximum. These correspond to the CW structure on the torus with one 0-cell, two 1-cells, and one 2-cell.
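Here is a small numerical sanity check of this count. It is only a sketch: I am assuming a torus standing on its edge with radii $R = 2$ and $r = 1$ (my choice, nothing special about it), parameterized by angles $(\theta, \phi)$ so that the height is $h(\theta,\phi) = (R + r\cos\theta)\sin\phi$. At a critical point the Hessian can be computed in any coordinates, so the index is just the number of negative eigenvalues of the matrix of second partial derivatives.

```python
import numpy as np

# Assumed radii (hypothetical values; any R > r > 0 works the same way).
R, r = 2.0, 1.0

def height(theta, phi):
    """Height of the point with angles (theta, phi) on the upright torus."""
    return (R + r * np.cos(theta)) * np.sin(phi)

def hessian(theta, phi):
    """Matrix of second partial derivatives of the height function."""
    h_tt = -r * np.cos(theta) * np.sin(phi)
    h_tp = -r * np.sin(theta) * np.cos(phi)
    h_pp = -(R + r * np.cos(theta)) * np.sin(phi)
    return np.array([[h_tt, h_tp], [h_tp, h_pp]])

# Solving dh = 0 by hand gives cos(phi) = 0 and sin(theta) = 0.
# The list below is ordered from the lowest point to the highest.
critical_points = [
    (0.0, -np.pi / 2),
    (np.pi, -np.pi / 2),
    (np.pi, np.pi / 2),
    (0.0, np.pi / 2),
]

def morse_index(theta, phi):
    """Number of negative eigenvalues of the Hessian at a critical point."""
    return int(np.sum(np.linalg.eigvalsh(hessian(theta, phi)) < 0))

indices = [morse_index(t, p) for t, p in critical_points]
heights = [height(t, p) for t, p in critical_points]
print(indices)  # one entry per critical point, lowest point first
```

The indices come out as 0, 1, 1, 2 in order of increasing height, matching the minimum, the two saddles, and the maximum. Their alternating sum is $1 - 2 + 1 = 0$, the Euler characteristic of the torus.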
The Calculus of Variations
We want to treat the space of piecewise smooth paths on a manifold like an infinite-dimensional manifold. I won't make this idea fully formal (mostly because I don't know the full formalism), but we can use analogies with finite-dimensional manifolds to motivate some useful definitions related to this space of paths. By analogy to the finite-dimensional case, we will define tangent vectors to the space of paths.
The Path Space of a Smooth Manifold
Let $M$ be a smooth manifold and let $p$ and $q$ be two points of $M$. $p$ and $q$ are allowed to be the same point. We will denote the set of all piecewise-smooth paths from $p$ to $q$ in $M$ by $\Omega(M;p,q)$. If $p$ and $q$ are clear from context, we will just write $\Omega$. Later, we will topologize this space, but we don't need to worry about that yet.
Before defining the tangent space of $\Omega$ and critical points, we will review their definitions for finite-dimensional manifolds. Given a finite-dimensional manifold $M$, we can think of tangent vectors as the velocities of curves. Concretely, given a tangent vector $v \in T_pM$, we can always find a curve $c:(-\epsilon, \epsilon) \to M$ such that $c(0) = p$ and $c'(0) = v$. We can push forward tangent vectors along a map $\phi:M \to N$ by defining $d\phi_p (v) = (\phi \circ c)'(0)$. We say that $p$ is a critical point of $\phi$ if $d\phi_p = 0$, which is to say that $(\phi \circ c)'(0) = 0$ for all curves $c$ through $p$.
Now, we give analogous definitions on $\Omega$. To define tangent vectors on $\Omega$, we have to generalize the idea of a curve on a manifold $M$. We do so with the idea of a variation.
- $\bar \alpha(0) = \omega$
- There is a subdivision $0 = t_0 < t_1 < \cdots < t_k = 1$ of $[0,1]$ such that the map \[\alpha : (-\epsilon, \epsilon) \times [0,1] \to M\] defined by $\alpha(u,t) = \bar \alpha(u)(t)$ is smooth on each strip $(-\epsilon, \epsilon) \times [t_{i-1}, t_i]$.
We can think of $\bar \alpha$ as a "smooth path" in $\Omega$. Its "velocity vector" $\dd {\bar \alpha} {u} (0)$ is the vector field $W$ along $\omega$ given by \[W(t) = \left.\ddo u \right|_{u=0} \bar \alpha(u)(t) = \left.\pd{\alpha(u,t)} u\right|_{u=0} \] Inspired by this, we define the tangent space to $\Omega$ at a path $\omega$ as follows.
We note that $\dd{\bar\alpha}{u}(0)$ is such a vector field. And given any such vector field, we can find an associated variation by setting \[\bar \alpha(u)(t) := \exp_{\omega(t)}(u W(t))\] Now that we have a definition of tangent vectors, we can define critical points.
The Energy of a Path
On Riemannian manifolds, we often want to talk about the lengths of paths. However, the length functional is kind of annoying because there's a square root involved. To get around this, we define an energy functional which is similar to length, but better behaved.
Recall that the arc-length of a curve from $a$ to $b$ is given by \[L_a^b(\omega) := \int_a^b \left\|\dd \omega t \right\|\;dt\] Using the Cauchy-Schwarz inequality, we can relate the length and energy of a curve. \[(L_a^b)^2 = \left(\int_a^b \left\|\dd \omega t\right\| \cdot 1\;dt\right)^2 \leq \left(\int_a^b \left\|\dd \omega t \right\|^2\;dt\right)\left(\int_a^b 1^2\;dt\right) = (b-a)E_a^b\] Suppose that $\gamma$ is a minimal geodesic with $\gamma(0) = p$ and $\gamma(1) = q$, and $\omega$ is any other path. Then (using the fact that $\|\dot \gamma\|$ is constant) \[E_0^1(\gamma) = \int_0^1 \|\dot\gamma\|^2\;dt = \|\dot\gamma\|^2 = L_0^1(\gamma)^2 \leq L_0^1(\omega)^2 \leq E_0^1(\omega)\] We can only have $L(\gamma)^2 = L(\omega)^2$ if $\omega$ is a reparameterization of a minimal geodesic from $p$ to $q$. And we can only have $L(\omega)^2 = E(\omega)$ if $\omega$ is parameterized proportional to arclength. Thus, we conclude that $E(\gamma) \leq E(\omega)$ with equality iff $\omega$ is a minimal geodesic. That means that the minima of the energy functional are the minimal geodesics from $p$ to $q$.
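To see the inequality $L^2 \leq E$ in action, here is a quick numerical sketch. I am assuming the flat plane $\R^2$ as the manifold, with $p = (0,0)$ and $q = (1,0)$, and the three test paths are my own hypothetical choices: the constant-speed segment, the same segment traversed at non-constant speed, and a detour along a sine arch. Length and energy are approximated by finite differences.

```python
import numpy as np

def length_and_energy(path, n=20001):
    """Approximate L_0^1 and E_0^1 of a path [0,1] -> R^2 by finite differences."""
    t = np.linspace(0.0, 1.0, n)
    points = path(t)                       # array of shape (n, 2)
    dt = np.diff(t)
    speed = np.linalg.norm(np.diff(points, axis=0), axis=1) / dt
    return float(np.sum(speed * dt)), float(np.sum(speed**2 * dt))

def straight(t):
    """Minimal geodesic from (0,0) to (1,0), constant speed."""
    return np.stack([t, np.zeros_like(t)], axis=1)

def reparametrized(t):
    """Same image as the geodesic, but with speed 2t instead of 1."""
    return np.stack([t**2, np.zeros_like(t)], axis=1)

def detour(t):
    """A longer path with the same endpoints."""
    return np.stack([t, np.sin(np.pi * t)], axis=1)

results = {
    path.__name__: length_and_energy(path)
    for path in (straight, reparametrized, detour)
}
for name, (L, E) in results.items():
    print(f"{name:15s} L^2 = {L**2:.4f}   E = {E:.4f}")
```

The reparametrized segment has the same length as the geodesic but energy $\int_0^1 (2t)^2\,dt = 4/3$, so only the constant-speed segment achieves $L^2 = E$, and it has the smallest energy of the three.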
Now that we understand the minima of the energy functional, we turn to the critical points.
The Hessian of the Energy Functional at a Critical Path
To do Morse Theory, we need to talk about the Hessian of this functional. The Hessian will be a bilinear functional \[E_{**}:T\Omega_\gamma \times T\Omega_\gamma \to \R\] Note that we only define the Hessian at critical points of $E$ (that is, geodesics).
Pick a two-parameter variation $\alpha:U \times [0,1] \to M$ where $U$ is a neighborhood of the origin in $\R^2$, so that \[\alpha(0,0,t) = \gamma(t), \; \pd \alpha {u_1} (0,0,t) = W_1(t),\;\pd \alpha {u_2} (0,0,t) = W_2(t)\] Then \[E_{**}(W_1, W_2) := \left.\frac{\partial^2 E(\bar \alpha(u_1, u_2))}{\partial u_1 \partial u_2}\right|_{(0,0)}\]
It's not obvious from this definition that this is actually well defined (i.e. that it depends only on $W_1$ and $W_2$, and not on the particular variation $\bar \alpha$ that you pick). It turns out that it is well-defined. We can see this using the second variation formula.
I won't prove this here, but it's a pretty straightforward computation given the first variation formula, and some other identities, which can be found in Milnor or Lee.
Jacobi Fields and the Null Space of $E_{**}$
Recall that a Jacobi field $J$ is determined by its initial conditions \[J(0), \frac{DJ}{dt}(0) \in TM_{\gamma(0)}\]
The proof is a fairly straightforward computation using the second variation formula.
The Morse Index Theorem
The proof is pretty involved, so we split it up into steps.
Let $T\Omega_\gamma(t_0, t_1, \ldots, t_k) \subseteq T\Omega_\gamma$ be the subspace of vector fields $W$ along $\gamma$ such that
- $W$ restricted to each $[t_i, t_{i+1}]$ is a Jacobi field
- $W(0) = W(1) = 0$.
Let $W \in T\Omega_\gamma$. Since a Jacobi field along a geodesic contained in a uniformly normal neighborhood is determined by its values at the endpoints, there is a unique broken Jacobi field $W_1 \in T\Omega_\gamma(t_0, \ldots, t_k)$ defined by the property that $W_1(t_i) = W(t_i)$ for each $i$. Then $W - W_1$ vanishes at each $t_i$, so $W - W_1 \in T'$. Clearly $T\Omega_\gamma(t_0, \ldots, t_k) \cap T' = 0$. So we conclude that $T\Omega_\gamma = T\Omega_\gamma(t_0, \ldots, t_k) \oplus T'$.
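Now, we will show that these subspaces are $E_{**}$-orthogonal. Let $W_1 \in T\Omega_\gamma(t_0, \ldots, t_k)$ and $W_2 \in T'$. Applying the second variation formula, we see \[E_{**}(W_1, W_2) = -\sum_t \inrp {W_2(t)}{\Delta_t \frac{DW_1}{dt}} - \int_0^1 \inrp {W_2} 0 \;dt = 0\] The sum vanishes because $\frac{DW_1}{dt}$ can only jump at the $t_i$, where $W_2$ is zero, and the integrand vanishes because $W_1$ satisfies the Jacobi equation on each $[t_i, t_{i+1}]$.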
Finally, we show that $E_{**}$ is positive definite on $T'$. First, $E_{**}(W,W) \geq 0$ for $W \in T'$: since $W$ vanishes at each $t_i$, and $\gamma$ restricted to each $[t_i, t_{i+1}]$ is a minimal geodesic, this follows from the fact that $E_{**}(V,V) \geq 0$ on minimal geodesics.
Now, we will show that $E_{**}(W,W) = 0$ only if $W = 0$. Suppose $W \in T'$ and $E_{**}(W,W) = 0$. We will show that $W$ must lie in the null space of $E_{**}$. We know that $E_{**}(W, W') = 0$ for $W' \in T\Omega_\gamma(t_0, \ldots, t_k)$. Now, suppose $W_2 \in T'$. By bilinearity of $E_{**}$, and using $E_{**}(W,W) = 0$, we see that \[0 \leq E_{**}(W + cW_2, W + cW_2) = 2cE_{**}(W_2, W) + c^2E_{**}(W_2,W_2)\] Since this is true for all $c$, both positive and negative, the term linear in $c$ must vanish, so $E_{**}(W_2, W) = 0$. Therefore, $W$ is in the null space of $E_{**}$, which means that it is a Jacobi field. Since the only Jacobi field in $T'$ is 0, we conclude that $W = 0$. So $E_{**}$ is positive definite on $T'$.
Thus, the index of $E_{**}$ equals the index of $E_{**}$ restricted to $T\Omega_\gamma(t_0, \ldots, t_k)$. This shows our claim that the index is finite, since $T\Omega_\gamma(t_0, \ldots, t_k)$ is finite-dimensional.
Now, we will prove the formula for the index. Let $\gamma_\tau$ be the restriction of $\gamma$ to $[0,\tau]$, and let $\lambda(\tau)$ be the index of the associated Hessian $(E_0^\tau)_{**}$. We are going to show a formula for $\lambda(1)$.
It's not actually symmetric in $\tau$ and $\tau'$. We may conclude that $\lambda(\tau') \geq \lambda(\tau)$, and that for $\tau''$ sufficiently close to $\tau$, we have $\lambda(\tau'') \geq \lambda(\tau')$. But there's no guarantee that $\tau$ is close enough to $\tau'$ to ensure that $\lambda(\tau) \geq \lambda(\tau')$, which we would need to be the case for $\lambda$ to be locally constant.
Next, we will show that $\lambda(\tau + \epsilon) \geq \lambda(\tau) + \nu$. Let $W_1, \ldots, W_{\lambda(\tau)}$ be a basis for the negative-definite subspace of $H_\tau$. Let $J_1, \ldots, J_\nu$ be a basis for the null space of $H_\tau$. Note that the vectors \[\frac{DJ_h}{dt}(\tau)\in TM_{\gamma(\tau)}\] must be linearly independent: each $J_h$ vanishes at $\tau$, and a Jacobi field is determined by its value and covariant derivative at a point, so a linear dependence among the derivatives would give one among the $J_h$. Thus, we can choose $\nu$ vector fields $X_1, \ldots, X_\nu$ along $\gamma_{\tau + \epsilon}$, vanishing at the endpoints of $\gamma_{\tau + \epsilon}$, so that the matrix \[\left(\inrp{\frac{DJ_h}{dt}(\tau)}{X_k(\tau)}\right)\] is equal to the $\nu \times \nu$ identity matrix. (Just pick vectors $X_k(\tau)$ dual to the vectors $\frac{DJ_h}{dt}(\tau)$ and extend them to vector fields along $\gamma_{\tau + \epsilon}$.) Now, extend the vector fields $W_i$ and $J_h$ to $\gamma_{\tau + \epsilon}$ by setting them to 0 for $\tau \leq t \leq \tau + \epsilon$. Using the second variation formula, we see that \[(E_0^{\tau + \epsilon})_{**}(J_h, W_i) = 0\] \[(E_0^{\tau + \epsilon})_{**}(J_h, X_k) = 2\delta_{hk}\]
Now, let $c$ be small and consider the $\lambda(\tau) + \nu$ vector fields \[W_1, \ldots, W_{\lambda(\tau)}, c^{-1}J_1 - cX_1, \ldots, c^{-1}J_\nu - cX_\nu\] along $\gamma_{\tau + \epsilon}$.
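Let $A$ be the matrix of $(E_0^{\tau + \epsilon})_{**}$ on $(W_i, X_k)$ and $B$ be the matrix of $(E_0^{\tau + \epsilon})_{**}$ on $(X_h, X_k)$. Then, the matrix of $(E_0^{\tau + \epsilon})_{**}$ with respect to these vector fields is \[\begin{pmatrix} (E_0^\tau)_{**}(W_i,W_j) & cA \\ cA^t & -4 \mathbb{I} + c^2 B\end{pmatrix}\] This is negative definite for small $c$: the upper-left block is negative definite by the choice of the $W_i$, and as $c \to 0$ the off-diagonal blocks tend to 0 while the lower-right block tends to $-4\mathbb{I}$.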
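If you want to convince yourself of the negative-definiteness claim, here is a toy numerical check. The matrices `N`, `A` and `B` below are made-up stand-ins (with $\lambda(\tau) = 3$ and $\nu = 2$), not anything computed from an actual geodesic. The only property used is that `N`, which plays the role of $(E_0^\tau)_{**}(W_i, W_j)$, is negative definite.

```python
import numpy as np

# Hypothetical stand-ins for the blocks of the Hessian matrix.
N = np.diag([-1.0, -2.0, -3.0])          # negative definite, like (E_0^tau)_**(W_i, W_j)
A = np.ones((3, 2))                      # arbitrary pairing of the W_i with the X_k
B = np.array([[5.0, 1.0], [1.0, 7.0]])   # arbitrary symmetric pairing of the X_k

def block_matrix(c):
    """The block matrix [[N, cA], [cA^t, -4I + c^2 B]] from the proof."""
    lower_right = -4.0 * np.eye(2) + c**2 * B
    return np.block([[N, c * A], [c * A.T, lower_right]])

def is_negative_definite(m):
    """A symmetric matrix is negative definite iff all eigenvalues are negative."""
    return bool(np.all(np.linalg.eigvalsh(m) < 0))

small_c = is_negative_definite(block_matrix(0.1))
large_c = is_negative_definite(block_matrix(10.0))
print(small_c, large_c)
```

For $c = 0.1$ the matrix is negative definite, while for $c = 10$ the term $c^2 B$ swamps the $-4\mathbb{I}$ and the matrix is no longer negative definite. So the smallness of $c$ really is needed.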
This finishes our proof of the Morse Index Theorem.
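As a concrete illustration, consider the unit sphere $S^n$, which is my choice of example and not part of the proof. The index theorem counts the points $\gamma(t)$ with $0 < t < 1$ that are conjugate to $\gamma(0)$, with multiplicity. Along a great-circle geodesic $\gamma:[0,1] \to S^n$ of length $\ell$, the Jacobi fields vanishing at $t = 0$ and normal to $\gamma$ look like $\sin(\ell t)\,P(t)$ for $P$ parallel. So conjugate points occur exactly when $\ell t$ is a multiple of $\pi$, each with multiplicity $n - 1$. The sketch below just does this counting. The function names are hypothetical.

```python
import math

def conjugate_times(length):
    """Parameters t in (0, 1) where gamma(t) is conjugate to gamma(0).

    Assumes gamma: [0, 1] -> S^n is a great-circle geodesic of the given
    length on the unit sphere, so conjugate points sit at t = k*pi/length.
    """
    times = []
    k = 1
    while k * math.pi < length:
        times.append(k * math.pi / length)
        k += 1
    return times

def morse_index_sphere(n, length):
    """Index of E_** at the geodesic: each conjugate point has multiplicity n - 1."""
    return (n - 1) * len(conjugate_times(length))

short_arc = morse_index_sphere(2, 0.5 * math.pi)   # shorter than half a great circle
past_pole = morse_index_sphere(2, 1.5 * math.pi)   # passes the antipodal point once
on_s3 = morse_index_sphere(3, 2.5 * math.pi)       # passes two conjugate points on S^3
print(short_arc, past_pole, on_s3)
```

A geodesic shorter than $\pi$ has index 0, consistent with it being minimal. Each time the geodesic passes the antipode of $p$ or returns to $p$, the index jumps by $n - 1$.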
A Finite-Dimensional Approximation to $\Omega^c$
Finally, we will put a topology on $\Omega$. Let $\rho$ denote $M$'s topological metric which is induced by its Riemannian metric.
Fix a partition $0 = t_0 < t_1 < \cdots < t_k = 1$ of the unit interval, and define $\Omega(t_0, \ldots, t_k)$ to be the set of piecewise geodesics with vertices at these times. Then define $\Omega(t_0, \ldots, t_k)^c := \Omega^c \cap \Omega(t_0, \ldots, t_k)$ and $\Int \Omega(t_0, \ldots, t_k)^c := (\Int \Omega^c) \cap \Omega(t_0, \ldots, t_k)$.
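Let $\{t_i\}$ be a partition fine enough that $t_i - t_{i-1} \leq \epsilon^2/c$. Then for any broken geodesic $\omega \in \Omega(t_0, \ldots, t_k)^c$, we have \[(L_{t_{i-1}}^{t_i} \omega)^2 = (t_i - t_{i-1})(E_{t_{i-1}}^{t_i}\omega) \leq (t_i - t_{i-1})(E \omega) \leq \epsilon^2\] (the equality holds because each piece of $\omega$ is a geodesic, hence has constant speed). So each piece has length at most $\epsilon$, and $\omega$ is determined by its values at its vertices. Since the endpoints $\omega(t_0) = p$ and $\omega(t_k) = q$ are fixed, we can identify $\Omega(t_0, \ldots, t_k)^c$ with a subset of $M^{\times (k-1)}$ via $\omega \mapsto (\omega(t_1), \ldots, \omega(t_{k-1}))$. We can pull back the smooth product structure to get a smooth structure on $\Int \Omega(t_0, \ldots, t_k)^c$.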
For convenience, we will write the manifold of broken geodesics $\Int \Omega(t_0, \ldots, t_k)^c$ as $B$. Let $E':B \to \R$ be the restriction of the energy functional to $B$.
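To get a feel for $E'$, here is a sketch in the simplest possible setting. I am assuming $M$ is the flat plane $\R^2$, so minimal geodesics are straight segments and a broken geodesic is just a polygonal path determined by its vertices. On a piece $[t_{i-1}, t_i]$ the speed is constant, so that piece contributes $d_i^2/(t_i - t_{i-1})$ to the energy, where $d_i$ is the distance between consecutive vertices. The points, partition, and perturbation below are hypothetical.

```python
import numpy as np

def broken_energy(vertices, times):
    """Energy E' of the polygonal path through `vertices` at the given `times`.

    Each straight piece has constant speed, so it contributes
    |x_i - x_{i-1}|^2 / (t_i - t_{i-1}).
    """
    vertices = np.asarray(vertices, dtype=float)
    times = np.asarray(times, dtype=float)
    squared_lengths = np.sum(np.diff(vertices, axis=0) ** 2, axis=1)
    return float(np.sum(squared_lengths / np.diff(times)))

# Hypothetical data: endpoints p, q and a partition with k = 4 pieces.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
p, q = np.array([0.0, 0.0]), np.array([2.0, 0.0])

# The unbroken geodesic from p to q, viewed as a point of B.
straight = [p + t * (q - p) for t in times]

# Move the middle vertex off the segment to get a genuinely broken geodesic.
bent = [v.copy() for v in straight]
bent[2] = bent[2] + np.array([0.0, 0.3])

E_straight = broken_energy(straight, times)
E_bent = broken_energy(bent, times)
print(E_straight, E_bent)
```

The straight path has energy $\|q - p\|^2 = 4$, and moving the interior vertex raises it to about $4.72$. In this flat example $E'$ is a function of the $k - 1 = 3$ interior vertices, with the unbroken geodesic as its minimum.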
We will define an explicit retraction $r:\Int \Omega^c \to B$. We start with $\omega \in \Int \Omega^c$. Let $r(\omega)$ be the broken geodesic in $B$ that agrees with $\omega$ at its vertices. Now, we will show that this is a deformation retraction. Let $r_u:\Int \Omega^c \to \Int \Omega^c$ be defined as follows. For $t_{i-1} \leq u \leq t_i$, let
\[\begin{cases} r_u(\omega)|_{[0, t_{i-1}]} = r(\omega)|_{[0, t_{i-1}]}\\ r_u(\omega)|_{[t_{i-1}, u]} = \text{minimal geodesic from}\;\omega(t_{i-1})\;\text{to}\;\omega(u)\\ r_u(\omega)|_{[u,1]} = \omega|_{[u,1]} \end{cases}\] Clearly $r_0$ is the identity and $r_1 = r$, and $r_u(\omega)$ is continuous in both $u$ and $\omega$. So $B$ is a deformation retract of $\Int \Omega^c$. The critical points of $E$ all lie in $B$, since geodesics are broken geodesics, and the first variation formula tells us that these are still the only critical points of $E'$. And we saw earlier that restricting to broken Jacobi fields does not change the index of $E_{**}$.
The Topology of the Full Path Space
The topology we put on $\Omega$ is kind of weird. A more natural topology for this space is the so-called "compact-open topology", in which a sequence of functions converges whenever it converges uniformly on every compact subset of the domain. An equivalent description of this topology is that it is induced by the metric \[d^*(\omega, \omega') = \max_t \rho(\omega(t), \omega'(t))\]
I will describe the proof here. Let $a_0 < a_1 < \cdots$ be a sequence of real numbers which are not critical values of the energy functional $E$. Pick the numbers so that each interval $(a_i, a_{i+1})$ contains exactly one critical value. Now, consider the sequence \[\Omega^{a_0} \subset \Omega^{a_1} \subset \Omega^{a_2} \subset \cdots\]
We see that each $\Omega^{a_{i+1}}$ is homotopy equivalent to $\Omega^{a_i}$ with a finite number of cells attached, corresponding to the finitely many geodesics in $E^{-1}((a_i, a_{i+1}))$. So we can construct a sequence of CW complexes \[K_0 \subset K_1 \subset K_2 \subset \cdots\] such that for each $i$, we have a homotopy equivalence $\Omega^{a_i} \to K_i$. We can take a direct limit to get a map $f:\Omega \to K$. Clearly $f$ induces isomorphisms of homotopy groups in every dimension. Since $\Omega$ is homotopy equivalent to a CW complex, it follows by Whitehead's theorem that $f$ is a homotopy equivalence.