Representations of Compact Groups (Part 1)
I've written a previous post on representation theory for finite groups. The representation theory of finite groups is very nice, but many of the groups whose representations we care about are not finite. For example, representations of $SU(2)$ are important for understanding the behavior of particles with nonzero spin. So we want to extend representation theory to more general groups. A nice family of groups to consider are compact groups. In many ways, compactness is a generalization of finiteness. To use an example from that link, every real-valued function on a finite set is bounded and attains its maximum. This is untrue for real-valued functions on infinite sets: consider the functions $f(x) = \tan x$ and $f(x) = x$ respectively on the interval $(0, \frac\pi 2)$. However, continuous real-valued functions on a compact interval must be bounded and must attain their maxima. Similarly, compact groups generalize finite groups, and many of the nice features of the representation theory of finite groups extend to the representation theory of compact groups. This post will mostly follow the notes about representations of compact groups available here
Compact Topological Groups
First, we will start with some nice properties of compact topological groups. Recall that a topological group is a group endowed with a topology so that multiplication and the inverse map are continuous. When studying the representation theory of finite groups, it was often convenient to sum over the elements of the group (e.g. to define our inner product on the space of characters). Clearly we cannot always sum over the elements of an infinite group. But for compact groups, we have a nice theorem that tells us that we can integrate over the group instead, which is just as good.
This theorem is tricky to prove for locally compact topological groups. But for Lie groups, it is fairly easy. So we will just show a version of the theorem for Lie groups.
First, we will show existence. Let $n$ denote the dimension of $G$. Recall that as long as $G$ is oriented, an $n$-form on $G$ induces a measure. Furthermore, we recall that if we can find a nonvanishing $n$-form on $G$, then $G$ must be orientable. So it is sufficient to find a left-invariant nonvanishing $n$-form on $G$. Let $\Lambda^nT_eG$ denote the space of $n$-covectors on the tangent space to the identity of $G$. Pick any nonzero $\omega_e \in \Lambda^nT_eG$. Now, we can extend $\omega_e$ to a differential form on $G$. Let $L_g$ denote the diffeomorphism of $G$ given by left-multiplication by $g$. This map is smooth, with smooth inverse $L_{g^{-1}}$. $L_{g^{-1}}$ sends $g$ to $e$, so we can pull $\omega_e$ back along this map to define $\omega_g = (L_{g^{-1}})^* \omega_e \in \Lambda^n T_gG$. This defines a differential $n$-form $\omega \in \Omega^n(G)$ on all of $G$. $\omega$ is left-invariant by construction: \[((L_h)^* \omega)_g = (L_h)^* \omega_{hg} = (L_h)^* (L_{(hg)^{-1}})^* \omega_e = (L_{(hg)^{-1}h})^* \omega_e =(L_{g^{-1}})^* \omega_e = \omega_g\] This differential form is nonvanishing, because $\omega_e$ is nonzero and each $L_{g^{-1}}$ is a diffeomorphism. And by negating $\omega$ if necessary, we see that $\omega$ is positive with respect to $G$'s orientation, so it defines a left-invariant measure on $G$.
Now, you might be wondering why it is important that $G$ is compact, because the above theorems don't require compactness. The nice thing about compactness is this: on a general group, the Haar measure is only guaranteed to let us integrate continuous functions with compact support. But if $G$ is compact, then every function has compact support. So we can integrate any continuous real- (or complex-) valued function on $G$. We will write the integral of $f$ with respect to the Haar measure as $\int_G f(g)\;dg$.
In particular, we can integrate the constant function $f(g) = 1$ over compact groups. It is convenient to normalize our Haar measure so that $\int_G 1 \;dg = 1$. I will assume that all Haar measures are normalized in this way.
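To make the normalized Haar integral concrete, here is a minimal numerical sketch for the circle group $U(1)$. It assumes the standard fact (not derived above) that the normalized Haar measure on $U(1)$ is $d\theta / 2\pi$; the helper name `haar_integral_u1` and the test function `f` are made up for illustration.

```python
import numpy as np

def haar_integral_u1(f, n=2048):
    """Approximate the normalized Haar integral of f over U(1).

    Assumption: Haar measure on U(1) is d(theta) / (2*pi). We approximate
    it by averaging f over n equally spaced points of the circle.
    """
    theta = 2 * np.pi * np.arange(n) / n
    return np.mean(f(np.exp(1j * theta)))

# An arbitrary continuous test function on the circle (illustrative only).
f = lambda z: np.real(z) ** 2 + np.imag(z ** 3)
h = np.exp(1j * 0.7)  # a fixed group element

total = haar_integral_u1(lambda z: np.ones_like(z, dtype=float))  # integral of 1
plain = haar_integral_u1(f)                                       # integral of f(g)
shifted = haar_integral_u1(lambda z: f(h * z))                    # integral of f(hg)
```

Up to rounding, `total` should be 1 (the normalization) and `shifted` should agree with `plain` (left-invariance).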
From now on, I'll assume that all groups are compact Lie groups unless I explicitly state otherwise.
Basic Definitions
We'll start with a whole bunch of definitions. They're essentially the same as the analogous definitions for finite groups, except we require that our maps are continuous. To do so, we have to put topologies on the vector spaces involved.
Useful Constructions
There are several simple subrepresentations we can consider.
- For any representation $V$, $\{0\} \subseteq V$ is a subrepresentation because $g \cdot 0 = 0$.
- Similarly, $V$ is a subrepresentation of itself.
- We also have a subrepresentation $V^G = \{v \in V\;|\; g\cdot v = v\}$, the subspace of $G$-invariants. Note that the action of $G$ on $V^G$ is trivial.
- Given any $G$-linear map $A \in \Hom_G(V,W)$, the kernel is a subrepresentation of $V$ and the image is a subrepresentation of $W$.
Given two representations $(V, \phi)$ and $(W, \psi)$, there are several ways we can build new representations out of them.
- We can define a representation of $G$ on the dual space $V^* = \Hom(V, k)$ (where $k$ is the base field) by setting $g(A)(v) = A(g^{-1}v)$ for $A \in \Hom(V,k)$.
- We can define a representation of $G$ on the conjugate space $\overline V$. We define $\overline V$ as follows: it is the same topological abelian group as $V$, but the scalar multiplication is changed. Let $v$ denote an element of $V$ and $\overline v$ denote the corresponding element of $\overline V$. Then we set $\lambda \overline v = \overline{\overline \lambda v}$. That is to say, we scalar multiply by the conjugate of $\lambda$ instead of by $\lambda$ itself. The action of $G$ on $\overline V$ is the same as the action of $G$ on $V$.
- We can define a representation of $G$ on $V \oplus W$ by setting $g(v,w) = (gv, gw)$.
- We can define a representation of $G$ on $V \otimes W$ by setting $g(v \otimes w) = (gv) \otimes (gw)$.
- We can define a representation of $G$ on $\Hom(V,W)$ by using the isomorphism $\Hom(V,W) \cong W \otimes V^*$ for finite-dimensional $W,V$ and using our constructions for taking tensor products and duals of representations.
Note that $f^j(g^{-1}v)$ is a scalar and $ge_i$ is a vector in $W$. So this is just $f^j(g^{-1}v)(ge_i)$. Since $g$ acts by a linear map, we can factor out the $g$ to obtain \[(g \cdot (e_i \otimes f^j))(v) = g \cdot (f^j(g^{-1}v) e_i) = g \cdot ((e_i \otimes f^j)(g^{-1}v))\] So given any $A \in \Hom(V,W)$, we have $(g \cdot A)(v) = g\cdot A(g^{-1}v)$.
First, suppose that $A \in \Hom(V,W)^G$. Then $g \cdot A = A$, so in particular we have $(g \cdot A)v = Av$ for any $v \in V$. Using the formula for $g \cdot A$, we see that $g \cdot A(g^{-1} v) = Av$ for all $g \in G, v \in V$. Multiplying both sides by $g^{-1}$, we find that $A(g^{-1}v) = g^{-1} Av$. Since this is true for all $g \in G$, we conclude that $A$ is $G$-linear. So $A \in \Hom_G(V,W)$.
Conversely, suppose that $A \in \Hom_G(V,W)$. Then $A(gv) = g(Av)$ for all $g \in G, v \in V$. So $g^{-1}A(gv) = Av$. Letting $h = g^{-1}$, we see that $h A (h^{-1}v) = Av$ for all $h \in G, v \in V$. That is, $h \cdot A = A$ for all $h$, so $A$ is in the subspace of invariants $\Hom(V,W)^G$.
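The equality $\Hom(V,W)^G = \Hom_G(V,W)$ is easy to test numerically. The sketch below uses a hypothetical example that does not appear above: $U(1)$ acting diagonally on $V = \C^2$ with weights $(1,-1)$ and on $W = \C^2$ with weights $(1,2)$. The names `phi`, `psi` and `act` are illustrative.

```python
import numpy as np

def phi(theta):
    """Assumed example: U(1) acting on V = C^2 with weights (1, -1)."""
    return np.diag([np.exp(1j * theta), np.exp(-1j * theta)])

def psi(theta):
    """Assumed example: U(1) acting on W = C^2 with weights (1, 2)."""
    return np.diag([np.exp(1j * theta), np.exp(2j * theta)])

def act(theta, A):
    """The action on Hom(V, W): (g . A)(v) = g . A(g^{-1} v)."""
    return psi(theta) @ A @ np.linalg.inv(phi(theta))

# Maps the weight-1 line of V to the weight-1 line of W and kills the rest.
A_equivariant = np.array([[3.0, 0.0], [0.0, 0.0]], dtype=complex)
# A generic linear map, which mixes weights.
A_generic = np.array([[3.0, 1.0], [2.0, 5.0]], dtype=complex)

thetas = np.linspace(0.0, 2 * np.pi, 50)
equivariant_is_fixed = all(
    np.allclose(act(t, A_equivariant), A_equivariant) for t in thetas)
equivariant_commutes = all(
    np.allclose(psi(t) @ A_equivariant, A_equivariant @ phi(t)) for t in thetas)
generic_is_fixed = all(
    np.allclose(act(t, A_generic), A_generic) for t in thetas)
```

Here the map that intertwines the two actions is also fixed by the action on $\Hom(V,W)$, while the generic map is neither.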
Complete Reducibility and Schur's Lemma
These all have determinant one, and are thus invertible. Furthermore, the product of two upper-triangular matrices is an upper-triangular matrix, so this is a group. This group has a natural action on $\R^2$ given by the usual matrix-vector product. This defines a representation of $G$ on $\R^2$.
Note that this representation fixes the subspace $V \subseteq \R^2$ given by
\[V = \left\{\left. \begin{pmatrix}\lambda\\0\end{pmatrix}\;\right|\;\lambda\in\R\right\}\] But it doesn't fix any other nontrivial subspaces, so $V$ has no invariant complement. So $\R^2$ is neither an irreducible representation nor a completely reducible representation of $G$.
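Here is a small numerical sketch of this failure of complete reducibility. It assumes that $G$ contains the upper-triangular matrix with ones on the diagonal and a one above it, and checks which lines in $\R^2$ that single element preserves; any line invariant under all of $G$ must in particular be invariant under this element. The helper `line_is_invariant` is a made-up name.

```python
import numpy as np

# Assumed group element: upper-triangular with determinant one.
g = np.array([[1.0, 1.0], [0.0, 1.0]])

def line_is_invariant(g, v, tol=1e-12):
    """Return True if g maps the line spanned by v into itself."""
    w = g @ v
    # v and g v are parallel exactly when det [v | g v] vanishes.
    return abs(v[0] * w[1] - v[1] * w[0]) < tol

e1 = np.array([1.0, 0.0])

# Sample lines through the origin other than the one spanned by e1.
angles = np.linspace(0.01, np.pi - 0.01, 500)
other_lines = [np.array([np.cos(t), np.sin(t)]) for t in angles]

e1_invariant = line_is_invariant(g, e1)
any_other_invariant = any(line_is_invariant(g, v) for v in other_lines)
```

Only the line spanned by `e1` passes the check, so there is no invariant line available to serve as a complement.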
It's kind of frustrating that not all representations are completely reducible. One of the nice features of finite groups is that all of their representations are completely reducible. We will show that compact groups are nice in this way as well: all finite-dimensional representations of compact groups are completely reducible.
- Let $V, W$ be irreps of $G$. Let $A \in \Hom_G(V,W)$. Then $A$ is either 0 or an isomorphism.
- Let $V$ be a complex irrep of $G$. Then $\End_G(V) = \C \cdot \Id_V$ (i.e. any $G$-linear endomorphism of $V$ is a scalar multiple of the identity)
- Since $A$ is $G$-linear, we know that $\ker A, \im A$ are subrepresentations. Since $V,W$ are irreps, this implies that $\ker A$ is either $0$ or all of $V$, and $\im A$ is either $0$ or all of $W$. Thus, the only way for $A$ to be nonzero is if $\ker A = 0$ and $\im A = W$. This means that if $A$ is nonzero, it must be an isomorphism.
- Let $A \in \End_G(V)$. Since $A$ is a complex matrix, it has an eigenvalue $\lambda$. Clearly $\lambda \Id$ is a $G$-linear endomorphism of $V$. Thus, $A - \lambda \Id \in \End_G(V)$. But $A-\lambda \Id$ has a nontrivial kernel (it contains the $\lambda$-eigenvectors of $A$), so it cannot be an isomorphism. By the first part, it must be $0$. Thus, $A = \lambda \Id$.
First, we will construct one such projection. Explicitly, we define \[Av_G(v) := \int_G g \cdot v \;dg\] This operation averages over the group action, which is why we named the projection $Av_G$. To show that $Av_G$ is a projection, we have to show that it restricts to the identity on its image. First, we claim that the image of $Av_G$ is exactly $V^G$. To begin, we show that $\im Av_G$ is contained in $V^G$. Let $v$ be any vector in $V$. Then for any $h \in G$, we have \[h \cdot Av_G(v) = h \cdot \int_G g \cdot v \;dg = \int_G (hg) \cdot v\;dg = \int_G (hg)\cdot v \;d(hg)\] The last equality follows from the left-invariance of our measure, and the final integral is just $Av_G(v)$. So we see that $h \cdot Av_G(v) = Av_G(v)$, which implies that $\im Av_G \subseteq V^G$.
Furthermore, any vector of $V^G$ is itself fixed by $Av_G$. If $v \in V^G$, then $Av_G(v) = \int_G g \cdot v\;dg = \int_G v\;dg = v$. So in particular, $v \in \im Av_G$. Thus, we see that $\im Av_G = V^G$, and $Av_G$ acts as the identity on its image. So it is a projection.
Now, we will show uniqueness. First, note that $Av_G$ commutes with any other $T \in \End_G(V)$. \[Av_G \circ T(v) = \int_G g \cdot (Tv)\;dg = \int_G T (g\cdot v) \;dg = T \int_G g \cdot v\;dg = T \circ Av_G (v)\] Suppose that $P$ is another projection onto $V^G$. In particular, it is an element of $\End_G(V)$, so it commutes with $Av_G$. Thus, \[P = Av_G \circ P = P \circ Av_G = Av_G\]
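The averaging projection can be computed explicitly in small examples. The sketch below assumes a hypothetical representation that does not appear above: $U(1)$ acting on $\C^3$ with weights $(0, 1, -2)$, written in a non-diagonal basis through an arbitrary invertible matrix `S`. The Haar integral is replaced by an average over equally spaced angles, which is exact (up to rounding) for this example.

```python
import numpy as np

# Assumed example: weights (0, 1, -2), conjugated by an invertible matrix S
# so that the invariant subspace is not simply a coordinate axis.
S = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])
S_inv = np.linalg.inv(S)
weights = np.array([0, 1, -2])

def rho(theta):
    """The representation of e^{i theta} on C^3."""
    return S @ np.diag(np.exp(1j * weights * theta)) @ S_inv

# Av_G = integral of rho(g) dg, approximated by a uniform average on the circle.
n = 64
Av = sum(rho(2 * np.pi * j / n) for j in range(n)) / n

is_projection = np.allclose(Av @ Av, Av)          # Av_G is idempotent
image_is_fixed = np.allclose(rho(0.3) @ Av, Av)   # its image lies in V^G
dim_invariants = np.trace(Av).real                # trace of a projection = rank
```

In this example the invariant subspace is one-dimensional (the weight-0 line), so `dim_invariants` should come out as 1.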
Let $\inrp \cdot \cdot$ be any hermitian inner product on $V$. We can view $\inrp \cdot \cdot$ as an element of $\Hom(\overline V \otimes V, \C) \cong \Hom(\overline V, V^*)$. We can think of the $G$-invariant inner products as elements of $\Hom(\overline V, V^*)^G$. So $Av_G \inrp \cdot \cdot$ gives us a $G$-invariant inner product.
Explicitly, this just means that we can define a $G$-invariant inner product $\inrp \cdot \cdot _G$ by the formula \[\inrp v w_G := \int_G \inrp {gv}{gw}\;dg\]
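As a sanity check on the averaging formula, here is a minimal sketch with a representation of the circle on $\R^2$ that is not orthogonal for the standard dot product. The representation `rho` is an assumed example: a rotation conjugated by $\mathrm{diag}(2, 1)$. The integral is approximated by a uniform average over angles.

```python
import numpy as np

def rho(theta):
    """Assumed example: S R(theta) S^{-1} with S = diag(2, 1), R a rotation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -2.0 * s], [0.5 * s, c]])

# Gram matrix of the averaged inner product: M = integral of rho(g)^T rho(g) dg.
n = 64
M = sum(rho(2 * np.pi * j / n).T @ rho(2 * np.pi * j / n) for j in range(n)) / n

def inner_G(v, w):
    """The averaged inner product <v, w>_G = integral of <gv, gw> dg."""
    return v @ M @ w

alpha = 0.9
v, w = np.array([1.0, 2.0]), np.array([-3.0, 0.5])

# The standard dot product is not preserved by rho(alpha)...
standard_changes = not np.isclose((rho(alpha) @ v) @ (rho(alpha) @ w), v @ w)
# ...but the averaged inner product is.
averaged_invariant = np.isclose(
    inner_G(rho(alpha) @ v, rho(alpha) @ w), inner_G(v, w))
```

For this example the averaged Gram matrix works out to $\mathrm{diag}(0.625, 2.5)$, a multiple of the form that the conjugating matrix pulls the dot product back to.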
Let $\inrp \cdot \cdot$ be a $G$-invariant inner product on $V$. Let $U = W^\perp$.
We note that $U$ is a subrepresentation of $V$. Let $u \in U$. By definition, $\inrp u w = 0$ for all $w \in W$. Since the inner product is $G$-invariant, $\inrp {gu} {w} = \inrp u {g^{-1}w}$. Since $W$ is a subrepresentation, $g^{-1}w \in W$, so $\inrp u {g^{-1}w} = 0$. Thus, $\inrp {gu} w = 0$ for all $w \in W$, so we conclude that $gu \in U$.
Therefore, $V = W \oplus U$.
Just apply Maschke's theorem repeatedly. Since our vector space is finite-dimensional, this process must terminate.
This follows from the above corollary and Schur's lemma.
Characters
Let $A:V \to W$ be a ($G$-linear) isomorphism. Then $\psi(g)Av = A \phi(g) v$. So $\phi(g) = A^{-1}\psi(g) A$. By the cyclic property of the trace, $\tr(A^{-1} \psi(g) A) = \tr(\psi(g))$. Thus, \[\chi_V(g) = \tr(\phi(g)) = \tr(A^{-1}\psi(g)A) = \tr(\psi(g)) = \chi_W(g)\]
- $\chi_{V^*} = \chi_V^*$ where $\chi_V^*(g) = \chi_V(g^{-1})$
- $\chi_{\overline V} = \overline{\chi_V}$ where $\overline{\chi_V}(g) = \overline{\chi_V(g)}$
- $\chi_{V \oplus W} = \chi_V + \chi_W$
- $\chi_{V \otimes W} = \chi_V \cdot \chi_W$
- $\chi_{\Hom(V,W)} = \chi_V^* \cdot \chi_W$
- $\chi_{V^G} = av(\chi_V)$ where $av(\chi_V) = \int_G \chi_V(g)\;dg$ considered as a constant function
- Since $g$ acts on $V^*$ by $\phi(g^{-1})^T$, and transposing does not change the trace, we see that $\chi_{V^*}(g) = \chi_V(g^{-1})$.
- Since scalar multiplication on $\overline V$ is conjugated, we have to take the complex conjugate of the entries in the matrix $\phi(g)$ to get the matrix which acts on $\overline V$. Thus, $\chi_{\overline V} = \overline{\chi_V}$.
- $\chi_{V \oplus W}(g) = \tr (\phi(g) \oplus \psi(g)) = \tr(\phi(g)) + \tr(\psi(g)) = \chi_V(g) + \chi_W(g)$.
- $\chi_{V \otimes W}(g) = \tr (\phi(g) \otimes \psi(g)) = \tr(\phi(g)) \cdot \tr(\psi(g)) = \chi_V(g)\chi_W(g)$.
- $\chi_{\Hom(V,W)} = \chi_{W \otimes V^*} = \chi_V^* \cdot \chi_W$.
- This one is more complicated. We need to compute $\chi_{V^G}$. To do so, we use a trick involving the averaging projection.
Note that the averaging projection $Av_G:V \to V^G$ acts as the identity on $V^G$ and acts as $0$ on the orthogonal complement to $V^G$. Thus, $\phi(g) \circ Av_G$ acts as $\phi(g)$ on $V^G$ and acts as $0$ on the orthogonal complement to $V^G$. So $\tr_{V^G} \phi(g) = \tr_V (\phi(g) \circ Av_G)$. (Here $\tr_{V^G}$ denotes the trace over $V^G$ and $\tr_V$ denotes the trace over $V$)
Therefore, \[\chi_{V^G}(g) = \tr_{V^G} \phi(g) = \tr_V (\phi(g) \circ Av_G) = \tr_V \left(\phi(g)\int_G \phi(h)dh\right) = \tr_V\int_G \phi(gh)dh\] Since our measure is left-invariant, this is just \[\chi_{V^G}(g) = \tr_V \int_G \phi(h)dh = \int_G \tr_V \phi(h)dh = \int_G \chi_V(h)dh = av(\chi_V)\]
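These character identities can be spot-checked numerically. The sketch below uses assumed examples that do not appear above: $U(1)$ acting diagonally on $V$ with weights $(0, 1, -2)$ and on $W$ with weights $(0, 2)$. The helpers `rep` and `haar_mean` are illustrative names, and the Haar integral is approximated by a uniform average over angles.

```python
import numpy as np

def rep(weights):
    """theta -> diagonal matrix of the U(1) representation with these weights."""
    w = np.asarray(weights)
    return lambda theta: np.diag(np.exp(1j * w * theta))

def haar_mean(f, n=256):
    """Normalized Haar integral on U(1), as an average over n equally spaced angles."""
    return np.mean([f(2 * np.pi * j / n) for j in range(n)])

phi, psi = rep([0, 1, -2]), rep([0, 2])   # assumed examples V and W
chi_V = lambda t: np.trace(phi(t))
chi_W = lambda t: np.trace(psi(t))

t = 0.4
# Tensor products multiply characters (np.kron is the matrix of phi ⊗ psi).
tensor_ok = np.isclose(np.trace(np.kron(phi(t), psi(t))), chi_V(t) * chi_W(t))
# The dual acts by the inverse transpose, so its character is chi_V(g^{-1}).
dual_ok = np.isclose(np.trace(np.linalg.inv(phi(t)).T), chi_V(-t))
# Averaging the character gives the dimension of the invariants.
dim_invariants = haar_mean(chi_V).real
# Averaging chi_V^* chi_W gives dim Hom(V, W)^G = dim Hom_G(V, W).
dim_hom_G = haar_mean(lambda s: chi_V(-s) * chi_W(s)).real
```

In this example both dimensions should be 1: the only invariant vectors in $V$ span the weight-0 line, and the only weight shared by $V$ and $W$ is 0.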
The previous propositions tell us that characters of $G$ are a decategorification of the category of finite-dimensional representations of $G$. Decategorification is the process of taking a category, identifying isomorphic objects and forgetting all other morphisms. This eliminates a lot of useful information, but often makes the category easier to work with. For example, if we decategorify the category of finite sets, we identify all sets with the same cardinality, and forget about all other functions. This just leaves us with the natural numbers, because sets are classified by their cardinality.
Frequently, there are nice structures in the category that still make sense after decategorification. For example, decategorifying disjoint unions of finite sets gives us addition of natural numbers, and decategorifying cartesian products gives us multiplication of natural numbers.
Above, we saw that characters are a decategorification of finite-dimensional $G$-representations. Two characters are equal if the corresponding representations are isomorphic, and the direct sums, tensor products, etc. of $G$-representations translate nicely into operations on characters. One of the most interesting aspects of this decategorification is that $\Hom$s turn into inner products.
Recall that a pair of adjoint functors are functors $F:\mathcal{C} \to \mathcal{D}, G:\mathcal{D} \to \mathcal{C}$ such that $\Hom_D(F(X), Y) \cong \Hom_C(X, G(Y))$ for all $X \in \Ob(C), Y \in \Ob(D)$. Adjoint functors are so named in analogy with adjoint linear operators (Recall that two operators $T,U$ on a Hilbert space are adjoint if $\inrp {Tx} y = \inrp x {Uy}$ for all vectors $x,y$.) This connection between inner products and Hom sets can be formalized to give a categorification of Hilbert spaces.
For an introduction to (de)categorification, you can look here (for a simpler introduction) or here (for a more complicated introduction).
Application: Irreducible Representations of $SU(2)$
Before proceeding, let's use some of this machinery we have built up so far to find all irreducible representations of $SU(2)$.
It will be helpful to use another characterization of $SU(2)$ as well.
Recall that the quaternions are defined by \[\H = \{a + jb\;|\; a,b \in \C\}\] where $jb = \overline b j$. Then $S^3 = \{a + jb \in \H\;|\; |a|^2 + |b|^2 = 1\}$. We have a natural action of $S^3$ on $\H$ by left-multiplication. This gives us a two-dimensional complex representation of $S^3$. Writing it out explicitly, we see that \[\begin{aligned} (a+jb) : 1 &\mapsto a+jb\\ (a+jb) : j &\mapsto - \overline b + j \overline a\\ \end{aligned}\] Thus, our representation is given by \[ (a+jb) \mapsto \begin{pmatrix} a & -\overline b\\b & \overline a \end{pmatrix} \] This matrix is unitary, and has determinant $|a|^2 + |b|^2 = 1$. So this is clearly a continuous bijection from $S^3$ to $SU(2)$. You can check that this bijection is a group isomorphism.
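If you would rather not check the isomorphism by hand, here is a minimal numerical sketch. A quaternion $a + jb$ is stored as a pair of complex numbers `(a, b)`; the helper names `to_su2` and `qmul` and the two sample unit quaternions are made up for illustration.

```python
import numpy as np

def to_su2(a, b):
    """Matrix of left multiplication by a + jb in the complex basis (1, j)."""
    return np.array([[a, -np.conj(b)], [b, np.conj(a)]])

def qmul(p, q):
    """Product (a1 + j b1)(a2 + j b2), simplified with the rule z j = j conj(z)."""
    (a1, b1), (a2, b2) = p, q
    return (a1 * a2 - np.conj(b1) * b2, np.conj(a1) * b2 + b1 * a2)

# Two sample unit quaternions (|a|^2 + |b|^2 = 1 in both cases).
p = (0.6 + 0j, 0.8j)
q = ((1 + 1j) / 2, (1 - 1j) / 2)

P, Q = to_su2(*p), to_su2(*q)
is_unitary = np.allclose(P.conj().T @ P, np.eye(2))
has_det_one = np.isclose(np.linalg.det(P), 1.0)
# Multiplying quaternions and then mapping agrees with mapping and then multiplying.
is_homomorphism = np.allclose(to_su2(*qmul(p, q)), P @ Q)
```

This only tests two sample points, so it is a sanity check rather than a proof.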
Since $SU(2)$ acts on $\C^2$, we also get an action of $SU(2)$ on $\C[z_1, z_2]$, the space of complex polynomials in 2 variables. Given $A \in SU(2), p \in \C[z_1, z_2]$, we define \[(A \cdot p)\left(\vvec{z_1}{z_2}\right) = p \left(A^{-1} \vvec {z_1}{z_2}\right)\] We note that this action does not change the degree of monomials. Thus, the space of homogeneous polynomials of degree $k$ is invariant under this action. So it is a subrepresentation. Let $V_k \subseteq \C[z_1, z_2]$ denote the space of homogeneous polynomials of degree $k$. We will show that $\{V_k\}$ are nonisomorphic irreducible representations, and every irreducible representation of $SU(2)$ is isomorphic to some $V_k$. First, we'll start with a lemma about the structure of $S^3$.
- Every element of $S^3 \subseteq \H$ can be written $ge^{i\theta}g^{-1}$
- For fixed $\theta$, $\{ge^{i\theta}g^{-1}\}$ is a 2D sphere with radius $\sin \theta$ intersecting $\C$ at $e^{i\theta}, e^{-i\theta}$
- Using our identification of $S^3$ with $SU(2)$, we can think of points on the sphere as special unitary matrices. Unitary matrices are unitarily-diagonalizable. Clearly we can rescale these matrices so that the matrices we diagonalize with are in $SU(2)$. Finally, note that a diagonal matrix in $SU(2)$ must have the form \[\begin{pmatrix} a & 0 \\ 0 & \overline a\end{pmatrix}\] for $a \in \C$, $|a|^2 = 1$. Thus, diagonal matrices in $SU(2)$ correspond to points $e^{i\theta}$ on the sphere.
- Since the quaternion norm is multiplicative, and elements of $S^3$ have norm 1, we see that $|g e^{i\theta}g^{-1}| = |g||e^{i\theta}||g|^{-1} = 1$. Furthermore, \[\begin{aligned} ge^{i\theta}g^{-1} &= g (\cos \theta + i \sin \theta)g^{-1}\\ &= \cos \theta + \sin \theta \; g i g^{-1} \end{aligned}\] Now, we will consider the map $\pi:g \mapsto g i g^{-1}$. Note that for unit quaternions, $g^{-1} = \overline g$, the conjugate of $g$. So we can also write this map $\pi:g \mapsto g i \overline g$. Note that \[\overline{g i \overline g} = g \overline i \overline g = -g i \overline g\] So $gi\overline g$ is purely imaginary. Furthermore, since $g, i$ and $\overline g$ are all unit quaternions, so is their product. Thus, we can think of $\pi$ as a map $\pi : S^3 \to S^2$, where we view $S^3$ as the unit quaternions and $S^2$ as the unit imaginary quaternions.
Furthermore, $\pi$ is surjective. If we represent vectors in $\R^3$ as imaginary quaternions, then $v \mapsto g v \overline g$ is a representation of $SU(2)$ on $\R^3$ which acts by rotations. Since we can write all rotations in this form, and the rotation group $SO(3)$ acts transitively on the two-sphere, we see that $\{gi \bar g\}_{g \in SU(2)}$ covers all of $S^2$. So $\pi$ is surjective.
Since $g e^{i\theta} g^{-1} = \cos \theta + \sin \theta \pi(g)$, we see that $\{g e^{i\theta} g^{-1}\}$ is a sphere with radius $\sin \theta$. Now, we just have to check that the intersection of this sphere with $\C$ is $e^{\pm i \theta}$. Note that $\im \pi$ is the imaginary unit quaternions, and the only imaginary unit quaternions that lie in $\C$ are $\pm i$. Thus, the intersection of $\{g e^{i\theta} g^{-1}\}$ with $\C$ is $\cos \theta \pm i \sin \theta$.
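The description of the conjugacy classes can be checked numerically in the $SU(2)$ picture. In the sketch below, a quaternion $a + jb$ corresponds to the matrix with first column $(a, b)$, so its real part is $\mathrm{Re}\, a$ and the norm of its imaginary part is $\sqrt{(\mathrm{Im}\, a)^2 + |b|^2}$. The helper `to_su2`, the seed and the sample angle are arbitrary choices for illustration.

```python
import numpy as np

def to_su2(a, b):
    """SU(2) matrix corresponding to the unit quaternion a + jb."""
    return np.array([[a, -np.conj(b)], [b, np.conj(a)]])

rng = np.random.default_rng(0)
theta = 1.1
d = to_su2(np.exp(1j * theta), 0.0)   # the diagonal element e^{i theta}

real_parts, imag_norms = [], []
for _ in range(100):
    # A random unit quaternion g, obtained by normalizing a Gaussian vector in R^4.
    x = rng.normal(size=4)
    x /= np.linalg.norm(x)
    g = to_su2(x[0] + 1j * x[1], x[2] + 1j * x[3])
    m = g @ d @ g.conj().T            # g^{-1} = g^dagger for unitary g
    a, b = m[0, 0], m[1, 0]
    real_parts.append(a.real)
    imag_norms.append(np.sqrt(a.imag ** 2 + abs(b) ** 2))
```

Every conjugate should have real part $\cos\theta$ and imaginary part of norm $\sin\theta$, which is what it means to lie on a 2-sphere of radius $\sin\theta$.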
To prove this, we will use characters. For convenience, let us write $\chi_{V_k}$ as $\chi_k$. Recall that the image of $e^{i\theta} \in S^3$ in $SU(2)$ is the matrix \[\begin{pmatrix} e^{i\theta}& 0 \\ 0 & e^{-i\theta}\end{pmatrix}\] Since this matrix acts on polynomials through its inverse, it sends $z_1^\ell z_2^{k-\ell}$ to $e^{-i\ell\theta}e^{i(k-\ell)\theta} z_1^\ell z_2^{k-\ell}$. So the eigenspaces of this operator on $V_k$ are $\{\C z_1^\ell z_2^{k-\ell}\}_\ell$ with eigenvalues $\{e^{(k - 2\ell)i \theta}\}_\ell$. Therefore, \[\begin{aligned} \chi_k(e^{i\theta}) &= \sum_{\ell=0}^k e^{(k - 2\ell) i \theta}\\ &= \frac{e^{(k+1)i\theta} - e^{-(k+1)i\theta}}{e^{i\theta} - e^{-i\theta}}\\ &= \frac{\sin[(k+1)\theta]}{\sin\theta} \end{aligned}\] Note that all of these characters are different. This shows that the representations $V_k$ are pairwise non-isomorphic. Now, we will show that the characters are orthonormal.
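The closed form for $\chi_k$ is just a geometric series, and it is easy to spot-check numerically. In this sketch, `chi_sum` adds up the eigenvalues on the monomial basis and `chi_closed` evaluates the sine ratio; both function names and the sample angles are arbitrary.

```python
import numpy as np

def chi_sum(k, theta):
    """Trace of diag(e^{i theta}, e^{-i theta}) on V_k: sum of the monomial eigenvalues."""
    ell = np.arange(k + 1)
    return np.sum(np.exp(1j * (k - 2 * ell) * theta))

def chi_closed(k, theta):
    """The closed form sin((k + 1) theta) / sin(theta)."""
    return np.sin((k + 1) * theta) / np.sin(theta)

# Compare the two expressions at a few angles where sin(theta) is nonzero.
checks = [np.isclose(chi_sum(k, t), chi_closed(k, t))
          for k in range(8) for t in (0.3, 1.0, 2.5)]
all_match = all(checks)
```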
Recall that the inner product on characters is given by \[\inrp{\chi_k}{\chi_\ell} = \int_{S^3} \chi_k(g) \overline{\chi_\ell(g)}\;dg\] Since the volume of $S^3$ is $2\pi^2$, we can write $dg = \frac 1 {2\pi^2}d\sigma$ where $d\sigma$ is the standard volume element on $S^3$. So we want to compute \[\inrp{\chi_k}{\chi_\ell} = \frac 1 {2\pi^2} \int_{S^3} \chi_k(g) \overline{\chi_\ell(g)}\;d\sigma\] Recall that characters are constant on conjugacy classes. Since every element of $SU(2)$ other than $\pm 1$ is conjugate to exactly two unit complex numbers $e^{\pm i\theta}$, exactly one of which has $\theta \in (0, \pi)$, we have \[\inrp{\chi_k}{\chi_\ell} = \frac 1 {2\pi^2} \int_0^\pi \chi_k(e^{i\theta}) \overline{\chi_\ell(e^{i\theta})}\vol(\text{orbit})\;d\theta\] Above, we showed that these orbits are spheres with radius $\sin \theta$. Therefore, the volume (that is, the surface area) of an orbit is $4 \pi \sin^2 \theta$. Substituting this and our expressions for the characters, we see that our inner product is \[\inrp{\chi_k}{\chi_\ell} = \frac 2 {\pi} \int_0^\pi \sin[(k+1)\theta]\sin[(\ell+1)\theta]\;d\theta\] Because sines with different frequencies are orthogonal, we conclude that \[\inrp{\chi_k}{\chi_\ell} = \delta_{k\ell}\] So our characters are orthonormal.
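Here is a minimal numerical sketch of this orthonormality computation. It integrates $\chi_k \chi_\ell$ over $\theta \in [0, \pi]$ against the weight $\frac{2}{\pi}\sin^2\theta$ (the orbit area $4\pi\sin^2\theta$ divided by the total volume $2\pi^2$), using a midpoint rule so that the endpoints, where $\sin\theta = 0$, are never evaluated. The grid size and the cutoff `K` are arbitrary choices.

```python
import numpy as np

n = 4000
theta = (np.arange(n) + 0.5) * np.pi / n     # midpoints of n subintervals of [0, pi]
weight = (2 / np.pi) * np.sin(theta) ** 2    # (1 / (2 pi^2)) * 4 pi sin^2(theta)

def chi(k):
    """Character of V_k sampled on the grid: sin((k + 1) theta) / sin(theta)."""
    return np.sin((k + 1) * theta) / np.sin(theta)

# Gram matrix of the first K characters under the class-function inner product.
K = 6
gram = np.array([[np.sum(chi(k) * chi(l) * weight) * np.pi / n
                  for l in range(K)] for k in range(K)])
```

Up to rounding error, `gram` should be the $6 \times 6$ identity matrix.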
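Finally, we will show that these are all of the irreducible representations. Suppose that $W$ were another irreducible representation, not isomorphic to any $V_k$. By orthogonality of the characters of non-isomorphic irreducible representations, for every $k$ we have \[0 = \inrp{\chi_W}{\chi_k} = \int_G \chi_W(g) \overline{\chi_k(g)}\;dg\] Using the same computational tricks, we see that \[0 = \frac 2 \pi \int_0^\pi \chi_W(e^{i\theta}) \sin[(k+1)\theta]\sin\theta\;d\theta\] for every $k \geq 0$. Since the functions $\sin[(k+1)\theta]$ form an orthogonal basis for the square-integrable functions on $[0, \pi]$, we see that $\chi_W(e^{i\theta})\sin\theta = 0$, and hence $\chi_W(e^{i\theta}) = 0$ for all $\theta \in (0, \pi)$. This is impossible, because $\chi_W$ is continuous and $\chi_W(1) = \dim W \neq 0$. Thus, every irreducible representation must be isomorphic to some $V_k$.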
Characters made it fairly easy to classify all of the irreducible representations of $SU(2)$. Later on, we will generalize some of the computational techniques we used here to find the Weyl character formula and Weyl Integration Formula, which will be very useful for understanding representations. But that will have to be another post, since this one is already much longer than I realized it would be.