Tensor Products
In quantum mechanics, we represent a particle as a vector in a 'state space' $V$. If we have two particles, we represent the pair as a vector in a product vector space $V_1 \otimes V_2$. This product space is called the 'tensor product'. But what is this tensor product? And why does it represent pairs of particles?
Warm Up: The Direct Product
Before we talk about the tensor product of vector spaces, we'll go over a more intuitive way of taking the product of vector spaces. Given vector spaces $V$ and $W$, the direct product, $V \times W$, is defined as the set of all pairs of vectors $(v, w)$ for $v \in V$ and $w \in W$. We add together vectors in this new space component by component. \[ (v_1, w_1) + (v_2, w_2) = (v_1 + v_2, w_1 + w_2) \] And we scale up vectors by scaling up both components \[ \lambda(v, w) = (\lambda v, \lambda w)\]
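As an illustration (not part of the formal definition), these rules can be sketched in a few lines of Python, representing an element of $V \times W$ as a pair of coordinate lists:

```python
# A sketch of the direct product: elements are pairs (v, w), with
# componentwise addition and scaling of *both* components.

def dp_add(p, q):
    """(v1, w1) + (v2, w2) = (v1 + v2, w1 + w2)"""
    (v1, w1), (v2, w2) = p, q
    return ([a + b for a, b in zip(v1, v2)],
            [a + b for a, b in zip(w1, w2)])

def dp_scale(lam, p):
    """lambda * (v, w) = (lambda * v, lambda * w)"""
    v, w = p
    return ([lam * a for a in v], [lam * a for a in w])

p = ([1, 2], [3, 4, 5])
q = ([10, 20], [30, 40, 50])
print(dp_add(p, q))    # ([11, 22], [33, 44, 55])
print(dp_scale(2, p))  # ([2, 4], [6, 8, 10])
```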
Tensor Products of Vector Spaces
To get the tensor product $V \otimes W$, we can modify the direct product. We still want to look at pairs $(v, w)$. But we'll change the definitions of multiplication and addition a bit. In our new definition of scalar multiplication, multiplying our vector by a scalar only scales one of the components. \[ \lambda(v, w) = (\lambda v, w) = (v, \lambda w)\] For addition, a sum now simplifies only when one of the components matches. \[ (v_1, w) + (v_2, w) = (v_1 + v_2, w)\] The sum only works because the second component in each term is $w$. The analogous rule holds when the first components match instead. All other sums are left as they are: $(v_1, w_1) + (v_2, w_2)$ is just a formal sum of the two terms. It cannot be simplified.
Finally, instead of writing $(v, w)$, we instead write $v \otimes w$. This way, it looks different from the elements of $V \times W$. We call this new space the tensor product of $V$ and $W$.
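To make this concrete, one standard model of $V \otimes W$ for $V = \mathbb{R}^m$ and $W = \mathbb{R}^n$ takes $v \otimes w$ to be the $m \times n$ table of products $v_i w_j$. The short Python sketch below (an illustration, not the abstract definition) checks that this model obeys the rules above:

```python
def outer(v, w):
    """Model v ⊗ w as the table of products v_i * w_j."""
    return [[a * b for b in w] for a in v]

def scale(lam, m):
    """Scale every entry of a table."""
    return [[lam * x for x in row] for row in m]

v, w, lam = [1, 2], [3, 4, 5], 7

# Scaling either factor scales the tensor: (λv) ⊗ w = v ⊗ (λw) = λ(v ⊗ w)
assert outer([lam * a for a in v], w) \
    == outer(v, [lam * b for b in w]) \
    == scale(lam, outer(v, w))

# Matching second factor: v1 ⊗ w + v2 ⊗ w = (v1 + v2) ⊗ w
v2 = [10, 20]
entrywise_sum = [[x + y for x, y in zip(r1, r2)]
                 for r1, r2 in zip(outer(v, w), outer(v2, w))]
assert entrywise_sum == outer([a + b for a, b in zip(v, v2)], w)
```

A generic sum $v_1 \otimes w_1 + v_2 \otimes w_2$ corresponds to an entrywise sum of tables, which in general is not itself a table of products — matching the fact that such sums cannot be simplified.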
Simple Example
Let's look at the tensor product of $\mathbb{R}$ with $\mathbb{R}$. Elements of $\mathbb{R} \otimes \mathbb{R}$ are sums of pairs, like $1 \otimes 2 + 3 \otimes 4$. Using the rules we defined earlier, we can do things like
\[\begin{aligned} 2 \otimes 3 + 4 \otimes 6 &= 2 \otimes 3 + 4 \otimes 2\cdot 3\\ &= 2 \otimes 3 + 8 \otimes 3\\ &= 10 \otimes 3 \end{aligned}\]
But Why?
The definition of a tensor product looks fairly arbitrary. But it winds up having some nice properties that turn out to be interesting, and surprisingly natural, to study. In order to describe these properties and why they're useful, we will have to make an expedition into the wonderful land of algebra.
Over the last 100 years, mathematicians have realized that when studying mathematical objects, it is incredibly useful to study functions between these objects. When you're looking at vector spaces, the natural functions to study are linear functions. A linear function is a function that commutes with addition and scalar multiplication. That is to say, it is a function $f:V \to W$ such that $f(v_1 + v_2) = f(v_1) + f(v_2)$ and $f(\lambda v) = \lambda f(v)$.
A lot of functions that take in multiple arguments also have a similar property. For example, let's look at multiplication of real numbers. Mathematically, we can write this as a function that takes in two numbers and spits one number back out. i.e., $m : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$. If we fix the first number and vary the second number, this is a linear function!
\[\begin{aligned} m(a, b_1 + b_2) &= a(b_1 + b_2)\\ &= ab_1 + ab_2\\ &= m(a, b_1) + m(a, b_2)\\ m(a, \lambda b) &= a\cdot \lambda b\\ &= \lambda(a b)\\ &= \lambda \cdot m(a, b) \end{aligned}\]But it actually has a stronger property. If we fix the second argument and vary the first argument, we also get a linear function out. So this function $m$ is a linear function in either argument. We call such a function bilinear (or multilinear if it takes more than 2 arguments). Multilinear functions pop up all over the place. The cross product, dot product and determinant are all multilinear.
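We can spot-check this bilinearity numerically. The sketch below (plain Python, random inputs) verifies that multiplication is linear in each argument separately:

```python
import random

def m(a, b):
    """Multiplication of reals, viewed as a map R x R -> R."""
    return a * b

# Spot-check bilinearity on random inputs (up to floating-point error).
for _ in range(100):
    a, b1, b2, lam = (random.uniform(-10, 10) for _ in range(4))
    # Linear in the second argument:
    assert abs(m(a, b1 + b2) - (m(a, b1) + m(a, b2))) < 1e-9
    assert abs(m(a, lam * b1) - lam * m(a, b1)) < 1e-9
    # Linear in the first argument too:
    assert abs(m(b1 + b2, a) - (m(b1, a) + m(b2, a))) < 1e-9
```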
From the definition of multilinearity, we get some identities that multilinear functions have to satisfy. Suppose $f$ is multilinear. Then
\[\begin{aligned} f(a, \lambda b) &= \lambda f(a, b) = f(\lambda a, b)\\ f(a, b_1 + b_2) &= f(a, b_1) + f(a, b_2)\\ f(a_1 + a_2, b) &= f(a_1, b) + f(a_2, b) \end{aligned}\] Do these equations look familiar? They're exactly the rules that we made the tensor product follow.
Tensor Products: An Algebraic Perspective
To an algebraist, tensor products are fundamentally about linear maps. The tensor product of two vector spaces $V$ and $W$ is defined by a universal property. Suppose $h$ is a function that takes in an element of $V$ and an element of $W$ and returns an element of a third vector space $Z$. We can write this as a function $h:V \times W \to Z$. Furthermore, let $h$ be bilinear. Then $h$ extends to a unique linear map $\bar h : V \otimes W \to Z$ satisfying $\bar h(v \otimes w) = h(v, w)$. If we want to be fancy, we can draw this requirement as the following commutative diagram.
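Here's a toy instance of the universal property, again modeling $v \otimes w$ as the table of products $v_i w_j$. I've chosen the bilinear map $h$ to be the dot product on $\mathbb{R}^3$; its linear extension $\bar h$ then works out to be the trace of the table (this particular choice of $h$ is just an illustration):

```python
def outer(v, w):
    """Model v ⊗ w as the table of products v_i * w_j."""
    return [[a * b for b in w] for a in v]

def h(v, w):
    """A bilinear map h : R^3 x R^3 -> R (here, the dot product)."""
    return sum(a * b for a, b in zip(v, w))

def h_bar(m):
    """A *linear* map on tables (square ones here) extending h:
    the sum of the diagonal entries, i.e. the trace."""
    return sum(m[i][i] for i in range(len(m)))

v, w = [1, 2, 3], [4, 5, 6]
assert h(v, w) == h_bar(outer(v, w)) == 32
```

The point is that $\bar h$ only ever sees the table $v \otimes w$, not the pair $(v, w)$, yet it reproduces the bilinear map $h$ exactly.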
This is what tensor products are really about. The confusing definition given above is made specifically so that tensor products play nicely with bilinear maps. Whenever you see a tensor product, you should look around for bilinear maps.
Back to Quantum Mechanics
In quantum mechanics, linear functions play a central role. You look at quantum states as vectors in some large vector space, and special linear functions on this vector space correspond to quantities you can observe. These are called "observables".
Now, what if we have two particles? Each one independently is a vector in some vector space, but how do we describe the pair? Suppose we have some observable that we can measure on the pair. When restricted to either particle, it should act like an ordinary observable, i.e. linearly. So really, our observables for the pair should be bilinear functions on the direct product of the two vector spaces. By the universal property, this means that they are linear operators on the tensor product space!
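In finite dimensions we can see this concretely. Modeling the product state $v \otimes w$ by the Kronecker product, a joint observable $A \otimes B$ acts on a product state by acting on each factor separately (the $2\times 2$ matrices below are toy stand-ins, not any particular physical observable):

```python
def kron_vec(v, w):
    """The product state v ⊗ w as one flat vector (Kronecker product)."""
    return [a * b for a in v for b in w]

def kron_mat(A, B):
    """The joint operator A ⊗ B on the tensor product space."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def matvec(M, x):
    """Apply a matrix to a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Toy single-particle operators and states.
A = [[0, 1], [1, 0]]
B = [[1, 0], [0, -1]]
v, w = [1, 2], [3, 4]

# (A ⊗ B) acts on v ⊗ w by acting on each factor separately:
assert matvec(kron_mat(A, B), kron_vec(v, w)) == kron_vec(matvec(A, v), matvec(B, w))
```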
The commutative diagram is from Wikipedia.