Friday 8 February 2013

Off-topic: Your humble correspondent will be incorporated in the next edition of the Holy Scripture
One aspect of quantum mechanics that makes beginners – including permanent beginners – feel uncomfortable is the fact that its "space of possible states" seems too large to them. The Hilbert space may seem large and if we want to describe composite systems, we need to employ a tensor product of the smaller Hilbert spaces. This product seems both large and mathematically complicated.

For example, a user named Joe asked a would-be historical question that made it sound as if the tensor products in quantum mechanics were very unnatural and had to be stumbled upon by chance after a long history of trial and error.




This ain't the case. The tensor products are completely natural, inevitable, and they were known to be relevant for the description of systems with many degrees of freedom from the beginning – when Werner Heisenberg and pals wrote their first papers about quantum mechanics.

What's going on here?

As I have mentioned many times, the "plus sign" in the sum of several terms that make up a wave function or a density matrix roughly means "OR". When the state vector is\[

\ket\psi = 0.6 \ket{\rm red} + 0.8i \ket{\rm blue},

\] it is in a particular state and both the absolute values of the amplitudes as well as their relative phase(s) matter. If you neglect the phases for a while, the numbers \(0.6\) and \(0.8\) are probability amplitudes. By squaring them, you get numbers \(0.36\) and \(0.64\), respectively. These are probabilities, 36% and 64%, that the system is red or blue, respectively.

The superposition doesn't mean that there is a red system and there is a blue system. There is only one system and its properties are determined in the probabilistic manner.
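
If you want to check the arithmetic in plain Python – the variable names below are mine, chosen just for this illustration – squaring the absolute values of the two amplitudes indeed gives 36% and 64%:

```python
# A minimal sketch: the amplitudes from the example above, probabilities via |amplitude|^2.
amp_red, amp_blue = 0.6, 0.8j

p_red = abs(amp_red) ** 2    # 0.36
p_blue = abs(amp_blue) ** 2  # 0.64

assert abs(p_red + p_blue - 1.0) < 1e-12  # the state vector is normalized
print(p_red, p_blue)
```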



Quantum mechanics, and not just Windows 8, is as mean as a wolf, as sharp as a tooth, as deep as a bite, as dark as the night, as sweet as a song, as right as a wrong, as long as a road, as ugly as a toad. Quantum mechanics is all I wanna be. "Everything At Once" (Lumo-enhanced audio: lenka-once3.wma) by the Czech-dadded Australian singer Lenka Křipač[ová] (who doesn't even know the right diacritical signs in her own surname) got to #2 in the Czech radio hit parades during the last week even though it's fundamentally a 2-year-old song.

The intuitive reason why beginners – including permanent beginners – tend to interpret the "plus sign" as implying "many objects" is that collecting many objects is the simplest operation with objects they may imagine (2 apples and 3 apples are 5 apples) and addition is the simplest mathematical operation they may think of. So, they conclude, these two simplest things have to be the same.

That wouldn't be a watertight argument but it would have some logic. What's primarily wrong about the argument is that the elementary entities that are being "quantified" in quantum mechanics aren't the objects themselves. The elementary entities that are being quantified are propositions about observables and the truth values or probabilities that such propositions are true. And this makes a difference.

In classical physics, we may also discuss whether some propositions are true or false. Is the distance between the Earth and the Sun greater than 100 million kilometers? Yes, it is. But these propositions are propositions about objects and the objects are fundamental. In classical physics, the objects objectively exist and the true or false propositions about them just reflect the underlying, more fundamental reality: the objective reality in a classical world.

It ain't the case according to quantum mechanics. Quantum mechanics discusses the truth value – and its continuous generalization, the probability – of propositions directly. The truth value and probability of propositions isn't derived from any auxiliary player such as "objectively existing objects". There aren't any objectively real objects with objective properties such that the validity or probability of propositions would be a mere reflection of them. According to quantum mechanics, such objects would be just parasitic intermediaries that would steal some "money" when we want to achieve a well-defined goal, namely to decide whether a proposition is true. We don't need such intermediaries. Quantum mechanics gives you the engine that relates the truth values and probabilities of various propositions about Nature directly, without any intermediaries. In fact, it bans any objective intermediaries.

So one must understand that \(\ket\psi\) or \(\rho\) aren't mathematical representations of some classical objects and their properties. They're not classical fields, they're not classical degrees of freedom of any type. They're not even observables according to the definition (observables in quantum mechanics have to be linear operators whose eigenvalues may be measured) and they're not even observable according to the colloquial meaning (we can't measure the values of a wave function or a density matrix by making one measurement in one repetition of the situation).

As Niels Bohr used to say, physics research can't be described as the search for answers to "how things in Nature are" but as the search for the "true propositions we can make about Nature". In classical physics, we ultimately did the latter as well but we could always imagine that the true propositions were corollaries of the former. In quantum mechanics, we can no longer assume such a thing.

Instead, we must literally accept that \(\ket\psi\) and \(\rho\) describe something like the probabilities of various statements. They're "slight" generalizations of the concept of the probability distribution in classical physics, not a "slight" generalization of a classical field of any sort! And this makes a difference – a lot of differences, in fact.

First, it immediately follows that the addition means "OR". Just to be really clear about this point, let me mention that "OR" is an operation on two truth values that is also known as the logical sum, \(p \lor q\). The symbol is typed as \(\backslash {\rm lor}\), "logical or", in LaTeX. You adopt rules such as \(0\lor 0=0\), \(0\lor 1=1\lor 0=1\), and... \(1\lor 1=1\), sorry for that. The last formula means that using \({+}\) instead of \(\lor\) would be misleading. But in the example of "red" and "blue" above, it can't happen that the object is simultaneously red and blue (these two states are orthogonal to one another) so this problem is inconsequential.

Instead, if we want the word "AND" in between two propositions, the logical values simply get multiplied. That's why "AND" is known as the logical product and in this case, it may be written as \(p\cdot q\) without any problems. We have \[

0\cdot 0 = 0,\\
0\cdot 1 = 0,\\
1\cdot 0 = 0,\\
1\cdot 1 = 1.

\] These rules – which are identical to the rules for multiplication of the integers \(0\) and \(1\) – say that "\(p\) AND \(q\)" is true (\(1\)) only if both \(p\) is true (\(1\)) and \(q\) is true (\(1\)). Well, the previous sentence is worse than just a tautology; it is a tautologically written tautology. ;-) At any rate, the word "AND" inserted in between objects such as \(\ket\psi\) and \(\rho\) must be naturally interpreted as multiplication because these objects contain the information about the truth values and "AND" means a logical product.

While the truth values \(p\) may be \(0\) (false) or \(1\) (true), quantum mechanics also needs to use the generalized truth values, the so-called probabilities. Probabilities \(p\) are between \(0\) and \(1\) and the extreme values \(0,1\) still mean "false" and "true" while the values in between mean various degrees of "maybe". The product of probabilities still corresponds to the word "AND". By this statement, I mean that if we have two propositions about independent – uncorrelated – systems and their probabilities are \(p\) and \(q\), then the probability of "statement 1 AND statement 2" is simply equal to \(p\cdot q\). We often write \(p\cdot q\) as \(pq\) because multiplication is so important, omnipresent, and natural. Despite some people's feelings, it may be said to be a more elementary or more natural method to join two objects or propositions than addition. That's why \(pq\) means \(p\cdot q\) and not \(p+q\).
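
As a toy illustration of the "AND is a product" rule – the numbers below are arbitrary choices of mine, not anything taken from the text above – one may check both the truth-value table and the multiplication of probabilities for independent propositions:

```python
# The extreme truth values 0, 1 reproduce the table of the logical product...
for a in (0, 1):
    for b in (0, 1):
        assert a * b == (a and b)

# ...and for generalized truth values (probabilities) of independent propositions,
# "statement 1 AND statement 2" gets the product of the probabilities.
p = 0.36   # e.g. the probability that a flag is red
q = 0.25   # e.g. the probability that an unrelated cat is alive
print(p * q)  # 0.09, the probability of "red AND alive"
```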

Fine. What about the tensor products in quantum mechanics?

We must first understand what the tensor products mean mathematically. Let's start with a different type of a product of vectors, the "inner product" (or the "dot product" or the "scalar product" because the result is a scalar):\[

\vec x\cdot \vec y = x_1 y_1 + x_2 y_2+ x_3 y_3.

\] We had two objects, vectors \(\vec x\) and \(\vec y\), that had three components (each) labeled by the index whose value was \(1,2,3\). When we try to multiply these vectors, we a priori don't know what it means because we only know how to multiply numbers. But each of the vectors has three components (this number gets generalized in many contexts when we talk about more abstract and general vectors, e.g. state vectors in quantum mechanics) so we may get \(3\times 3 =9\) different products of the components \(x_i y_j\) where the indices \(i,j\) are independent of each other.

The inner product above takes the sum of the three products \(x_i y_i\) for \(i=1,2,3\). The cross product is something that has three components by itself – it is a vector – and each of these three components is some combination of products \(x_i y_j\) for various values of \(i,j\).\[

\vec x \times \vec y = (x_2 y_3-x_3 y_2,x_3 y_1-x_1 y_3,x_1 y_2-x_2 y_1)

\] Now, the tensor product is a different, in some sense simpler, version of the product of two vectors because we simply keep all \(3\times 3 = 9\) components:\[

(\vec x\otimes \vec y)_{ij} = x_i y_j.

\] It looks simpler than the previous two products, doesn't it? We don't need to remember which indices must be written down and which summations have to be performed. We just keep all the products of components. The pair of indices \(ij\) of the tensor product object goes from \(11\) to \(33\), over nine possible combinations. We may just relabel them as \(1,2,\dots ,9\) according to some more or less natural rule. This rule is a pure convention, of course. You can decide that \(1,2,3,4,5,6,7,8,9\) means\[

11,12,13,21,22,23,31,32,33

\] but it may also mean\[

11,21,31,12,22,32,13,23,33

\] or any other permutation, for that matter. In fact, you may choose a convention – very important for real-world calculations in physics – in which you don't list these 9 components in any particular order at all. Instead, you write down some 9 linear combinations of these components, combinations that are independent of each other. Sometimes you may prefer a convention in which some of the components get repeated, and so on.

Whatever you do, the list of all the products still remembers some "structure" from the vector indices. The most general tensor – the object with 9 components \(T_{ij}\) – transforms in the same way as the tensor product of two vectors.
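
A short NumPy sketch may make the comparison of the three products concrete; the two 3-component vectors below are arbitrary examples of mine:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

inner = np.dot(x, y)      # one number: x1*y1 + x2*y2 + x3*y3
cross = np.cross(x, y)    # three numbers, the antisymmetric combinations of x_i*y_j
tensor = np.outer(x, y)   # all 3 x 3 = 9 products x_i*y_j, kept without any summation

print(inner)           # 32.0
print(cross)           # [-3.  6. -3.]
print(tensor.shape)    # (3, 3)
print(tensor.ravel())  # the same 9 numbers listed in the 11,12,13,21,...,33 convention
```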

Up to this moment, I may have convinced you that the tensor product is a natural product of two vectors. If the two vectors have more than \(3\) components – for example \(N\) components where \(N\) is the dimension of the appropriate Hilbert space – the rules for the tensor product may still be easily written down (unlike, for example, the rules for cross products which are special operations done on 3-dimensional vectors only).

The real psychological obstacle isn't the question of how two particular vectors are tensor-multiplied, which is a rather natural operation. The real problematic question for beginners – including permanent beginners – is why the tensor product of the whole Hilbert spaces is the right mathematical object to describe composite systems.

But this is really inevitable, too. At the beginning, I mentioned an example and declared that the numbers \(0.6\) and \(0.8i\) were related to probabilities that an object (a flag) was red or blue, respectively. You may consider a composite system composed of a flag and a cat. The cat may be dead or alive. The point is that there had to be state vectors in the Hilbert space that describe a red flag or a blue flag; an alive cat or a dead cat. And because we want to describe a composite system whose two objects must a priori be allowed to be independent, there must be four basis vectors describing all conceivable arrangements of the properties of the flag and the cat:\[

\ket{\rm red,alive},\ket{\rm red,dead},\ket{\rm blue,alive},\ket{\rm blue,dead}.

\] More generally, if the first subsystem has \(d_1\) possible mutually exclusive states and the second subsystem has \(d_2\) possible mutually exclusive states (these are the dimensions of the subsystems' Hilbert spaces, often infinite), the Hilbert space describing the composite system must have \(d_1d_2\) basis vectors because all the arrangements of the properties must be allowed. We just want the systems to behave independently. It must be possible for the systems to exhibit no correlation so no states may be missing.
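
A NumPy sketch of this counting for the flag-and-cat example (with the standard basis as my arbitrary choice for each factor) lists the \(2\times 2 = 4\) composite basis vectors explicitly:

```python
import numpy as np

red,  blue  = np.array([1, 0]), np.array([0, 1])   # flag basis, d1 = 2
alive, dead = np.array([1, 0]), np.array([0, 1])   # cat basis,  d2 = 2

basis = {
    "red,alive":  np.kron(red,  alive),
    "red,dead":   np.kron(red,  dead),
    "blue,alive": np.kron(blue, alive),
    "blue,dead":  np.kron(blue, dead),
}

# d1*d2 = 4 mutually exclusive (orthogonal) basis vectors of the composite system
for name, vec in basis.items():
    print(name, vec)
```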

I just said that "independence" implies that we need the tensor product Hilbert space. We can be more "algebraic" about this point. What we really want is for the observables \(F_m\) describing the flag and the observables describing the cat \(C_n\) to commute with each other:\[

\forall m,n: \quad F_m C_n = C_n F_m

\] This condition means that measurements of the flag don't affect the cat and vice versa. They may be done independently and the order of the measurements doesn't matter. At this point, I have to emphasize that this assertion holds even when the flag and the cat are replaced by two entangled particles. Despite hugely widespread misconceptions and misinterpretations, it's still true that the measurement of one entangled photon doesn't influence the other photon and vice versa! The two photons are just prepared in an entangled i.e. correlated state but the correlations are (i.e. entanglement is) due to the common past of the two photons, not due to any disruptions caused by the measurements. The measurement of one entangled particle doesn't do anything to the other particle. Their observables commute with each other so there's no "uncertainty principle" applying to them. The measurements only pick some results whose character reflects the (quantum probabilistic) state that the two particles have had before the measurements. But because the measured operators commute with one another, one doesn't "disturb" the other and the ordering doesn't matter.

Let me return to the flag and the cat.

Because the flag observables and the cat observables have to commute with each other, we learn something about the (composite system's) Hilbert space on which these observables act. Note that \(F_m\) may be pretty much any matrix whose size is \(d_{\rm flag}\times d_{\rm flag}\). In the same way, \(C_n\) is any matrix whose size is \(d_{\rm cat}\times d_{\rm cat}\). We want these observables to commute with each other. How do we achieve that?

It's obvious that we need a much larger Hilbert space. On the larger Hilbert space, the observables are represented by tensor products with the unit matrix:\[

F_m\to F_m\otimes {\bf 1},\\
C_n\to {\bf 1}\otimes C_n.

\] Note that both types of operators, flag operators and cat operators, were upgraded to operators acting on a larger Hilbert space, the tensor product of the original two spaces. In particular, the operators themselves are the tensor products of the original operators acting on the appropriate "factor" in the larger (tensor product) Hilbert space and the identity operator (unit matrix) acting on all the other factors. Using a mathematical jargon, the tensor product linear space is the smallest representation of the mutually commuting operators for the flag and for the cat (or their algebraic relations, if you wish).
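
One may verify this commutativity numerically; in the sketch below, the flag and cat observables are random matrices of mine that stand in for "pretty much any matrix" of the appropriate sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
d_flag, d_cat = 2, 3
F = rng.standard_normal((d_flag, d_flag))   # any flag observable
C = rng.standard_normal((d_cat, d_cat))     # any cat observable

F_big = np.kron(F, np.eye(d_cat))   # F ⊗ 1 on the (d_flag*d_cat)-dimensional space
C_big = np.kron(np.eye(d_flag), C)  # 1 ⊗ C

# The two upgraded operators commute, whatever F and C were.
assert np.allclose(F_big @ C_big - C_big @ F_big, 0.0)
```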

The identity operators are needed to preserve all the algebraic relationships between the operators. For example,\[

(F_m\otimes {\bf 1})\cdot (F_k\otimes {\bf 1}) = (F_m\cdot F_k)\otimes {\bf 1}.

\] This simple rule says that when it comes to relationships between the flag operators (identities saying that the product of two particular operators is a third one), it doesn't matter whether we add the "tensor multiplied by the unit matrix [acting on the cat space]" portion to each or not. The same identity applies to the cat operators:\[

({\bf 1}\otimes C_n)\cdot ({\bf 1}\otimes C_\ell) = {\bf 1}\otimes (C_n\cdot C_\ell).

\] Needless to say, the same identities would hold if the \(F_m\) or \(C_n\) operators were added rather than (matrix) multiplied. We want to preserve the algebraic relationships between the observables i.e. operators acting on the flag's Hilbert space and/or the cat's Hilbert space. And we want these two sets to commute with each other. The minimal way to achieve so is to promote the Hilbert space to the tensor product – which has a basis vector for each pair of basis vectors of the flag's Hilbert space and the cat's Hilbert space. And the operators themselves are promoted to tensor products of the original ones with the unit matrix (identity operator).
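
The same kind of numerical check, again with arbitrary placeholder matrices of mine, confirms that the algebraic relationships are preserved, e.g. that \((F_m\otimes {\bf 1})\cdot(F_k\otimes {\bf 1})=(F_m\cdot F_k)\otimes {\bf 1}\):

```python
import numpy as np

rng = np.random.default_rng(1)
F_m, F_k = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
one_cat = np.eye(3)   # the identity acting on the cat's space

lhs = np.kron(F_m, one_cat) @ np.kron(F_k, one_cat)
rhs = np.kron(F_m @ F_k, one_cat)
assert np.allclose(lhs, rhs)
```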

Note that this construction implies, among other things, that if you describe two particles by their wave functions as a function of the position, the wave function for two particles has to be a function of 6 coordinates (more generally \(3N\) for \(N\) particles):\[

\ket\psi\leftrightarrow \psi(x_A,y_A,z_A,x_B,y_B,z_B).

\] One function of 3 variables or two functions of 3 variables just isn't the right thing to consider. The operators – observables – must act on a single wave function because it must be possible to multiply (compose) the operators. If they were acting on different spaces, you couldn't multiply them e.g. as in \(x_A z_B\). But a smaller Hilbert space of wave functions than the tensor product just couldn't respect the a priori independence of the subsystems (individual particles). I stress that \(x_B\) is as independent a coordinate in the phase space (and configuration space) from \(x_A\) as \(y_A\) is independent from \(x_A\). There's really no mathematical difference here and because a particle in 3 dimensions needed a function of 3 spatial variables, it's clear that two particles in 3 dimensions need a function of 6 spatial variables. Those things were clear to Werner Heisenberg and others from the beginning although they focused on the operators themselves and not on the wave functions upon which the operators ultimately act.
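
A counting sketch, assuming each coordinate axis is discretized into \(N\) points (the value of \(N\) below is my arbitrary choice), shows why two particles need one function of 6 variables rather than two functions of 3 variables:

```python
import numpy as np

N = 4
psi_one = np.zeros((N, N, N), dtype=complex)   # psi(x, y, z) of a single particle
psi_two = np.zeros((N,) * 6, dtype=complex)    # psi(x_A, y_A, z_A, x_B, y_B, z_B)

print(psi_one.size)                     # 64   = N**3
print(psi_two.size)                     # 4096 = N**6 = (N**3)**2, the tensor product dimension
print(psi_two.size == psi_one.size**2)  # True: not 2*N**3, but (N**3) squared
```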

On his blog, Scott Aaronson has discussed a crackpot paper, one of the millions of 19th century-style anti-quantum delusions attempting to replace the rules of quantum mechanics by classical solitons and pretending that everything could still be fine even though it's totally obvious that nothing whatsoever may continue to work. In the discussion under Scott's blog entry, the authors of the crackpot paper as well as tons of other assorted nuts offer their preposterous "wisdom" that everything could still work. In some cases, the "wisdom" boils down to the folks' misunderstanding of the relationship between the Shannon and von Neumann entropy (and the logarithm of the volume of a region of the phase space) and similar rudimentary ignorance about basic maths and physics.

But the authors didn't want to write just a random nonsensical model that has no chance to work. They also added some (indefensible) claims such as the claim that quantum computers can't work because the quantum mechanical description of 5 qubits is already wrong, and so on, and that it only accidentally works for up to 3 or 4 qubits. This is a totally preposterous statement because quantum mechanics has been tested for huge systems that often have as many as \(10^{30}\) qubits and it always works. Quantum mechanics isn't just a theory of one or two elementary particles. It's demonstrably the theory of Nature as such. It's been tested for small systems and large systems, too. For large systems, we may often use a simpler theory, the classical limit, but quantum mechanics still works!

And quantum mechanics is actually essential for us to understand the structure (and calculate the properties) of metals, superconductors, and all other materials, too. In all similar cases, we are ultimately trying to find the ground state of the Hamiltonian and a certain number of low-lying excitations (energy eigenstates with energies just slightly above the ground state energy). But aside from their having a low energy, these states are just some general elements of the Hilbert space, generic superpositions of its basis vectors. By experimentally demonstrating their relevance, we're really demonstrating that we need all the superpositions.

A lab system of 5 or 8 qubits is such a mundane, low-energy, not-at-all-new example of quantum mechanics that it's absolutely implausible that something about quantum mechanics could break down for such systems. Quantum mechanics has been tested for diverse systems that are very similar to this one – both smaller systems as well as larger systems – and the errors of quantum mechanics were found to be zero within the error margins. It's preposterous to believe that the errors could suddenly be of order 100 percent for just another system.

In that thread, I argued that it has never happened in the history of physics – and it will probably never happen – that a self-consistent and verified (compatible with experiments) theory would be replaced, as a description of a non-extreme, well-defined physical situation, by a newer description whose "space of allowed states" would be qualitatively smaller. At most, the newer theories deform the older ones; add new degrees of freedom that were invisible; add quantization rules or uncertainties that are negligible in the non-extreme real-world situation; or change other things that are demonstrably inconsequential and may only become important in an extreme regime defined by extreme values of some observables.

Various people repetitively show their stupidity when they try to offer various would-be counterexamples. Of course, there are no counterexamples.

The tensor product Hilbert space looks too large to those people. I think that the whole discussion of the basis vectors of the tensor product Hilbert space has to be ultimately accepted by these folks. If the first subsystem has \(d_1\) mutually exclusive states and the other subsystem has \(d_2\) mutually exclusive states, we simply need \(d_1 d_2\) mutually exclusive states of the composite system – all the possible pairings – if we want to preserve the a priori independence of the subsystems.

What's really uncomfortable for these folks is the superposition principle, namely that not only the \(d_1 d_2\) basis vectors are allowed but arbitrary complex superpositions of these basis vectors are allowed, too. The whole linear Hilbert space just looks too large to them.

For the system I just discussed, the Hilbert space has \(d_1 d_2\) basis vectors so we need \(d_1 d_2\) complex amplitudes to describe a wave function. Each of them is a complex number. Their real parts and imaginary parts are between \(-1\) and \(+1\). Imagine that these parts are written down with the precision \(0.002\). So there are roughly \(1,000\times 1,000=1,000,000\) possible values of each complex probability amplitude so the number of vectors in this pixelated Hilbert space is something like\[

1,000,000^{d_1\cdot d_2}.

\] It's an exponentially large power of a million and this number simply looks too large (especially because both \(d_1\) and \(d_2\) are numbers of the type \(10^{26}\times\infty\) for macroscopic objects) to those quantum mechanical beginners – including permanent beginners. Their inherent mathphobia always wins. They are really imagining the Hilbert space as the classical phase space and it shouldn't be this insanely high and multi-dimensional.
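
For tiny values of \(d_1, d_2\) – the macroscopic values quoted above would of course not fit into any computer – the pixelated counting can be done honestly in a few lines:

```python
# Counting the "pixelated" state vectors for small d1, d2 (my toy values).
d1, d2 = 2, 3
amplitudes = d1 * d2                  # one complex amplitude per basis vector
values_per_amplitude = 1_000 * 1_000  # real and imaginary parts between -1 and +1, resolution 0.002
print(values_per_amplitude ** amplitudes)  # 10**36 possible pixelated state vectors
```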

However, the Hilbert space isn't the classical phase space (space of classical states) and quantum mechanics isn't classical physics, stupid. The fact that we must allow arbitrary complex linear superpositions isn't making the theory any more complicated because all the laws of the theory (e.g. the evolution operators) must be linear. So it's always enough to know how an operator (observable and/or the evolution operator) acts on a basis – and we immediately know how it acts on any vector in the Hilbert space.
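
The linearity argument can be made tangible with a two-dimensional toy example (the operator and the amplitudes below are arbitrary choices of mine): knowing how an operator acts on the basis vectors determines how it acts on any superposition.

```python
import numpy as np

U = np.array([[0, 1], [1, 0]])        # some linear operator on a 2-dimensional space
e0, e1 = np.array([1, 0]), np.array([0, 1])
a, b = 0.6, 0.8j                      # amplitudes of an arbitrary superposition

psi = a * e0 + b * e1
assert np.allclose(U @ psi, a * (U @ e0) + b * (U @ e1))
```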

So the intuition that the "Hilbert space is insanely large" coming from the pixelation above is bogus. The freedom for the probability amplitudes to be arbitrary continuous (complex) numbers is just a (complex) version of the rule that probabilities may be continuous numbers. The wave function (or density matrix) are "slight" quantum generalizations of probability distributions in classical physics. And no one has been surprised that the values \(\rho(x,y,z,p_x,p_y,p_z)\) of a probability distribution could have been arbitrary continuous numbers. So why are people so stunned by the totally analogous fact in quantum mechanics?

The answer is that they're trying to imagine the wave function as a counterpart of a classical field, e.g. \(\vec E(x,y,z)\), instead of a probability distribution. They're doing so because they know that \(\ket\psi\) or \(\rho\) is "sort of fundamental" and they just (totally incorrectly) believe that fundamental things should have the same interpretation as the classical fields!

Just to be sure, quantum mechanics isn't equivalent to a classical theory with a phase space probability distribution. In quantum mechanics, interesting operators still refuse to commute with each other – in classical physics, they always commute. The probabilities aren't quite fundamental, they are calculated by squaring the absolute values of more fundamental complex probability amplitudes but it is a technical detail of a sort. The nonzero commutators combined with the complex character of the amplitudes (the latter actually inevitably follows from the former – it's enough to realize that we often have \([A,B]=C\) and if \(A,B\) are Hermitian operators, one may prove that \(C\) is an anti-Hermitian operator, so at least some of the entries of at least one of the three operators have to be complex, at least if \(C\) is diagonal!) imply many intrinsically quantum phenomena such as the interference of the wave functions.
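
A standard Pauli-matrix example illustrates that parenthetical claim: \(\sigma_x\) and \(\sigma_y\) are Hermitian, yet their commutator \(2i\sigma_z\) is anti-Hermitian, so complex entries cannot be avoided.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

C = sx @ sy - sy @ sx               # the commutator [sigma_x, sigma_y]
assert np.allclose(C, 2j * sz)      # it equals 2i*sigma_z...
assert np.allclose(C.conj().T, -C)  # ...which is anti-Hermitian: C† = -C
```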

But the counting of the "number of values stored in the wave function" isn't too much different from the counting in classical physics with a probability distribution on the phase space. The exponentially large ensemble of these values is self-evident because for each mutually exclusive state (basis vector of the Hilbert space), we must allow a (complex) probability amplitude.

The fact that the composite system may be prepared in different states than the simple tensor product of some basis vectors\[

\ket{f_m}\otimes \ket{c_n}

\] is known as entanglement. In general, we need linear superpositions of such tensor products of basis vectors:\[

\ket\psi = \sum_{m,n} a_{mn}\ket{f_m}\otimes \ket{c_n}

\] But this freedom to mix different tensor products of the basis vectors of two subsystems isn't "qualitatively different" from the classical counterpart of this statement, namely that two subsystems may be correlated with each other. If \(p(m,n)\) quantifies the probability that the first classical subsystem is found in the \(m\)-th state and the other is found in the \(n\)-th state, then \(p(m,n)\) that don't factorize\[

\nexists p_1(m),p_2(n):\quad p(m,n) = p_1(m) p_2(n)

\] simply describe two subsystems whose properties have become correlated (due to their common origin or mutual interactions sometime in the past – that's the only way how correlations may arise). Entanglement is nothing else than the most general correlation describable by the quantum state vectors – whose probability amplitudes are allowed to be complex, not just real (one novelty) and that describe observables that don't commute with each other (another novelty, the key one).
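
A two-qubit sketch (using the standard Bell-type state as the textbook example) shows the distinction concretely: the matrix of amplitudes \(a_{mn}\) of a non-entangled product state factorizes (it has rank 1), while that of an entangled state doesn't.

```python
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

product_state = np.kron(up, down)                                        # no correlation at all
entangled_state = (np.kron(up, up) + np.kron(down, down)) / np.sqrt(2)   # Bell-type state

# Reshape the 4 amplitudes back into the matrix a_{mn}; its rank decides factorizability.
a_product = product_state.reshape(2, 2)
a_entangled = entangled_state.reshape(2, 2)

print(np.linalg.matrix_rank(a_product))    # 1: a_{mn} = b_m * c_n, no entanglement
print(np.linalg.matrix_rank(a_entangled))  # 2: a_{mn} cannot be factorized, entanglement
```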

Due to the novelties, quantum mechanics allows much tighter correlations (violating Bell's inequalities etc.) than classical physics and many things contradict our classical intuition (and any attempt to invent a "classical model" of Nature). However, the essence is still the same. The general superpositions of the basis vectors on the tensor product Hilbert space just correspond to the fact that subsystems may become correlated in a general way. In quantum mechanics, non-commuting observables may acquire correlations of a general kind which forces us to admit that the probability amplitudes \(a_{mn}\) are complex numbers. But as I said, it's just a technicality.

Now, if some people – like Mr Ross Anderson and Mr Robert Brady – think that they may severely cripple quantum mechanics by denying that a simple system of 8 qubits is described by a 256-dimensional complex Hilbert space i.e. it needs at least 256 independent complex probability amplitudes to describe the general state and if they believe that their alternative theory's "much more narrow" space of states won't pose any problems for the compatibility of their alternative theory with the observations, then they are displaying a complete lack of knowledge of physics and intuition about physics (and maths).

This just ain't possible. The relevance of the tensor product Hilbert spaces as well as the legality of all complex linear combinations of state vectors (the superposition postulate of quantum mechanics) have been verified in millions of diverse, very different situations with a few particles, many particles, and intermediate numbers of particles and it has always worked, often with the \(10^{-10}\) relative accuracy (or better). The idea that a mundane, low-energy system with 8 qubits – no physically meaningful characteristic quantity is extreme in any way (the speed of heavy components doesn't approach the speed of light, the entropy density per unit area is far from Planckian, the angular momenta and actions are close to Planck's constant but it's OK because we do use quantum mechanics) – will suddenly contradict quantum mechanics by errors of order 100% is as "reasonable" as the belief that a 30-kilometer inflated version of Elvis Presley will be found by a satellite on the other side of the Moon once we look there again. One can't "rigorously" prove that it's impossible – nothing in natural sciences can be quite rigorously proven to be impossible – but it's just preposterous, a failure of scientific thinking and intuition.

People who try to deny quantum mechanics – the tensor product structure of the composite Hilbert space and/or the superposition principle – for objects as simple as those with 8 qubits constructed from low-energy, intermediate size collections of elementary particles are hopeless cranks.

And that's the memo.
