Matrix theory: a novel alternative to second quantization etc.

I still consider Matrix theory (BFSS, 1996) one of the most conceptually original developments in theoretical physics of the last 20 years.

It is a relatively unusual way to describe physics in the 11-dimensional asymptotically flat vacuum of M-theory – and in other sectors of string theory. The physical phenomena are completely equivalent to those in other descriptions, but the way they're encoded in the mathematics looks very different.

There are several natural pedagogical ways to get to Matrix theory and I will try to sketch the following three of them:

1. try to study some obviously beautiful quantum mechanical models in extreme limits (the infinite number of colors) and try to find a simplified description of this limit;
2. start with M-theory whose explicit equations weren't known before BFSS 1996 and transform it via dualities and tricks into something that you may describe;
3. try to invent a completely new framework (different from quantum field theory and second quantization) to describe multiparticle states and interactions between the particles, among other things, that may lead to the same kind of physics.

It's the last approach that was used in the title of this blog entry.

Some hindsight is needed to describe the situation in this way and some of this hindsight is often missing in the pioneering papers; BFSS 1996 is no exception. But we have it now and we may look at the situation from various perspectives and avoid some speculative comments that were discussed at the beginning but that were found to be invalid.

The third approach seems to be the most conceptually valuable one but let me start from the first approach.

Maximally supersymmetric gauge quantum mechanics

This year, the \(\NNN=4\) gauge theory in \(d=4\) celebrated its 35th birthday. It's obviously a beautiful theory, the most supersymmetric non-gravitational theory in \(d=4\) you may find. It may be derived as a dimensional reduction of the 10-dimensional supersymmetric theory; as a low-energy limit of the dynamics of D3-branes; and in many other ways.

It has many cool symmetries including the \(SL(2,\ZZ)\) S-duality group (which is able to exchange electricity and magnetism, the weak coupling and the strong coupling, electrically charged particles with magnetic monopoles), the superconformal symmetry, the dual superconformal symmetry, their infinite-dimensional extended and completed union, the Yangian, and others. For those reasons, various scattering amplitudes, while nonzero, dramatically simplify relatively to the form you would expect in a field theory with a similar Lagrangian but fewer supercharges. Many of those features may be exposed in the twistor-based approaches.

As I have mentioned above, this 4-dimensional theory may be viewed as the dimensional reduction of the 10-dimensional gauge theory whose action looks like this:\[

\eq{
S &= \int \dd^{10} x\, \LL,\\
\LL &={\rm Tr}\left[-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + i\bar\Psi D^\mu \gamma_\mu \Psi
\right]
}

\] It's an ordinary Yang-Mills Lagrangian for a gauge field associated with a gauge group, which we will take to be \(U(N)\), i.e. \(SU(N)\times U(1)\) including the \(U(1)\), and for a Majorana-Weyl (real-and-chiral) fermion field in the adjoint (covariant derivatives have to be used). The unitary groups are the least complicated infinite family of compact Lie groups. This innocent combination is enough to produce an exactly supersymmetric classical (or effective) field theory. It's non-renormalizable in \(d=10\) but its dimensional reductions to \(d\leq 4\) are renormalizable.

By dimensional reductions, we mean that the "excessive" dimensions are compactified on a torus – they are made periodic – and the radii of the torus are sent to zero. For this reason, finite-energy excitations are forced to be constant in these dimensions. Moreover, the rotational symmetry between the compactified and uncompactified dimensions becomes badly broken. It is no longer sensible to consider the components of the gauge field \(A_\mu\) along the compactified dimensions \(\mu\) to be components of a gauge field. It's better to consider them scalars.

Off-topic but fun: A reader has pointed out that in the new version of the Hartle-Hawking-Hertog paper, your humble correspondent and HB are thanked for the innocent sign error whose impact so far looks isolated (but I still believe that there must be some other errors unless the paper shows some really important loophole in the Ehrenfest "theorem" way of thinking about the evolution in quantum gravity).

For example, if we dimensionally reduce from \(d=10\) to \(d=4\), six of the components of the gauge field (whole matrices in the adjoint of the gauge group) will become six scalars: they're the source of the \(SO(6)\sim SU(4)\) "R-symmetry" of the \(d=4\) gauge theory. Once we get some scalars, some gauge couplings for fermions also produce the Yukawa couplings as a result. It's convenient to decompose the fields under the unbroken \(SO(d-1,1)\) i.e. \(SO(3,1)\) Lorentz symmetry acting on the large dimensions which is a subgroup of the original \(SO(9,1)\) symmetry.

This 2007 model was called Toyota Matrix M-theory. No kidding. Those East German engineers who refused the German reunification plan to produce a competing car, The Trouble With Trabant: Not Even Wrong.

The 35th anniversary article discussed the reduction to \(d=4\) which gives the prettiest "descendant" of the 10-dimensional gauge theory, one that has the most amazing properties at the quantum level. But we may dimensionally reduce the theory to lower dimensions, too. For example, we may reduce it to \(d=1\) spacetime dimension. That's the minimum you may naively think of even though the reduction to \(d=0\) may actually be important, too.

What does it mean to have a \(d=1\) quantum field theory? It has one spacetime dimension. We want at least one dimension of time. So a simple counting, \(1-1=0\), implies that time is the only dimension on which our fields depend. What does it mean? Well, it means that it's not a full-fledged "quantum field theory" anymore. It's a quantum mechanical model. Instead of fields such as \(\Phi^6(x,y,z,t)\), we have "fields" that may be renamed as \(X^6(t)\). They're quantum observables. You see that the field content is completely analogous to the textbook models of non-relativistic quantum mechanics.

In similar contexts, "quantum mechanics" is often used as a synonym for a 1-dimensional quantum field theory. (Of course, "quantum mechanics" also means – and mainly means – the general framework of modern physics given by the Copenhagen school's postulates that all quantum theories, including higher-dimensional quantum field theories and string theory, obey.)

You might argue that quantum mechanics has advantages over higher-dimensional quantum field theories. It doesn't need any renormalization and similar stuff. The wave function is just \(\psi(x_1,x_2,\dots)\) and obeys some partial differential equations in a finite number of variables. There's no room for infinities. Everything looks simple. Everything that annoys you about regularization, renormalization, renormalization group etc. is fundamentally absent here. (Well, it's not quite true because some of these things have nontrivial analogues in \(d=1\) but it's still true that you may choose a fundamentally unequivocal, finite description of quantum mechanical models.)

Fine, so what is the quantum mechanical model we are considering here? Its Lagrangian looks like this:\[

\eq{
S &= \int \dd t\,L,\\
L &= \frac{1}{2g}{\rm Tr} \zav{
\dot X^i \dot X^i + 2\theta^T\dot\theta+\frac{1}{2}[X^i,X^j]^2-2\theta^T \gamma_i[\theta,X^i]
}
}

\] You see that the Lagrangian contains some Klein-Gordon kinetic terms for the nine scalars \(X^i\), Dirac-like kinetic terms for the 16 components of the fermions \(\theta\), a quartic commutator-squared-based potential for \(X^i\) that arises from the quartic terms in \(F_{\mu\nu}F^{\mu\nu}\) in the 10-dimensional gauge theory, and a Yukawa term. All the terms are traces over the \(N\times N\) matrices, just like in any other version of the \(U(N)\) gauge theory with fields in the adjoint.
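
To see where the first and third terms come from, here is a sketch of the reduction of the bosonic part, under conventions that the text above doesn't fix and that I am assuming here: a mostly-minus signature, the gauge \(A_0=0\), fields independent of the nine spatial coordinates, the spatial components renamed \(A_i\to X^i\) (Hermitian), the field strength \(F_{ij}=-i[X^i,X^j]\), and the coupling absorbed into the overall \(1/g\):\[

-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = \frac{1}{2}F_{0i}F_{0i} - \frac{1}{4}F_{ij}F_{ij}
\quad\to\quad \frac{1}{2}\dot X^i \dot X^i + \frac{1}{4}[X^i,X^j]^2

\] With the overall factor \(1/g\), this agrees with the \(\dot X^i \dot X^i\) and \(\frac{1}{2}[X^i,X^j]^2\) terms inside the bracket multiplied by \(1/2g\). Because the commutator of two Hermitian matrices is anti-Hermitian, the trace of \([X^i,X^j]^2\) is non-positive, so the corresponding potential energy is non-negative.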

I should be more explicit in explaining how many degrees of freedom the theory actually has. Well, all the fields transform as Hermitian matrices – i.e. adjoint representation – under the \(U(N)\) gauge group. Moreover, the bosonic fields \(X^i\) arising from \(A_\mu\) carry an extra index \(i=1,2,\dots,9\) which corresponds to the 9 spatial dimensions we dimensionally reduced. The remaining component \(A_0\) may be (but doesn't have to be) set to \(A_0=0\) by the gauge redundancy.

So the fields \(X^i\) are nine Hermitian \(N\times N\) matrices. Because we ultimately want to study the quantum theory, all the components of these matrices are operators. Much like in normal non-relativistic quantum mechanics, we find out that there are canonical momenta \(\Pi^i\), also nine Hermitian matrices, and they have the commutators\[

[X^i_{kl},\Pi^j_{mn} ] = i\hbar \delta^{ij} \delta_{nk}\delta_{lm}

\] where the pairing of the indices is determined by the \(U(N)\) symmetry: just appreciate that in the index pairs \(kl,mn\), one index is "lower" and one is "upper" (we suppressed the difference to streamline the notation only) and each Kronecker delta-symbol has to have one upper index and one lower index, too. I've restored \(\hbar\) to emphasize that this is a "modest" extension of the undergraduate quantum mechanics models.

Now, we must also discuss the fermionic degrees of freedom \(\theta\). They arose from the Majorana-Weyl (real chiral) spinor of the 10-dimensional gauge theory so they include 16 real components that now transform as a 16-component real spinor of \(SO(9)\), the manifest rotational symmetry acting on the scalars \(X^i\) as well. When promoted to the adjoint of \(U(N)\), each component becomes a Hermitian matrix again. Much like in the usual quantization of Dirac fields, these degrees of freedom are Grassmann i.e. fermionic variables that are canonical momenta to themselves (because the kinetic term in the Lagrangian only contains one time derivative):\[

\{ \theta^a_{kl},\theta^b_{mn} \} = \hbar \delta^{ab}\delta_{nk}\delta_{lm}

\] where \(a,b=1,2,\dots, 16\) and the remaining commutators (all the anticommutators have already been written because they're only appropriate for pairs of fermionic objects) vanish.

So except for the high number of dimensions indicated by the indices such as \(i,j\) that take 9 values and except for the extra degeneracy of the coordinates given by the gauge indices \(k,l,m,n=1,2,\dots,N\), and except for the extra fermionic variables, this is a pretty normal quantum mechanical model similar to the non-relativistic toy models you know from your undergraduate course of quantum mechanics (or its equivalent if you are self-taught).

We really want to say that the state of the physical system described by this \(U(N)\) quantum mechanical model may be encoded in a wave function \(\psi(x,\theta)\). How many variables does it depend on?

Well, there are \(9N^2\) real components in the matrices \(X^i_{mn}\): note that the Hermiticity reduces the number of real (i.e. Hermitian, when treated as operators on the Hilbert space) components \(2N^2\) exactly to one-half of that. The canonical momenta \(\Pi^i_{mn}\) are just \(-i\hbar\) times the derivatives with respect to the variables \(X^i_{mn}\). And then we have the fermionic coordinates; out of the \(16N^2\) components of \(\theta^a_{mn}\), one-half of them are taken to be the coordinates and the remaining one-half may be taken to be the canonical momenta. So we have \(8N^2\) Grassmann variables.

Because the Taylor expansion in the Grassmann variables terminates, we may rewrite the wave function in terms of components and their number is simply \(2^{8N^2}\) because each of the \(8N^2\) variables is either present or absent in the given monomial in the power series expansion. For example, for \(N=5\), we have \(200\) fermionic coordinates and the number of component functions is therefore \(2^{200}\sim 10^{60}\). Each of these component functions depends on \(9N^2=225\) real bosonic variables.
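
The counting above is easy to reproduce. The following Python snippet is just a bookkeeping sketch; the function name `wavefunction_size` and the dictionary keys are my own labels, not anything standard.

```python
import math

def wavefunction_size(n: int) -> dict:
    """Count the variables of the U(n) matrix quantum mechanics wave function."""
    bosonic = 9 * n * n             # nine Hermitian n x n matrices X^i
    grassmann = (16 * n * n) // 2   # half of the 16 n^2 theta components act as coordinates
    components = 2 ** grassmann     # each Grassmann variable is present or absent in a monomial
    return {
        "bosonic_variables": bosonic,
        "grassmann_variables": grassmann,
        "components": components,
        "log10_components": grassmann * math.log10(2),
    }

for n in (1, 2, 5):
    info = wavefunction_size(n)
    print(n, info["bosonic_variables"], info["grassmann_variables"],
          f"about 10^{info['log10_components']:.1f} components")
```

For \(N=5\) the function returns 225 bosonic variables, 200 Grassmann variables, and \(2^{200}\) components, in agreement with the numbers quoted above.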

To solve the quantum mechanical model for \(N=5\) by brute force, you simply solve some coupled linear partial differential equations for \(10^{60}\) functions that depend on 225 variables. It's a trivial task you've been doing in your kindergarten, at least in principle, and you may totally avoid any complications with regularization and renormalization.

I am partly kidding because the number of components and coordinates looks horrifyingly high and for realistic simulations of the large \(N\) limit, you really don't want \(N=5\) but something like \(N=100\). But it's just a problem for the naive brute-force approaches. If you understand the model well and you know some maths, you may find methods to calculate properties of the model which are doable on one line or two. Or several pages. It shouldn't shock you that the naive brute-force simulation is neither the only way nor the recommended way to attack a seemingly mathematically difficult problem although this claim might be controversial among many laymen including physics fans.

If it makes you happier, the number of independent variables on which the wave function for a physical state depends is effectively \(8N^2\) and not \(9N^2\) because all physical states have to be invariant under the \(N^2\) generators of \(U(N)\); well, only \(N^2-1\) of these conditions are nontrivial. After all, it's a gauge theory and therefore this gauge-invariance condition for the physical states comes from varying the action with respect to \(A_0\): the corresponding charge \(Q=J_0\) – which is an \(N\times N\) matrix – has to vanish.

Fine, we know a certain supersymmetric quantum mechanical model – a quantum field theory in 0+1 dimensions – that may be in principle studied. Remarkably enough, the union of the Hilbert spaces from all these \(U(N)\) models for all non-negative integer values of \(N\) gives you the Hilbert space of M-theory in 11 dimensions! It contains everything you expect in a consistent theory of quantum gravity, including gravitons and their superpartners, their interactions, gravitational force obeying the equivalence principle, the 10+1-dimensional Lorentz symmetry, evaporating black holes that preserve the information, M2-branes and M5-branes (and strings and D-branes in compactified, stringy versions of the matrix model, including the right stringy interactions), and other things.

The quantum mechanical model was known before it was conjectured by BFSS that the model was relevant for M-theory in 11 dimensions. It was known as the description of D0-branes, point-like particles moving in the 10-dimensional type IIA string theory, at low energies. Just like D3-branes in type IIB string theory give rise to the \(d=4\) 35-year-old gauge theory at long distances, a similar statement holds for the D0-branes.

However, you should only say that you qualitatively understand what the model describes for a finite and small enough value of \(N\). When some parameters such as \(N\), the size of the matrices, are sent to infinity, the most important (and lowest-energy) objects and processes in the theory may describe something that admits (if not requires) a qualitatively different language. In this case, the particles originally viewed as D0-branes in a 10-dimensional string theory (where they're just a part of physics and interact with other objects) actually become the complete description of an 11-dimensional theory of quantum gravity, namely M-theory.

This conclusion may look shocking: how could an ordinary undergraduate quantum mechanical model (which isn't even a quantum field theory, so we don't see how it could contain multiparticle states with indistinguishable particles and/or Lorentz invariance) contain all the marvels of quantum gravity including Lorentz invariance, graviton scattering, the equivalence principle, and black holes? But remarkably enough, it does.

I will postpone the explanation why the matrix model includes the right building blocks of M-theory and why there's no detectable contradiction to the final section of this blog entry – about the conceptual leap. But before we get there, let us look at a proof that the matrix model is correct.

Seiberg's derivation of Matrix theory in DLCQ

The October 1997 proof was originally presented by Nathan Seiberg and, in a slightly less complete version (but more extended in some other directions we're not interested in here), by Ashoke Sen. So why is the matrix model correct?

Consider M-theory in the asymptotically flat 11-dimensional Minkowski spacetime parameterized by \(X^0\dots X^{10}\). We want to find an explicit Lagrangian that describes all the objects in this 11-dimensional theory of quantum gravity – which may be obtained as the strong coupling limit of type IIA string theory or heterotic-E string theory.

It's useful to single out two light-like directions or coordinates,\[

X^\pm = \frac{X^{0}\pm X^{10}}{\sqrt{2}},

\] and use \(X^\pm\) instead of \(X^0\) and \(X^{10}\). The light-like coordinates are very useful in relativity. For example, the expression \(t^2-z^2\) appearing in the invariants may be rewritten as \((t+z)(t-z)\): the difference of two terms may be written as one term that factorizes. Correspondingly, it's useful to consider the light-like components of the momentum etc. such as \(P^+\).

The states in M-theory have continuous, non-negative values of \(P^+\). However, it's very convenient to put the theory in a certain "box" so that the momentum becomes discrete. Of course, we don't want the box to matter so its size has to be sent to infinity. More explicitly, we want to make the following identification of the light-like coordinate \(X^-\):\[

X^- \approx X^- + 2\pi R.

\] As long as \(R\) is huge – imagine that it is hundreds of billions of light years (although the citizens of the 11-dimensional empire don't necessarily use these units for the distances in their completely different world) – the identification doesn't affect the local physics. If a copy of your local experiment is happening hundreds of billions of light years away from you, it shouldn't matter in your lab. We implicitly use some "modest version" of locality in M-theory here.

However, the change of the \(X^-\) coordinate isn't a pure translation in space; it contains a translation in time by the same amount. In fact, you should be worried about the consistency of this light-like compactification. If we made this periodic identification for \(X^0\), a time-like coordinate, we would create closed time-like curves which would make the theory inconsistent. On the other hand, a spatial compactification of \(X^{10}\) would be harmless. What about the marginal case, the light-like compactification?

Well, we may take it as the limit of spatial compactifications that are OK, which should be enough for you to believe that it's OK, too. So we're taking an M-theoretical spacetime and identifying points that differ by something like the vector\[

(M,0,0,0,0,0,0,0,0,0,M+\epsilon)

\] in the usual coordinates \(X^{0\dots 10}\). It's a nearly null interval but it's a little bit spacelike because of the extra \(\epsilon\) in the last coordinate. We may scale the huge \(M\sim R\) and \(\epsilon\) in such a way that the proper length of the vector above is tiny, much shorter than the 11-dimensional Planck length.

However, if that's so, we may use the Lorentz symmetry of the 11-dimensional M-theory – an assumption supported by lots of evidence, including the Lorentz symmetry of the 11-dimensional supergravity, the low-energy limit of M-theory – and boost the vector above to a simple vector of the form\[

(0,0,0,0,0,0,0,0,0,0,\epsilon')

\] whose last spatial coordinate is much shorter than the 11-dimensional Planck length. Note that the momenta of all the particles in the original M-theory get boosted by a correspondingly dramatic boost. But we know the spacetime in which these boosted objects propagate: it's M-theory compactified on a very short spatial circle given by \(\epsilon'\). But it's nothing else than type IIA string theory at a weak coupling!
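
The boost can be checked numerically. In the sketch below, only the \((X^0, X^{10})\) components of the identification vector are tracked, the speed of light is set to one, and the function name `boost_to_spatial` as well as the sample values of \(M\) and \(\epsilon\) are my own choices for illustration (moderate values are used so that floating-point arithmetic doesn't lose \(\epsilon\)).

```python
import math

def boost_to_spatial(m: float, eps: float):
    """Boost the vector (t, z) = (m, m + eps) along z until its time component vanishes."""
    t, z = m, m + eps
    v = t / z                                        # boost velocity; below 1 because eps > 0
    gamma = 1.0 / math.sqrt((1.0 - v) * (1.0 + v))   # Lorentz factor
    t_new = gamma * (t - v * z)                      # vanishes up to rounding errors
    z_new = gamma * (z - v * t)                      # equals the proper length of the vector
    return t_new, z_new, gamma

t_new, eps_prime, gamma = boost_to_spatial(1000.0, 1e-6)
print("boosted time component:", t_new)
print("epsilon' after the boost:", eps_prime)
print("proper length sqrt(2*M*eps + eps^2):", math.sqrt(1e-6 * (2000.0 + 1e-6)))
print("Lorentz factor of the boost:", gamma)
```

The boosted spatial component \(\epsilon'\) equals the invariant proper length \(\sqrt{2M\epsilon+\epsilon^2}\), so it may be kept tiny even though \(M\) is huge, at the price of a Lorentz factor of order \(M/\epsilon'\).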

Moreover, the null component of momentum \(P^+\) which is the complementary variable to \(X^-\) is quantized because \(X^-\) is compact, \[

P^+ = \frac{N}{R},\quad N\in\ZZ

\] We wanted to describe objects with a fixed and finite momentum in the 11-dimensional Planck units, i.e. \(P^+\) is finite, and because \(R\) needs to be sent to infinity, the integer \(N\) has to be sent to infinity for these interesting states, too.

However, as we boosted the light-like interval defining the light-like compactification to a spatial one, we have a new interpretation for \(N\): it's the number of units of momentum in the direction of the short compactified coordinate \(X^{10}\sim X^{10}+\epsilon'\) we mentioned at some point above. But the momentum in the extra 11th dimension is nothing else than the number of D0-branes. So the original state of M-theory is equivalent, via this boost and in the limit, to a state involving \(N\) D0-branes in type IIA string theory.

When you analyze the right role of the \(R\to\infty\), \(\epsilon'\to 0\) limit, the relevant estimated magnitude of the energy of the D0-branes (as a function of the finite momenta and energies in the original M-theory spacetime), the precision we need for this energy in the limit, and the states and objects that may be neglected in this limit because they're infinitely heavier, you will find out that what you need to describe the original states with \(P^+=N/R\) in the M-theory spacetime is nothing else than the low-energy limit of the dynamics of \(N\) D0-branes in type IIA string theory at a vanishingly low coupling (because of the small \(\epsilon'\)).

It's nothing else than the supersymmetric quantum mechanics model we have already written above; for some time, the model was also referred to as the DKPS model because of a pre-matrix-theory paper that studied it.

So the Hilbert space of the modestly light-like-compactified M-theory is the direct sum of the Hilbert spaces of the \(U(N)\) matrix quantum mechanical models for \(N\) D0-branes. The Hamiltonian of the matrix model gets directly translated as the light-like Hamiltonian (generator of evolution in a light-like direction) in M-theory. The relevant value of \(N\) for the generic states in M-theory with finite energies is the \(N\to\infty\) limit.

As we have seen, there exists a proof that the BFSS matrix model is an equivalent description of M-theory. The proof only relies on the Lorentz invariance of M-theory; its well-known relationship with type IIA string theory; some elementary derivable properties of D0-branes in type IIA string theory; and the irrelevance of the near-light-like compactification if the radius is sufficiently long.

But we may still be shocked by the claim that this seemingly naive non-relativistic quantum mechanical model with 9 bosonic matrices \(X^i\) describes an 11-dimensional, and not just 10-dimensional, theory that is moreover Lorentz-invariant in the large \(N\) limit and includes gravity, gravitons, gravitinos, multiparticle states with indistinguishable particles, membranes, fivebranes, black holes, and their interactions including all the right loop corrections that could be derived from effective quantum field theories, among many other things.

Can we see that those things are present in the theory? This brings me to the final "megapoint" I want to make, namely that Matrix theory is a really cool and original way to mathematically describe many processes that used to be described – and are still being described, in most cases – by totally different mathematics.

All roads lead to string theory: describing quantum gravity in totally new, revolutionary yet consistent languages

In order to understand what kind of physical phenomena Matrix theory allows and implies, it's useful to start with small values of \(N\). For such small values, the M-theoretical physics does depend on the details of the light-like compactification; we're away from the large \(N\) limit. However, many things acquire their expected properties already for small \(N\).

Start with \(N=0\). The degrees of freedom are \(0\times 0\) matrices. In other words, there are no degrees of freedom. There is a unique wave function (up to normalization) that depends on them, namely \(\psi() = 1\). This state vector may be identified with the vacuum state of M-theory as it carries no energy or momentum.

That was easy: too easy. Let's continue with \(N=1\). That's the last matrix model that will be easy and fully solvable, of course. This model should describe all states in M-theory that have \(P^+=1/R\), the minimal positive value of the light-like momentum. We are dealing with a \(U(1)\) gauge theory in 0+1 dimensions. The bosons \(X^i\) and the fermions \(\theta^a\) are \(1\times 1\) matrices so they're not really matrices at all.

Because \(U(1)\) is an Abelian group (you know it from electromagnetism), all the commutator terms vanish. So the Hamiltonian is just quadratic, namely\[

P^0_{\rm BFSS} = \frac{(\Pi^i)^2}{2}.

\] That's it. It's just like the non-relativistic kinetic energy – for a particle in 9 spatial dimensions. No potential because the potential terms vanish. No terms from fermions, either. You might be puzzled what this non-relativistic formula has to do with the relativistic physics in 10+1 dimensions. The answer is that the right interpretation of the energy in the quantum mechanical model is \(P^-\), a light-like component of the energy-momentum vector that is treated as energy in the light cone quantization.

Massless particles' energy and momentum obey \[

(P^0)^2-(P^i)^2 - (P^{10})^2 = 0, \quad i=1,2,\dots ,9

\] which may be rewritten, using the light-like components as\[

2P^+ P^- - (P^i)^2 = 0, \quad P^- = \frac{(P^i)^2}{2P^+}

\] which has the simple quadratic, non-relativistic form! So the light-like description, which is the ultimate limit of the "infinite momentum frame" that describes ultrarelativistic particles, is mathematically analogous to non-relativistic physics. Note that \(2P^+\) is just a constant in the sector with \(P^+=1/R\) i.e. \(N=1\).
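
If you prefer numbers to algebra, the rewriting of the massless dispersion relation may be verified for a random momentum. This is a minimal sketch using NumPy; the function name `light_cone_components`, the random seed, and the sample value of \(P^{10}\) are arbitrary choices of mine, and the light-like components are defined as \(P^\pm=(P^0\pm P^{10})/\sqrt{2}\) by analogy with \(X^\pm\) above.

```python
import numpy as np

def light_cone_components(p_transverse, p10):
    """Return (P^+, P^-) for a massless particle with the given spatial momentum."""
    p0 = np.sqrt(np.sum(p_transverse**2) + p10**2)   # massless: energy equals |momentum|
    p_plus = (p0 + p10) / np.sqrt(2.0)
    p_minus = (p0 - p10) / np.sqrt(2.0)
    return p_plus, p_minus

rng = np.random.default_rng(0)
p_t = rng.normal(size=9)          # the nine transverse components P^i
p_plus, p_minus = light_cone_components(p_t, p10=3.0)

print("P^- from the definition:  ", p_minus)
print("(P^i)^2 / (2 P^+):        ", np.sum(p_t**2) / (2.0 * p_plus))
```

The two printed numbers agree up to rounding, which is the statement that \(P^-\) depends on the transverse momentum quadratically, with \(P^+\) in the place of the mass.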

You could even say that the light-like momentum component \(P^+\) plays the role of the non-relativistic mass \(m\). Also, if you use the light-like coordinates, you may find a copy of the Galilean group that is embedded, without any deformation, right into the Lorentz group. At low speeds, relativity reduces to non-relativistic physics approximately; but at the speed of light, a copy of non-relativistic physics is embedded in the relativistic one exactly!

We see that the \(N=1\) model describes particles with arbitrary values of the nine "totally transverse" components of the momentum \(P^i={\rm Tr}(\Pi^i)\). The tenth component that plays the role of a spatial momentum, the light-like component \(P^+\), is equal to \(1/R\), which is a fixed constant. That's OK because we're considering a sector of the Hilbert space only; the states with higher values of \(P^+\) will be found in the models with higher values of \(N\).

The remaining component, i.e. the other light-like component \(P^-\), is treated as energy and the BFSS Hamiltonian gives us the right dispersion relations already for \(N=1\). We shouldn't forget about the fermionic degrees of freedom. There are 16 Hermitian objects \(\theta^a\) transforming as the real spinor of \(SO(9)\). They're serving as momenta to themselves so you may combine them into 8 Grassmann variables and their 8 Grassmann derivatives.
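
To make the fermionic algebra for \(N=1\) concrete, one may build 16 explicit Hermitian matrices obeying \(\{\theta^a,\theta^b\}=\delta^{ab}\) (with \(\hbar=1\)). The sketch below uses the Jordan-Wigner construction out of Pauli matrices, which is one convenient choice of mine and not necessarily the basis in which the \(SO(9)\) symmetry is manifest; the names `theta_matrices` and `max_anticommutator_error` are hypothetical helpers.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def theta_matrices(pairs: int = 8):
    """Build 2*pairs Hermitian matrices with {theta^a, theta^b} = delta^{ab}."""
    thetas = []
    for k in range(pairs):
        for pauli in (X, Y):
            # Jordan-Wigner string: Z on the first k slots, then X or Y, then identities
            factors = [Z] * k + [pauli] + [I2] * (pairs - k - 1)
            gamma = reduce(np.kron, factors)       # squares to the identity
            thetas.append(gamma / np.sqrt(2.0))    # rescale so that theta^2 = 1/2
    return thetas

def max_anticommutator_error(thetas):
    """Largest deviation of {theta^a, theta^b} from delta^{ab} times the identity."""
    dim = thetas[0].shape[0]
    worst = 0.0
    for a, ta in enumerate(thetas):
        for b, tb in enumerate(thetas):
            target = np.eye(dim) if a == b else np.zeros((dim, dim))
            worst = max(worst, np.abs(ta @ tb + tb @ ta - target).max())
    return worst

thetas = theta_matrices(8)
print("number of thetas:", len(thetas))
print("dimension of the representation:", thetas[0].shape[0])
print("max deviation from the algebra:", max_anticommutator_error(thetas))
```

The 16 matrices act on a space of dimension \(2^8\), which is the same counting as the expansion in 8 Grassmann variables.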

By Taylor expanding the wave function in the 8 Grassmann variables, you get \(2^8=256\) components. That's exactly the number of polarizations in the graviton supermultiplet of 11-dimensional supergravity. Recall that these get decomposed to 128 fermionic components (of the gravitino) and 128=44+84 bosonic components including the physical polarizations of the graviton as well as those of the C-field three-form potential.
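
The decomposition of the 256 states may be double-checked by counting the physical polarizations as representations of the transverse rotation group \(SO(9)\). The snippet below is only arithmetic; the function name `supergraviton_polarizations` is my own label.

```python
from math import comb

def supergraviton_polarizations(transverse: int = 9, spinor: int = 16):
    """Count physical polarizations of the 11-dimensional supergravity multiplet."""
    graviton = transverse * (transverse + 1) // 2 - 1   # symmetric traceless tensor
    three_form = comb(transverse, 3)                    # antisymmetric 3-index tensor
    gravitino = transverse * spinor - spinor            # vector-spinor minus its gamma-trace
    return graviton, three_form, gravitino

graviton, three_form, gravitino = supergraviton_polarizations()
print("graviton:", graviton, "C-field:", three_form, "gravitino:", gravitino)
print("bosons:", graviton + three_form, "total:", graviton + three_form + gravitino)
```

The bosonic and fermionic counts come out equal, and their sum matches \(2^8\).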

One may actually define the 32 real supercharges we expect in M-theory. 16 of them are nothing else than the 16 supercharges of the maximally supersymmetric gauge theory; the remaining 16 of them are trivial or kinematical and these generators are given simply by \({\rm Tr}(\theta^a)\). This decomposition of supercharges to the "complicated dynamical" ones (one-half) and the "trivial kinematic" ones (the other half) is a consequence of the light cone treatment. To get from the maximally supersymmetric gauge theory to the maximally supersymmetric supergravity, we needed to double the number of supercharges (from 16 to 32) and the existence of the kinematic or trivial supercharges did the job for us.

We have seen that already the non-interacting \(N=1\) model knows about the number (and representations under rotations) of the components of the supergraviton multiplet and the right dispersion relation (relationship between energy and momentum). That's quite a success for such a simple model. Note that we only have one particle; our derivations tell us that there can't exist any multiparticle or otherwise complicated states that would have the minimal value of the longitudinal momentum.

Screws and Nuts by Mandrage, a Pilsner band, has been among the Czech radios' top 3 songs for more than half a year. The observation that men and women are like screws and nuts obviously has a sexual connotation but most physicists don't appreciate that the Czech word for nuts in this sense, "matice", is the same word as one for "matrices", and moreover, matrix strings are screwing around the matrices as the monodromy introduces a permutation. This breathtaking ignorance of rudimentary Czech among most physicists and their lacking sense of humor has led to the widespread but flawed terminology "matrix strings" for what should actually be called "screwing strings". ;-)

But the real fun only starts with \(N=2\) when we obtain the first real matrices. Let us talk about all the values \(N\gt 1\) simultaneously. The first observation we want to make is that the \(U(N)\) matrix model does contain states composed of several objects.

In quantum field theory, one may create multiparticle states as\[

\ket\psi = a^\dagger_\text{here} a^\dagger_\text{there}\ket 0.

\] One simply acts with several creation operators at different places to create several objects. As long as the places are sufficiently distant, the Hilbert spaces associated with the places are independent and the whole Hilbert space isn't far from a tensor product of the Hilbert spaces associated with the individual regions.

However, the matrix model isn't a quantum field theory. It doesn't have any creation and annihilation operators for particles. How can it possibly contain multiparticle states? The right indistinguishability conditions and statistics? And other things? It looks like some fully well-defined but seemingly non-relativistic undergraduate quantum mechanical model.

The matrices are the answer. A funny thing about large (or any) matrices is that they may take special matrix values that are block-diagonal. Imagine that the matrices for all \(X^i\) and similarly \(\theta^a\) have the following form:\[

X^i = \pmatrix{ \text{WOLF}^i & \heartsuit^i \\ \heartsuit^{i,\dagger} & \text{BUNNY}^i }

\] Here, WOLF is a square matrix that contains the information about positions of particles that make up an object called WOLF. BUNNY is a similar square matrix whose size may be the same or different. And the hearts are some generally rectangular off-block-diagonal blocks.

We see that large enough matrices are capable of including several objects, such as WOLF and BUNNY. The longitudinal momentum \(P^+\) is linked to the linear size of the matrices so \(P^+\) of the WOLF-BUNNY composite states is simply the sum of the values of \(P^+\) that these two animals would carry separately. And indeed, if we set \(\heartsuit=0\) for a while, the Hamiltonian for the large matrices simply reduces to the sum of the Hamiltonians for both animals: just recall how a trace of a block-diagonal matrix behaves. The two animals evolve independently.
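
The additivity for \(\heartsuit=0\) may be tested on random matrices. In the sketch below, the potential is taken to be \(V=-\frac{1}{4}\sum_{i,j}{\rm Tr}\,[X^i,X^j]^2\) with the overall \(1/g\) set to one, which is my normalization assumption; the sizes of WOLF (3) and BUNNY (2) and all the helper names are arbitrary illustrative choices.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def potential(xs):
    """V = -(1/4) sum_{i,j} Tr [X^i, X^j]^2, with the overall 1/g set to 1."""
    total = 0.0
    for xi in xs:
        for xj in xs:
            c = xi @ xj - xj @ xi
            total += np.trace(c @ c).real
    return -0.25 * total

def block_diag(a, b):
    """Place a and b on the diagonal; the off-diagonal (heart) blocks are zero."""
    n, m = a.shape[0], b.shape[0]
    out = np.zeros((n + m, n + m), dtype=complex)
    out[:n, :n] = a
    out[n:, n:] = b
    return out

rng = np.random.default_rng(1)
wolf = [random_hermitian(3, rng) for _ in range(9)]
bunny = [random_hermitian(2, rng) for _ in range(9)]
both = [block_diag(w, b) for w, b in zip(wolf, bunny)]

print("V(WOLF) + V(BUNNY):", potential(wolf) + potential(bunny))
print("V(block-diagonal): ", potential(both))
```

The two numbers coincide because commutators of block-diagonal matrices are block-diagonal and the trace adds up the blocks; the same argument applies to the kinetic terms.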

What about the heart? Because of the commutator terms in the Lagrangian, the off-diagonal elements of \(X^i\) in the off-block-diagonal rectangles \(\heartsuit\) (and their Hermitian conjugate ones) actually acquire a mass. By a mass, we mean the term in the Hamiltonian that you would associate with a massive field such as the W-boson field, \(m^2 \cdot \heartsuit^\dagger \heartsuit/2\). The value of the mass is actually proportional to the distance between WOLF and BUNNY, assuming that their internal sizes are much smaller than their relative distance.
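For the simplest case of two \(1\times 1\) animals, the claim can be checked by hand or by a few lines of code. This is a sketch under stated assumptions: the potential is again normalized as \(V=-\frac 14 \sum_{i,j}\mathrm{Tr}\,[X^i,X^j]^2\), the separation \(r\) is placed along the first direction, and the heart is a single real amplitude \(\epsilon\) multiplying \(\sigma_x\). The function names are made up for the illustration.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def bosonic_potential(xs):
    """V = -(1/4) sum_{i,j} Tr [X^i, X^j]^2 (assumed normalization)."""
    total = 0.0
    for xi in xs:
        for xj in xs:
            c = xi @ xj - xj @ xi
            total += np.trace(c @ c).real
    return -0.25 * total

def potential_with_heart(r, eps, direction):
    """Two 1x1 animals separated by r along direction 0, with an
    off-diagonal fluctuation eps * sigma_x switched on in X^direction."""
    xs = [np.zeros((2, 2), dtype=complex) for _ in range(9)]
    xs[0] = 0.5 * r * SIGMA_Z
    xs[direction] = xs[direction] + eps * SIGMA_X
    return bosonic_potential(xs)

# Transverse heart: V = r^2 * eps^2, so doubling r quadruples the "spring constant".
v_near = potential_with_heart(3.0, 0.1, direction=1)
v_far = potential_with_heart(6.0, 0.1, direction=1)
# Heart along the separation itself: no potential at all.
v_longitudinal = potential_with_heart(3.0, 0.1, direction=0)
print(v_near, v_far, v_longitudinal)
```

If the kinetic term is normalized as \(\mathrm{Tr}\,(\dot X^i \dot X^i)/2\), which is also an assumption here, the amplitude \(\epsilon\) has the Lagrangian \(\dot\epsilon^2 - r^2\epsilon^2\), so its frequency is simply \(\omega=r\) in these units.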

In other words, these off-block-diagonal matrix elements become harmonic oscillators with huge frequencies. The farther apart WOLF and BUNNY are, the higher the frequencies we face, the farther the equally spaced energy levels of the harmonic oscillators are from each other, and the more we may neglect the possibility that the harmonic oscillator could actually be found in an excited state. We may approximate the \(\heartsuit\) harmonic oscillators by assuming that they're in the ground state most of the time (or, classically, at the \(\heartsuit=0\) point). Only when WOLF and BUNNY get close enough to each other does a significant chance emerge that the harmonic oscillator gets excited. The virtual effects of this harmonic oscillator – its propagators – transmit influences between WOLF and BUNNY. Of course, the interaction between them is love, a reason I picked \(\heartsuit\). ;-)

Once again, we see that the matrices may have a block-diagonal form. The blocks on the diagonal remember the coordinates of particles in the subsystems while the off-block-diagonal blocks have \(X\) mostly equal to zero but their ability to get excited or nonzero is actually the one and only source of the interactions between the subsystems!

I won't be proving that the interactions induced in this way are exactly those you expect from M-theory, e.g. that they reduce to supergravity at very low energies. But it's true. Instead, let us discuss some simpler but still very interesting issues.

We said that there were many harmonic oscillators with frequencies scaling with the distance between WOLF and BUNNY. Don't they give us huge zero-point energies \(\hbar\omega/2\) that would also depend on the WOLF-BUNNY distance which would drive WOLF and BUNNY towards each other with a constant force? The answer is that such terms really do arise but they get exactly canceled!

The compensating source we haven't considered yet is the fermions. The off-block-diagonal elements of those 16 \(\theta^a\) also behave as harmonic oscillators, but fermionic ones. Their zero-point energies are \(-\hbar\omega/2\) each, where \(\omega\) is again proportional to the BUNNY-WOLF distance. And if you're careful about all the factors, you will find out that those 16 Hermitian fermions exactly cancel the bosonic oscillators from 8 matrices \(X^i\) and that's exactly the right number because the ninth one is the longitudinal one, parallel to the BUNNY-WOLF separation, and it doesn't get massive at all. Supersymmetry actually plays a key role in making the individual separated objects independent and keeping their force low at very high separation. It's rather unlikely to find a non-supersymmetric matrix model that would have the same desirable properties.
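For a single pair of \(1\times 1\) blocks, the bookkeeping can be sketched as follows. The counting convention is an assumption of this sketch: each complex off-diagonal entry of the 8 transverse \(X^i\) is counted as two real bosonic oscillators, each complex off-diagonal entry of the 16 \(\theta^a\) is counted as one fermionic oscillator, and all of them are taken to have the same frequency \(\omega\). Then\[

E_0 = \underbrace{8\times 2}_{\text{bosonic}}\times \frac{\hbar\omega}{2} - \underbrace{16}_{\text{fermionic}}\times \frac{\hbar\omega}{2} = 8\hbar\omega - 8\hbar\omega = 0.

\] For larger blocks, the same counting would be repeated for every entry of the rectangular \(\heartsuit\) blocks.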

That's great. We have multi-particle states. But do we also have gravitons and gravitinos with \(P^+\gt 1/R\)? The answer is Yes. They have \(P^+=N/R\) and are represented by the \(N\times N\) blocks. In other words, we may find such a graviton as the only object in a state of the \(U(N)\) matrix model. Now, \(U(N)\) and its adjoint representation roughly decompose into \(SU(N)\times U(1)\). The \(U(1)\) part behaves much like the \(N=1\) model, except that the value of \(P^-\), our Hamiltonian, gets correctly rescaled: recall that \(P^-\) is inversely proportional to \(P^+\) and that's what a calculation yields (after the \(P^i\) momentum is fairly divided among the \(N\) entries to keep the trace constant).
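The rescaling may be sketched in one line. Assume, purely for illustration, that the kinetic part of the Hamiltonian is normalized as \(H = R\,\mathrm{Tr}\,(\Pi^i\Pi^i)/2\) and that the total transverse momentum \(p^i\) sits in the part of \(\Pi^i\) proportional to the unit matrix. Then\[

\Pi^i = \frac{p^i}{N}\,\mathbf{1}_{N\times N} \quad\Rightarrow\quad H = \frac{R}{2}\,\mathrm{Tr}\,(\Pi^i\Pi^i) = \frac{R}{2}\cdot N\cdot \frac{p^i p^i}{N^2} = \frac{p^i p^i}{2\,(N/R)} = \frac{p^i p^i}{2P^+}.

\] This is the light-cone dispersion relation \(P^- = p^i p^i/2P^+\) of a massless particle with \(P^+=N/R\).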

The relative, interacting, \(SU(N)\) degrees of freedom describe the relative coordinates of those \(N\) D0-branes. And one may prove via index theorems that this model has exactly one state of a vanishing energy: there must exist a mathematically fascinating ground state wave function with a vanishing energy that solves a rather complicated differential equation in \(9(N^2-1)\) bosonic variables even though no one can write down too explicitly what this wave function looks like (the most understandable argument in favor of the state's existence that I know is based on the interpolation between the BFSS model and screwing string theory which is "more solvable" in the weakly coupled limit). When this state is tensor-multiplied with the degrees of freedom of the \(U(1)\) model, we get the right single-graviton or single-gravitino states with all the required polarizations and all the required values of the momenta.

As you have noticed, the matrix model differs from non-relativistic multiparticle quantum mechanics by having the off-diagonal entries of all the matrices \(X^i_{mn}\). Otherwise the diagonal entries \(X^i_{nn}\) (no summing) behave in a very similar way to \(X^i_n\) in non-relativistic quantum mechanical models where \(i\) labels a direction in space and \(n\) labels a particle. But the BFSS matrix model also yields the off-diagonal entries, whole matrices! There is a sense in which this extension of position operators to matrices is analogous to the conceptual transformation that was imposed by the quantum revolution. But we're doing it at another level: each matrix entry \(X^i_{mn}\) in the BFSS model is an operator acting on the Hilbert space; we're just saying that there are many such operators so that one may organize them into new matrices whose matrix indices remotely resemble the indices labeling one of many particles in multiparticle quantum mechanics. But there are two such indices for each matrix of position operators!

Another thing you could worry about is the following issue: are the gravitons indistinguishable particles? And do they have the right statistics that depends on the spin? Once again, the answer is Yes. But the reasons are technically very different from those in quantum field theory. In quantum field theory, the multiparticle states are created by the creation operators constructed from the quantum fields. There's only one field for each particle species (and it commutes or anticommutes with itself at spatial separations) so the resulting states inevitably end up being symmetric or antisymmetric wave functions for commuting and anticommuting field operators, respectively.

How can the same symmetry and antisymmetry of the wave functions be realized in the matrix model? The answer is the \(U(N)\) gauge symmetry. A while ago, I mentioned that the physical states also have to be gauge-invariant: the BFSS matrix model is a gauge theory, after all. We said that this effectively reduced the number of bosonic coordinates from \(9N^2\) to \(8N^2\) – well, it is really \(8N^2+1\).

But we're dealing with a pretty large system and many interesting things follow from the \(U(N)\) symmetry, too. For example, consider a system of \(N\) gravitons such that each of them has the minimum longitudinal momentum \(P^+=1/R\). A funny thing about the \(U(N)\) group is that it has an \(S_N\) subgroup, the permutation group. The physical states must be invariant under it, too. If you consider states that are only supported by simultaneously diagonalizable matrices, the condition of invariance under \(S_N\) simply means that the wave functions have to be symmetric! And if the graviton-like particles are actually gravitinos, you will get the antisymmetry because of some extra permutations of fermions that the permutation operator induces. You will get the right indistinguishability of the particles with otherwise identical quantum numbers. And the right spin-statistics relationship for the symmetry or antisymmetry emerges, too!

Needless to say, this (anti)symmetrization also applies to particles with larger values of \(P^+=N/R\), represented by larger blocks: the permutation group exchanging equally large blocks is a subgroup of \(U(N)\), too.
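The embedding of the permutations into \(U(N)\) is easy to make explicit. A minimal sketch, with hypothetical positions and block entries chosen only for the illustration: a permutation matrix is unitary, conjugating a diagonal \(X\) by it reshuffles the diagonal entries, and a permutation that moves indices in pairs swaps two equally large blocks.

```python
import numpy as np

def permutation_matrix(perm):
    """Matrix P with P[k, perm[k]] = 1, so (P @ X @ P.T)[k, l] = X[perm[k], perm[l]]."""
    n = len(perm)
    p = np.zeros((n, n))
    p[np.arange(n), perm] = 1.0
    return p

# Four particles with P^+ = 1/R each: one diagonal entry per particle.
positions = np.array([0.3, -1.2, 2.5, 0.9])  # hypothetical values of one X^i
x = np.diag(positions)
perm = [2, 0, 3, 1]
p = permutation_matrix(perm)

is_unitary = np.allclose(p @ p.T, np.eye(4))  # real P, so P^dagger = P^T
x_permuted = p @ x @ p.T                       # a U(N) gauge transformation

# Two objects with P^+ = 2/R each: swapping two 2x2 blocks.
a_block = np.array([[1.0, 0.2], [0.2, 1.5]])
b_block = np.array([[-3.0, 0.1], [0.1, -2.0]])
x_blocks = np.zeros((4, 4))
x_blocks[:2, :2] = a_block
x_blocks[2:, 2:] = b_block
swap = permutation_matrix([2, 3, 0, 1])
x_swapped = swap @ x_blocks @ swap.T

print(np.diag(x_permuted))  # the same positions, reshuffled
print(x_swapped)            # b_block now sits in the upper left corner
```

Because these transformations are elements of the gauge group, a gauge-invariant wave function cannot distinguish the configurations before and after the permutation.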

It's kind of incredible: the \(S_N\) permutation of particles was a purely discrete operation, a bookkeeping device, in quantum field theory. But in the BFSS matrix model, it is actually enhanced to a much more nontrivial structure, a whole continuous \(U(N)\) group. When particles of the same kind are far from each other, the \(U(N)\) symmetry is broken to \(S_N\) and the (anti)symmetry of the wave functions is the only residual condition. But Matrix theory shows that if the particles are coincident (or at least very close), the permutation group gets enhanced to a whole unitary group!

You may have believed that Nature is hiding some cool symmetries but they are broken to smaller symmetries. But would you be able to invent – by pure thought – that an important example of this phenomenon is a secret \(U(N)\) symmetry that is broken to the permutation group exchanging particles? Nature and mathematics are clearly more creative than we are. We were literally forced to discover the matrix description of indistinguishable particles (much as if we had observed it experimentally); attempts by arrogant but limited self-described "seers" to socially engineer it in a man-made way would have almost certainly failed. Humans are pretty smart but there are just many clever mysteries and mechanisms that are much more likely to be discovered only once our skulls hit them.

This is the kind of revolutionary description that was promoted from the very title of this blog entry. You could look for theories extending general relativity that are as similar to non-relativistic quantum mechanics as possible. If you were creative and lucky, you would ultimately introduce matrices of degrees of freedom, with the right commutators, and a simple enough supersymmetric Hamiltonian would be surprisingly found to describe not only a Lorentz-invariant theory in the large \(N\) limit but actually a theory that has all the wonders of a consistent theory of quantum gravity.

For a while, you would think that you have found a completely new framework to understand quantum gravity, a competitor of string/M-theory. However, at some point, you would realize that it's actually exactly equivalent to string/M-theory or at least one of its Hilbert space's superselection sectors! That's not a coincidence; despite the great diversity and the amazing ways in which various things are related and represented, all these structures are just projections of a single theory as long as they are consistent. As Joe Polchinski once stated, all roads lead to string theory.

A moving human constructed out of screwing string theory (sometimes incorrectly called "matrix string theory").

Because this blog entry has gotten pretty long, I will wrap it up at this point. Sometime in the future, I plan to explain why other objects with the right features – including (spherical, toroidal, and other) membranes, \(E_8\) gauge bosons (and whole gauge supermultiplets) on Hořava-Witten domain walls, and strings with the right perturbative string interactions in the weakly coupled limit (in compactified versions of the BFSS matrix model, something I was fortunate to discover) – emerge out of the matrix model description of string/M-theory.

For now, thank you for your patience if you managed to get all the way to this point...
