Why M(atrix) theory contains membranes

A week ago, I wrote a long article on Matrix theory. I planned a few shorter additions and this is the first one.

M-theory may stand for "mother", "magic", "mystery", "matrix", upside down "W" for "Witten" (this interpretation was discovered by Sheldon Glashow), but it may also stand for "membrane". And Matrix theory actually contains states that look like membranes. It has to contain such states because M-theory contains membranes and Matrix theory should be physically equivalent to M-theory.

The BFSS, discoverers of Matrix theory. I am only able to identify Willy Fischler, a part-time paramedic, in the middle.

In this text, I want to explain why the matrix model contains configurations (and quantum states) that look like two-dimensional sheets of a supersmooth atomless Lorentz-invariant charged paper. We will focus on membranes of spherical and toroidal topology but the conclusion is more general.

The insight that the matrix model describes membranes wasn't quite new in 1996. In fact, the matrix model was encountered a decade earlier in efforts to "discretize" a theory of membranes in order to make it as well-defined as a theory of strings. However, only in 1996 were BFSS able to find out that the matrix model is actually an exact description of the M-theoretical membranes as well as everything else. They also resolved some puzzling issues about degenerate shapes of membranes and their topology change.

All this discussion is linked to noncommutative geometry, a subdiscipline of mathematics whose physically meaningful part has been tightly incorporated (by Nature) into string/M-theory. The particular constructions of noncommutative geometry resembling a sphere and a torus are known as the fuzzy sphere and fuzzy torus, respectively.

The Hamiltonian of Matrix theory

We will begin with a particular form of the Matrix theory Hamiltonian:\[

P^- \equiv H = R\cdot {\rm Tr} \left( (\Pi^i)^2 - ([X^i,X^j])^2 \right), \quad P^+ = \frac{N}{R}

\] I am neglecting coefficients of order one and I am neglecting Yukawa-like terms with the fermions. The fermionic matrices \(\theta^a\) where \(a\) takes 16 possible values give you new degrees of freedom; but these degrees of freedom are also necessary to cancel some terms that would cause inconsistencies such as the "zero-point energies" arising from the off-diagonal modes – as I mentioned in the first blog entry on Matrix theory. So if you do a quantum calculation involving the bosons only and you run into some pathological quantum effect that shouldn't be there, chances are that this effect would be cancelled if you added the fermions.

Note that the light-like components of the spacetime momentum, \(P^+\) and \(P^-\), include the factors of \(1/R\) and \(R\), respectively. The quantity \(R\) may be rescaled by a multiplicative factor which is nothing else than a Lorentz boost: if we increase \(P^+\) by a factor and reduce \(P^-\) by the same factor, it's the same operation as if we Lorentz transform or boost the \(P^0\)-\(P^{10}\) plane.

Ultimately we want to describe objects with finite, continuous values of \(P^+\) and \(P^-\) in an eleven-dimensional spacetime. So we have to take \(R\) to infinity to achieve the continuity while \(N/R\) is kept fixed, which means that \(N\) goes to infinity as well. Because the Hamiltonian \(P^-\) has to be finite as well, the trace in it has to scale as \(1/N\) for \(P^+ P^- = N\cdot {\rm Tr}(\dots)\) to stay finite. So only the states of the large \(N\) matrix models whose energy scales like \(1/N\), i.e. drops appropriately with \(N\) as \(N\to\infty\), are relevant for the decompactified M-theory.

Because we have understood the factor \(R\) or \(1/R\) in various quantities, we may ignore it and effectively set \(R=1\), a full-fledged light-like compactification in which the periodicity of the spacetime coordinate \(X^-\) may a priori be very visible. The mathematical essence of the matrix model involves\[

P^- \equiv H = {\rm Tr} \left( (\Pi^i)^2 - ([X^i,X^j])^2 \right), \quad P^+ = N

\] That's very simple: the kinematic, quantized longitudinal light-like momentum \(P^+\) is simply \(N\), the size of the matrices, while the dynamical light-like momentum \(P^-\) that we treat as the Hamiltonian is the trace of the kinetic term and the quartic potential term without any additional factors (except the factors of order one that we neglected). And we're only interested in the large \(N\) models and low-energy states of \(P^-\) that scale like \(1/N\). The states we care about must kind of exist for each value of \(N\) and their energy has to go down with an increasing \(N\). Quite generally, the dualities mapping the model to M-theory imply that in the large \(N\) limit, the model has to become \(N\)-independent up to the simple scalings.

A detail: note that I wrote a minus sign in front of the quartic, commutator term. That's because the commutator (computed only from matrices, regardless of whether the entries are classical or quantum observables!) of two Hermitian matrices is anti-Hermitian and its square (which we want to sum over \(i,j\)) is negative semidefinite. The minus sign is needed for the potential energy to be non-negative.

Minimizing the Hamiltonian

Look at the Hamiltonian above classically. How do we make it small for large \(N\)? In the quantum theory, we're constrained by the uncertainty principle so if \(X^i\) are too sharply defined, then \(\Pi^i\) is highly uncertain and we will get a high contribution from the kinetic energy \(\Pi^2\). To minimize the total energy, we really have to make both terms small, including the potential energy (think about the virial theorem and the way the hydrogen atom minimizes the energy in a "balanced way").

I just wanted to say something that is intuitively obvious. The commutator term \([X^i,X^j]^2\) has to be small, too. How do we make it small? Note that it is the sum of the traces of all the squared commutators of pairs of those nine \(X^i\) matrices. In fact, the trace may be obtained as follows: compute all the \(9\times 8/(2\times 1) = 36\) commutators of the matrices and sum the squared absolute values of all the entries in all these \(36\) commutator-matrices.
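To make the counting and the sign concrete, here is a minimal sketch in plain Python. The nine matrices are random Hermitian \(4\times 4\) test data (an assumption made purely for illustration, not a physical configuration), and the helper names are hypothetical. It checks that each commutator is anti-Hermitian and that \(-{\rm Tr}\,[X^i,X^j]^2\), summed over the 36 pairs, equals the sum of the squared absolute values of all commutator entries.

```python
import random

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def random_hermitian(n, rng):
    """Random n x n Hermitian matrix (illustrative test data only)."""
    H = [[0j] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            H[i][j], H[j][i] = z, z.conjugate()
    return H

rng = random.Random(0)
n = 4
X = [random_hermitian(n, rng) for _ in range(9)]      # nine transverse matrices
pairs = [(i, j) for i in range(9) for j in range(i + 1, 9)]

from_trace = 0.0              # -sum over pairs of Tr([X^i, X^j]^2)
from_entries = 0.0            # sum of |entry|^2 over all commutator matrices
max_antiherm_violation = 0.0  # how far C is from C = -C^dagger
for i, j in pairs:
    C = commutator(X[i], X[j])
    CC = matmul(C, C)
    from_trace += -sum(CC[a][a] for a in range(n)).real
    from_entries += sum(abs(c) ** 2 for row in C for c in row)
    for a in range(n):
        for b in range(n):
            violation = abs(C[a][b] + C[b][a].conjugate())
            max_antiherm_violation = max(max_antiherm_violation, violation)

print(len(pairs))                    # number of independent commutators
print(from_trace, from_entries)      # the two ways of computing the potential
print(max_antiherm_violation)        # rounding-level if C is anti-Hermitian
```

The two numbers on the second printed line should agree up to rounding, which is just the statement that \(-{\rm Tr}\,C^2={\rm Tr}\,C^\dagger C\) for an anti-Hermitian \(C\).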

Well, a simple way to reduce this term is to make all the commutators vanish,\[

\forall i,j:\quad [X^i,X^j] = 0

\] It means that all the matrices \(X^i\) may be simultaneously diagonalized. But such a state has a simple interpretation. We have seen in the previous article that block-diagonal matrices describe states with several independent, separated objects. Fully diagonal matrices are an extreme example which is composed of gravitons carrying the minimal unit value of \(P^+\); gravitons with other, larger values of \(P^+=N\) may be interpreted as larger blocks in which \(X^i\) are proportional to the unit matrix, i.e. conglomerates of coincident and overlapping minimal gravitons. (In the quantum theory, there exists exactly one zero-energy ground state wave function for the \(SU(N)\) problem describing all the "relative" coordinates between the D0-branes. There also exist lots of nonzero energy states that are similarly localized and they generically describe black hole microstates.)

But that's not new. We have already discussed the decomposition into blocks. We want some nontrivial solution in which the commutators aren't strictly zero. We want a new solution in which the configuration of matrices is irreducible; it can't be decomposed to smaller objects via the block diagonal decomposition. Can we find a solution? Yes. A beautiful class of such configurations is physically identified as membranes, 2-dimensional submanifolds floating in the 9-dimensional transverse space parameterized by the coordinates \(X^i\).

(The locations of the points on the membrane in the remaining, tenth "spatial" coordinate \(X^-\) are obtained by a Fourier transform because we know how the complementary momentum \(P^+\) is uniformly divided among the bits of the membrane. I don't want to get into this technicality here but only the nine "purely transverse" coordinates are truly independent and physical here.)

Fuzzy torus

We really want to find values of matrices \(X^i\) such that they're of order one but they're large matrices whose size is \(N\) and all the commutators of these matrices, while nonzero, naturally scale like \(1/N\). Let me just immediately give you a solution. Consider the matrices\[

U = \pmatrix{1&0&0&\cdots& 0 \\
0&\omega&0&\cdots &0\\
0&0&\omega^2&\cdots &0\\
\vdots & \vdots&\vdots & \ddots& \vdots\\
0&0&0&\cdots&\omega^{N-1}
}, \quad \omega\equiv e^{2\pi i/N}

\] and \[

V = \pmatrix{0&1&0&\cdots& 0 \\
0&0&1&\cdots &0\\
0&0&0&\cdots &0\\
\vdots & \vdots&\vdots & \ddots& \vdots\\
1&0&0&\cdots&0
}.

\] The matrix \(U\) is diagonal and the diagonal entries are the \(N\)-th roots of unity. All of the possible \(N\) roots appear on the diagonal, in a kind of nicely ordered way, much like when a clock hand is showing the current time. As you circle around the diagonal, the complex unit is going from one midnight to another midnight (or is it a noon?). That's why \(U\) is known as the clock operator.

Analogously, \(V\) is the cyclic permutation matrix acting on the \(N\) basis vectors; it's the so-called shift operator. If you think about it, the eigenvalues of \(V\) are the \(N\)-th roots of unity, too. The eigenvectors will remind you of the (discrete) Fourier transform. All these comments mean that \(U\) and \(V\) are actually similar to each other. There exists a unitary matrix \(M\) such that \(U=MVM^{-1}\). This matrix \(M\) is actually the defining matrix of the discrete Fourier transform and you may pick\[

M_{ab} = \frac{\omega^{-ab}}{\sqrt{N}}.

\] These are pure phases depending bilinearly on both indices \(a,b=0,1,\dots, N-1\). (With \(+ab\) in the exponent, you get the inverse matrix, which conjugates \(U\) into \(V\) instead.) I didn't have to add the \(1/\sqrt{N}\) factor but it makes \(M\) unitary, so that \(M^{-1}=M^\dagger\) comes with the same normalization.
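The similarity between the clock and the shift is easy to check numerically. The following is a minimal sketch in plain Python, with \(U\) and \(V\) entered exactly as displayed above and with indices running from \(0\) to \(N-1\) (the value \(N=6\) and the helper names are arbitrary choices for illustration). With these conventions, it is the matrix with \(\omega^{-ab}\) that satisfies \(U=MVM^{-1}\), while the matrix with \(\omega^{+ab}\) does the opposite job, \(V=MUM^{-1}\).

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Hermitian conjugate."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def max_diff(A, B):
    """Largest absolute entry of A - B."""
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

N = 6
omega = cmath.exp(2j * math.pi / N)

# Clock: diag(1, omega, ..., omega^(N-1)). Shift: ones just above the
# diagonal plus a one in the lower-left corner, as in the displayed matrices.
U = [[omega ** a if a == b else 0j for b in range(N)] for a in range(N)]
V = [[1 + 0j if b == (a + 1) % N else 0j for b in range(N)] for a in range(N)]
identity = [[1 + 0j if a == b else 0j for b in range(N)] for a in range(N)]

# Discrete Fourier transform matrix with the minus sign in the exponent.
M = [[omega ** (-a * b) / math.sqrt(N) for b in range(N)] for a in range(N)]
Mdag = dagger(M)

unitarity_error = max_diff(matmul(M, Mdag), identity)        # M M^dagger = 1
similarity_error = max_diff(matmul(matmul(M, V), Mdag), U)   # U = M V M^-1
reverse_error = max_diff(matmul(matmul(Mdag, U), M), V)      # V = M^-1 U M

print(unitarity_error, similarity_error, reverse_error)
```

All three printed numbers should be at the level of floating-point rounding.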

A funny thing is that \(U,V\) don't commute with one another but they're damn close to commuting. Why don't they commute? Well, if you first decorate \(N\) coordinates of a complex vector by the clock phases and then you cyclically permute them, it's different from permuting them first and then decorating them with the clock phases. It's because you need the clock phases shifted by one, i.e. by the factor of \(\omega\).

At any rate, if you understand the previous sentence or if you compute the products explicitly, you will find out that\[

UV = VU \cdot \omega^{-1}

\] for the matrices exactly as written above (swapping the roles of \(U\) and \(V\), or transposing \(V\), would give \(\omega\) instead, and nothing below depends on this choice). The products are the same up to a power of \(\omega\). The previous relationship may be rewritten as\[

UV-VU =(\omega^{-1}-1) VU.

\] But \(\omega=\exp(2\pi i/N)\) and for a large value of \(N\), it's very close to one. Keeping the first subleading term in the expansion of the exponential, we see that\[

UV-VU \sim -\frac{2\pi i}{N} VU

\] Here, \(VU\) on the right hand side is "of the same order" as the normal matrices and doesn't depend on \(N\). However, there's an extra factor of \(1/N\). That's exactly what we need for the low-energy states in Matrix theory. All the previous insights mean that if all the matrices \(X^i\) are simple functions of \(U,V\) and their inverses, the commutators will be small. They will scale as \(1/N\) which, by the way, will also produce \(1/N\) terms after we square the commutators – that will change \(1/N\) to \(1/N^2\) – and trace over the matrices – that will add a factor of \(N\) again, thus boosting \(1/N^2\) to \(1/N\).
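The scaling argument can be tested on the simplest nontrivial example. The sketch below, in plain Python, takes \(U\) and \(V\) exactly as displayed above and checks the exchange relation for them (with these conventions it reads \(VU=\omega\,UV\), i.e. \(UV=\omega^{-1}VU\)). It then evaluates \(-{\rm Tr}\,[X^1,X^2]^2\) for the hypothetical test configuration \(X^1=(U+U^\dagger)/2\), \(X^2=(V+V^\dagger)/2\), which is chosen here only for illustration. For this particular pair, the orthogonality of the monomials \(U^kV^l\) under the trace gives the closed form \(N\sin^2(\pi/N)\) for \(N\geq 3\), which behaves as \(\pi^2/N\) at large \(N\).

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Hermitian conjugate."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def clock_and_shift(N):
    """Return omega and the N x N clock and shift matrices."""
    omega = cmath.exp(2j * math.pi / N)
    U = [[omega ** a if a == b else 0j for b in range(N)] for a in range(N)]
    V = [[1 + 0j if b == (a + 1) % N else 0j for b in range(N)]
         for a in range(N)]
    return omega, U, V

def relation_error(N):
    """Largest entry of VU - omega * UV (vanishes if VU = omega UV)."""
    omega, U, V = clock_and_shift(N)
    UV, VU = matmul(U, V), matmul(V, U)
    return max(abs(VU[a][b] - omega * UV[a][b])
               for a in range(N) for b in range(N))

def torus_potential(N):
    """-Tr([X1, X2]^2) for X1 = (U + U^dagger)/2 and X2 = (V + V^dagger)/2."""
    _, U, V = clock_and_shift(N)
    Ud, Vd = dagger(U), dagger(V)
    X1 = [[(U[a][b] + Ud[a][b]) / 2 for b in range(N)] for a in range(N)]
    X2 = [[(V[a][b] + Vd[a][b]) / 2 for b in range(N)] for a in range(N)]
    AB, BA = matmul(X1, X2), matmul(X2, X1)
    # The commutator is anti-Hermitian, so -Tr(C^2) is the sum of |C_ab|^2.
    return sum(abs(AB[a][b] - BA[a][b]) ** 2
               for a in range(N) for b in range(N))

for N in (4, 8, 16, 32):
    # Columns: N, exchange-relation error, N * potential, closed form.
    print(N, relation_error(N), N * torus_potential(N),
          (N * math.sin(math.pi / N)) ** 2)
```

The third column creeps up toward \(\pi^2\approx 9.87\) as \(N\) grows, so the potential energy itself falls off like \(1/N\), which is the behavior required of the low-energy states.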

That's great. What is the form of the matrices? It is\[

X^i = \sum_{k,l=-T}^{T} \xi^i_{k,l} U^k V^l

\] with lots of coefficients \(\xi^i_{k,l}\). I have truncated the sum to an interval of integers between \(-T\) and \(T\). To decompose the most general matrix \(X^i\), we need \(T\) to be approximately \(N/2\) so that each sum over \(k\) and \(l\) goes exactly over \(N\) possible values. In such a case, the sum will have \(N^2\) independent terms which is exactly what you need to reconstruct a general \(N\times N\) matrix.
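The claim that the \(N^2\) monomials \(U^kV^l\) suffice to rebuild any matrix can be checked directly. Here is a minimal sketch in plain Python for \(N=5\), \(T=2\), so that \(k\) and \(l\) each run over exactly \(N\) values. The matrix \(X\) is random test data and the projection formula \(\xi_{k,l}={\rm Tr}[(U^kV^l)^\dagger X]/N\) is the standard one that follows from the orthogonality of the monomials under the trace; it is an addition for illustration and isn't spelled out in the text above.

```python
import cmath
import math
import random

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

N, T = 5, 2                        # N = 2T + 1
omega = cmath.exp(2j * math.pi / N)
U = [[omega ** a if a == b else 0j for b in range(N)] for a in range(N)]
V = [[1 + 0j if b == (a + 1) % N else 0j for b in range(N)] for a in range(N)]
identity = [[1 + 0j if a == b else 0j for b in range(N)] for a in range(N)]

def powers(A):
    """[A^0, A^1, ..., A^(N-1)]; negative powers follow from A^N = 1."""
    out = [identity]
    for _ in range(N - 1):
        out.append(matmul(out[-1], A))
    return out

Upow, Vpow = powers(U), powers(V)
modes = [(k, l) for k in range(-T, T + 1) for l in range(-T, T + 1)]
# Python's % maps negative exponents into 0..N-1, e.g. U^(-2) = U^(N-2).
basis = {(k, l): matmul(Upow[k % N], Vpow[l % N]) for k, l in modes}

rng = random.Random(1)
X = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(N)]
     for _ in range(N)]

# xi_{k,l} = Tr((U^k V^l)^dagger X) / N, written as an entrywise sum.
xi = {}
for kl, B in basis.items():
    xi[kl] = sum(B[a][b].conjugate() * X[a][b]
                 for a in range(N) for b in range(N)) / N

rebuilt = [[sum(xi[kl] * basis[kl][a][b] for kl in modes) for b in range(N)]
           for a in range(N)]
reconstruction_error = max(abs(rebuilt[a][b] - X[a][b])
                           for a in range(N) for b in range(N))

print(len(xi), reconstruction_error)
```

The script reports 25 coefficients and a reconstruction error at the level of rounding. Dropping the modes with large \(|k|,|l|\) from `modes` is the matrix analogue of keeping only the long-wavelength Fourier modes of a function on the torus.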

However, imagine that \(T\) is taken to be much smaller than \(N/2\). Imagine that \(N\) is one million but we will only take \(T\sim 1,000\), summing a much smaller number of terms.

Now, what is the physical interpretation of the polynomial formula for \(X^i\) above? The interpretation is easily obtained if we make the following replacement or identification:\[

U\equiv \exp(i\sigma_1), \quad V\equiv \exp(i\sigma_2)

\] Note that the eigenvalues of \(U,V\) are almost "any" complex numbers whose absolute value equals one (if \(N\) is large) so it is legitimate to say that they may be written as the complex exponential of a phase. The angles \(\sigma_1,\sigma_2\) are two independent variables with periodicity \(2\pi\). They clearly parameterize a torus. In the normal geometry, they commute with each other. In our geometry, the commutator is nonzero but small.

So the formula for \(X^i\) written in terms of \(U,V\) is nothing else than a Fourier decomposition of a function \(X^i(\sigma_1,\sigma_2)\) on a torus! If the configuration of \(X^i\) matrices is close to our Ansatz, the matrices know exactly everything about the shape of a membrane in a 9-dimensional space! The membrane has a toroidal topology. In some sense, it resembles a periodic phase space with \(N\) phase cells, one for each basis vector of the \(N\)-dimensional space on which \(U,V\) act.

If you rewrite the Hamiltonian for the matrices \(X^i\) as a Hamiltonian for the non-matrix functions on the torus, \(X^i(\sigma_1,\sigma_2)\), you will get nothing else than a Hamiltonian for a membrane (a supersymmetric membrane if you also include the fermions and do the analogous operations for them). This Hamiltonian for the membrane will look like a higher-dimensional generalization of the Hamiltonians for a string. The usual quadratic Hamiltonians for strings may be obtained as a clever rewriting of the Nambu-Goto action (the proper area of the world sheet); and in the same way, the Hamiltonian we get for the membranes is also physically identical to a higher-dimensional generalization of the Nambu-Goto action (the volume of the world volume).

(Instead of the conformal gauge, i.e. the condition that the string's world sheet metric is flat up to a position-dependent Weyl rescaling, the natural gauge condition we automatically get in the membrane case is the condition that the determinant of the induced 2+1-dimensional membrane metric is constant. So no scaling or conformal symmetry is allowed for the membranes. In this way, the gauge-fixed Hamiltonian contains quartic terms, and not just "free quadratic" terms such as the stringy ones, and many things about the membrane are more complicated. They're still pretty important and mathematically natural – after all, the BFSS matrix model may have been obtained from the maximally supersymmetric gauge theories etc.)

Membranes have been considered and for a while, people thought that you could treat them just like strings, obtaining a higher-dimensional generalization of string theory. However, this "membrane theory" didn't quite enjoy the same remarkable properties as string theory. The higher-dimensional world volume theory was less well-behaved at short distances; the topology of the membranes could degenerate more uncontrollably than the strings' topology; the constant quantifying the strength of the interactions between the membranes wasn't really adjustable so it couldn't be made small.

I will return to these questions but before I do, let's ask: is that just the toroidal topology or can we explicitly describe membranes of another shape, e.g. spherical ones? The answer is Yes.

Fuzzy sphere

In the fuzzy torus case, we made the identification\[

U\equiv \exp(i\sigma_1), \quad V\equiv \exp(i\sigma_2)

\] of two unitary matrices \(U,V\) with two coordinates which are complex numbers of absolute value one (equivalently, two independent angular variables). The commutator of the matrices was scaling like \(1/N\) which was fine, and so on. Can we do something similar for the sphere? Yes, the answer is the fuzzy sphere.

We will describe the \(r=1\) unit sphere not by \(\theta,\phi\), the spherical coordinates, but by three coordinates \(x,y,z\) that just satisfy\[

x^2+y^2+z^2=1.

\] Because we will require the relationship above to hold automatically, one of the coordinates will actually fail to be quite independent from the other two (except for its adjustable sign). By using \(x,y,z\), we will avoid the problems with the singular poles, \(\theta=0\) and \(\theta=\pi\), where new checks and balances would have to be imposed.

Now, \(x,y,z\) play exactly the same role as \(\exp(i\sigma_1)\) and \(\exp(i\sigma_2)\) in the fuzzy torus case. In the fuzzy torus case, we identified them with two matrices. So the analogous operation for the fuzzy sphere must find \(N\times N\) matrices representing the coordinates \(x,y,z\). Moreover, these matrices have to satisfy \(x^2+y^2+z^2=1\) automatically. Can we find them?

Yes, we can. Just make the following identification:\[

(x,y,z) = \frac{1}{\sqrt{J(J+1)}} (J_x,J_y,J_z).

\] Here, \(J=(N-1)/2\) so that \(N=2J+1\) gives us the right size of the matrices and \(J_x,J_y,J_z\) are simply matrices of the angular momentum in an \(N\)-dimensional irreducible representation, i.e. in a representation with a large value of the spin \(J\). Note that \(x^2+y^2+z^2=1\) is identically satisfied if \(x,y,z\) represent the associated matrices simply because \[

J_x^2+J_y^2+J_z^2 = J(J+1)\cdot 1_{N\times N}

\] is the well-known "Casimir" that is proportional to the unit matrix. Note that in this case, we associated Hermitian, and not unitary, matrices to \(x,y,z\) because unlike \(\exp(i\sigma_{1,2})\), the variables \(x,y,z\) are real.

Now, you may verify that for a large \(N=2J+1\), the commutators such as \([x,y]\) go like \(1/N\) times \(iz\). The reason is that \(\sqrt{J(J+1)}\sim J\) for a large \(J\) or \(N\): we pick one factor of \(1/J\) from each of \(x\) and \(y\), getting \(1/J^2\) times \([J_x,J_y]=iJ_z\) in total, and because \(z\) contains one factor of \(1/J\) by itself, the result equals \(1/J\) times \(iz\).
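Both properties of the fuzzy sphere are easy to verify with explicit spin matrices. The sketch below, in plain Python, builds \(J_x,J_y,J_z\) in the standard basis \(|J,m\rangle\) with \(m=J,J-1,\dots,-J\), using the textbook matrix elements of the raising operator (a convention assumed here, not taken from the text; the function names are hypothetical). It then checks \(x^2+y^2+z^2=1\) and \([x,y]=iz/\sqrt{J(J+1)}\), and evaluates the potential summed over the three pairs, which for this configuration equals \(N/(J(J+1))=4N/(N^2-1)\).

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def fuzzy_sphere(N):
    """Return c = sqrt(J(J+1)) and the matrices x, y, z = (Jx, Jy, Jz) / c."""
    J = (N - 1) / 2
    m = [J - i for i in range(N)]            # basis index i <-> m = J - i
    Jz = [[complex(m[a]) if a == b else 0j for b in range(N)]
          for a in range(N)]
    Jp = [[0j] * N for _ in range(N)]        # raising operator J_+
    for i in range(1, N):
        Jp[i - 1][i] = complex(math.sqrt(J * (J + 1) - m[i] * (m[i] + 1)))
    # J_- is the Hermitian conjugate of J_+.
    Jx = [[(Jp[a][b] + Jp[b][a].conjugate()) / 2 for b in range(N)]
          for a in range(N)]
    Jy = [[(Jp[a][b] - Jp[b][a].conjugate()) / 2j for b in range(N)]
          for a in range(N)]
    c = math.sqrt(J * (J + 1))
    scale = lambda A: [[entry / c for entry in row] for row in A]
    return c, scale(Jx), scale(Jy), scale(Jz)

def check(N):
    """Return (constraint error, commutator error, potential) for N >= 2."""
    c, x, y, z = fuzzy_sphere(N)
    idx = [(a, b) for a in range(N) for b in range(N)]
    xx, yy, zz = matmul(x, x), matmul(y, y), matmul(z, z)
    constraint = max(abs(xx[a][b] + yy[a][b] + zz[a][b] - (a == b))
                     for a, b in idx)
    comm = lambda A, B: [[p - q for p, q in zip(r1, r2)]
                         for r1, r2 in zip(matmul(A, B), matmul(B, A))]
    cxy = comm(x, y)
    commutator = max(abs(cxy[a][b] - 1j * z[a][b] / c) for a, b in idx)
    potential = sum(abs(e) ** 2
                    for C in (cxy, comm(y, z), comm(z, x))
                    for row in C for e in row)
    return constraint, commutator, potential

for N in (2, 4, 8, 16, 32):
    constraint, commutator, potential = check(N)
    print(N, constraint, commutator, N * potential)
```

The last column equals \(4N^2/(N^2-1)\) and approaches 4, so the potential energy of the round fuzzy sphere again drops like \(1/N\).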

Just like we could write \(X^i\) as a simple Fourier-transform-like function of \(U,V\) in the toroidal case, we may write each of the nine \(X^i\) matrices as a simple polynomial of the matrices that represent \(x,y,z\). In this fashion, you may approximate any spherical harmonic or any function on the \(x^2+y^2+z^2=1\) surface, on the two-sphere.

So the matrices \(X^i\) may also naturally and beautifully encode a function on the two-sphere, via a Fourier-like decomposition that is closely associated with the decomposition into spherical harmonics. For a finite \(N\), it is not a perfect approximation: we only allow spherical harmonics up to \(L\sim N\) or so. Also, the multiplication is a bit noncommutative. If you analyze the maths carefully, you will find out that all the refusal of the matrices to commute may be explained by a totally equivalent feature of the multiplication: the multiplication of the matrices representing spherical harmonics simply eliminates all the spherical harmonics in the product that have \(L\gt N\) or so.

Dynamics, degeneration, topology change

I have mentioned that one may rewrite the Matrix theory Hamiltonian as a 2+1-dimensional Hamiltonian for functions such as \(X^i(\sigma_1,\sigma_2)\) in the toroidal case or \(X^i(\theta,\phi)\) in the spherical case, if I rewrite \(x,y,z\) using the polar coordinates again. The local integrand of these Hamiltonians (which dictates how the waves are propagating along the membranes, and related things) will be the same – for any topology, we will derive the same local dynamics on a piece of a membrane – so it's not hard to guess that the Hamiltonian of the matrix model actually allows orientable membranes of any topology.

Only the torus and the sphere can be written down very explicitly; that's because of their isometries, \(U(1)\times U(1)\) and \(SO(3)\), respectively. However, membranes of all topologies are allowed in the matrix model (and with some more abstract maths, we may describe them somewhat explicitly, too). It's not hard to figure out that the matrix model allows the topology to change. After all, the space of matrices is completely connected; it isn't really decomposed into several isolated "sectors with a different topology".

The same is true for single-membrane and multi-membrane states.

Imagine that someone in the 1980s wrote the continuous Hamiltonian – a 2+1-dimensional quantum field theory – for a membrane of a spherical topology. Yes, several people did. And now imagine that they had the creativity to regulate the membranes using the matrices. Well, they even did that. What they found was a problematic model because it didn't prevent the spherical membrane from developing sharp spikes on the surface and it was plausible that these spikes get connected and change the topology of the previously spherical membrane into a torus or a higher-genus topology. Or the sphere could get pinched near the equator and the thin tube connecting the previous hemispheres could have broken, thus producing a pair of spherical membranes.

Those folks in the 1980s were not able to deduce what was really happening in these extreme limits and whether the membrane was allowed to do so; these questions boiled down to some short-distance properties of the 2+1-dimensional model that weren't well-defined because 2+1 dimensions is already too high for a world volume theory to be fully consistent in the ultraviolet.

But if you rewrite the spherical-membrane Hamiltonian in terms of the matrix model, you will be able to find the answer to the question whether the degeneration processes above are allowed etc. They are allowed. The reason is that the correct and correctly evaluated and interpreted matrix regularization of the single-spherical-membrane model actually describes membranes of arbitrary orientable topologies – and an arbitrary number of disconnected membranes, too!

This is just another manifestation of the remarkable feature of the matrix models that gave the title to the first blog entry on Matrix theory. In quantum field theory using the second quantization, the multi-particle and other multi-object states have to be created "out of pieces", i.e. as a product of several independent creation operators acting on the vacuum state. But Matrix theory erases the qualitative difference between single-particle and multi-particle objects – and between single-membrane and multi-membrane states. All of them are represented by certain configurations of matrices (or, quantum mechanically, by wave functions whose support is located near these classical configurations). The multi-object states correspond to classical configurations that may be simultaneously block-diagonalized.
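The statement about block-diagonal configurations has a simple classical counterpart that can be checked in a few lines. In the sketch below (plain Python, with random Hermitian blocks as stand-ins for two objects; the sizes 2 and 3, the use of only three matrices per object and the helper names are arbitrary illustrative choices), the block-diagonal sum of two configurations has \(P^+=N_1+N_2\) and its potential energy is the sum of the two potentials, because commutators of block-diagonal matrices are block-diagonal.

```python
import random

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_hermitian(n, rng):
    """Random n x n Hermitian matrix (illustrative test data only)."""
    H = [[0j] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            H[i][j], H[j][i] = z, z.conjugate()
    return H

def direct_sum(A, B):
    """Block-diagonal matrix with A in the upper-left and B in the lower-right."""
    n1, n2 = len(A), len(B)
    top = [row + [0j] * n2 for row in A]
    bottom = [[0j] * n1 + row for row in B]
    return top + bottom

def potential(Xs):
    """Sum over pairs i < j of -Tr([X^i, X^j]^2), as a sum of |entries|^2."""
    total = 0.0
    for i in range(len(Xs)):
        for j in range(i + 1, len(Xs)):
            AB, BA = matmul(Xs[i], Xs[j]), matmul(Xs[j], Xs[i])
            total += sum(abs(p - q) ** 2
                         for r1, r2 in zip(AB, BA) for p, q in zip(r1, r2))
    return total

rng = random.Random(2)
object_1 = [random_hermitian(2, rng) for _ in range(3)]   # carries P+ = 2
object_2 = [random_hermitian(3, rng) for _ in range(3)]   # carries P+ = 3
combined = [direct_sum(A, B) for A, B in zip(object_1, object_2)]

total_N = len(combined[0])
separate = potential(object_1) + potential(object_2)
together = potential(combined)
print(total_N, separate, together)
```

The two potentials agree up to rounding. Interactions between the two objects only appear once the off-diagonal blocks are switched on, which is exactly the freedom the wave functions in the quantum theory exploit.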

(Matrix string theory, a matrix model that also describes strings in string theory, also erases the qualitative difference between one-string and multi-string states, a feature I had believed to be fundamentally necessary in a complete formulation of string theory at least since 1992 when I read the first Green-Schwarz light-cone-gauge superstring articles from the early 1980s. Matrix theory just came to me as a marvelous framework in which my "dreams of the teen age" could have been transformed into a working reality which was a reason why I could explain how strings were included in Matrix theory much more quickly than anyone else, within a week of thinking about Matrix theory. The feeling when you discover something relatively important – the first nonperturbatively complete definition of type II strings' dynamics and why it agrees with the previous approximate descriptions – before anyone else in the world and when you're sure that it actually works is special. I will postpone the discussion of matrix string theory and some thoughts I had when I discovered it to a future blog entry.)

However, the realistic wave functions in the quantum mechanics are nonzero for configurations of diagonalizable as well as non-diagonalizable matrices. So the interactions between the membranes, gravitons, and other objects – including the splitting of membranes into pieces and changing the membrane topology – are always predicted to occur with a nonzero probability. You can't eliminate them from the picture. In perturbative string theory, the string interactions are uniquely determined once you specify the single-string dynamics; the local character of the world sheet remains the same and you just allow more complicated topologies.

Now we see that something analogous holds for membranes, too. String/M-theory, in this case its matrix model definition, uniquely dictates how the membranes should split, merge, degenerate, and simplify once you write a consistent regularization of the theory for a single membrane of any topology. In as consistent and robust theories as string/M-theory – and string/M-theory is probably the only element in this set – learning about a small piece of the structure (e.g. the free limit describing a single object) uniquely determines everything else (including interactions). Once you find a small (but not too small) piece of a mosaic, the whole mosaic may be "holographically reconstructed" out of it.

Noncommutative geometry and noncommutative field theories: promotion of a future text

The fuzzy torus and the fuzzy sphere above are the most canonical and simplest examples of noncommutative geometry. Studied in very abstract mathematics by folks such as Alain Connes, the noncommutative geometry was found to be a natural exact description of some pretty elementary configurations in string theory. There exist whole quantum field theories in which the supporting spacetime is noncommutative; they may be obtained by replacing the ordinary commutative point-wise product of classical fields by the star-product.

The Feynman diagrams for these theories have extra phases associated with vertices; these theories relate \(d\)-dimensional theories (and branes) with \((d-2)\)-dimensional and \((d-2k)\)-dimensional theories (and branes) in natural ways; and – which is related – they have very natural and interesting solutions (solitons) that use some mathematical tricks we know from phase spaces in quantum mechanics (although their interpretation is different). This text has gotten pretty long so I will postpone the discussion of noncommutative geometry to a future blog entry.
