Deriving the Lorentz transformations from a rotation of frames of reference about their origin with real time Wick-rotated to imaginary time

Well-known for their central role in Einstein’s Special Relativity, the Lorentz transformations are derived from the rotation of two frames of reference in standard configuration while time is taken to be an imaginary unit of spacetime. This is rarely seen in the wild. Not many undergraduate textbooks or online texts show the details of the working. Hence, this article.


Introduction

One might think this means that imaginary numbers are just a mathematical game having nothing to do with the real world. (…) It turns out that a mathematical model involving imaginary time predicts not only effects we have already observed but also effects we have not been able to measure yet nevertheless believe in for other reasons. So what is real and what is imaginary? Is the distinction just in our minds?
S. Hawking[1]

Many derivations of the Lorentz transformations can be found in textbooks, in syllabi, and online. To me, one of the most elegant remains the version Henri Poincaré once alluded to[2], which Hermann Minkowski then toyed with a bit further – to put it unreasonably mildly – in what we now call Minkowski space. Yet it is rarely expounded in the aforementioned places.

Henri Poincaré noted that, when the time axis of the two coordinate systems has been made imaginary, i.e. the imaginary axis in the complex plane, the transformations set forth by Hendrik Lorentz pop out automatically after a rotation of two reference frames in that complex plane.

In this document, we show how this is done. We assume the reader is familiar with complex numbers.

The aim is to derive the following set of Lorentz transformations:

\begin{align}
t' &= \frac{t-vx/c^2}{\sqrt{1-v^2/c^2}}, \label{eq:Lorentz t-prime} \\
x' &= \frac{x-vt}{\sqrt{1-v^2/c^2}}, \label{eq:Lorentz x-prime} \\
y' &= y, \\
z' &= z,
\end{align}

where $(t,x,y,z)$ and $(t',x',y',z')$ are the coordinates of an event in two frames. The primed ($'$) frame is, seen from the unprimed frame, moving with speed $v$ in the $x$-direction. The speed of light in a vacuum is denoted by $c$. As a side note, the recurring term $(\sqrt{1-v^2/c^2})^{-1}$ is called the Lorentz factor and it is usually denoted by the letter $\gamma$.
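To make the target concrete, here is a minimal numerical sketch of the transformations for $t$ and $x$ above. The function names `lorentz_factor` and `lorentz_transform` are our own illustrative choices, and the example works in units where $c=1$.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v, c=C):
    """Return gamma = 1 / sqrt(1 - v^2/c^2), valid for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz_transform(t, x, v, c=C):
    """Map the coordinates (t, x) of an event to (t', x') in the primed frame."""
    gamma = lorentz_factor(v, c)
    t_prime = gamma * (t - v * x / c**2)
    x_prime = gamma * (x - v * t)
    return t_prime, x_prime


# Example in units where c = 1, with the primed frame moving at v = 0.6.
# The event (t, x) = (1, 0.6) lies on the path of the primed origin,
# so its x' coordinate should vanish.
print(lorentz_factor(0.6, c=1.0))
print(lorentz_transform(1.0, 0.6, 0.6, c=1.0))
```

The $y$ and $z$ coordinates are left out because they are unchanged by the transformation.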

Standard configuration

Suppose Hermann is standing still on the ground. Albert is driving his car and moves away from Hermann at speed $v$. We then have two frames of reference. There is Hermann’s frame $\mathcal{M}$ (the ground), with its origin $O$ at Hermann’s feet on the ground. And there is Albert’s frame $\mathcal{E}$ (the car), with its origin $O'$ at Albert’s bottom on his chair. Their frames of reference are said to be in standard configuration as depicted by Figure 1. This means that at time $t=0$ in frame $\mathcal{M}$ where $x=0$ as well, the time $t'=0$ and position $x'=0$ in frame $\mathcal{E}$, too, and that one frame is in uniform (constant) motion relative to the other. In other words, $\mathcal{M}$ and $\mathcal{E}$ are said to be synchronised when the spacetime coordinates

\[ (t,x) = (t',x') = (0,0). \]

Of course, in the real world, there are four spacetime coordinates for each frame, i.e. $(t,x,y,z)$ and $(t',x',y',z')$, but to make our calculations a little bit easier, we consider the temporal coordinate $t$ (and $t'$) and spatial coordinate $x$ (and $x'$) only.

So, looking at Figure 1, we can see from Hermann’s point of view – standing in the origin $O$ of $\mathcal{M}$ – that $\mathcal{E}$’s origin $O'$ moves at speed $v$ along the $x$-axis of $\mathcal{M}$. Speed $v$, of course, just means that $\mathcal{E}$ covers a certain number of units of $x$ (say, metres) per a certain number of units of $t$ (say, seconds). This is nothing new, but for our derivation of the Lorentz transformations it is important to revisit our secondary education for a moment:

\[ v = \frac{\Delta x}{\Delta t}, \]

in Hermann’s frame of reference $\mathcal{M}$. More or less conversely, if we want to calculate how many spatial units frame $\mathcal{E}$’s origin has moved from frame $\mathcal{M}$’s origin, we rewrite the last equation into the perhaps more familiar law of uniform motion:

\begin{equation}
\Delta x = v \Delta t,
\label{eq:x=vt}
\end{equation}

in Hermann’s frame of reference $\mathcal{M}$.

As a side note, do realise that to Albert, his car is not moving at all; he is sitting in it. (Rather, it is the rest of the world that is moving with respect to his car and himself.) If the car were moving with respect to Albert, an accident with potentially serious consequences would be impending. So, for Albert’s sake, his speed within his own frame $\mathcal{E}$ (the car) is expressed as $v'=0$, as long as he stays put and buckled up in his chair. And so, the law of uniform motion of Albert, within his frame $\mathcal{E}$, becomes:

\[ \Delta x' = v' \Delta t' = 0 \Delta t' = 0. \]

Invariances

If $\mathcal{E}$ is in constant motion with respect to $\mathcal{M}$, in one direction, the $x$-direction, as expressed in Equation \eqref{eq:x=vt}, then, mathematically, we call this a geometric translation in the $x$-direction. In physics, it is called a translational motion in the $x$-direction.

Figure 2: Albert fires a photon. *At time t=t’=0, Albert fires a photon P into direction x. Both the photon and Einstein’s frame E move into that same x-direction.*

Suppose, at time $t=t'=0$, Albert activates his special on-board photoelectric cannon, firing exactly one photon $P$ in the $x$-direction. Figure 2 shows how the photon is travelling through the spaces of both frames of reference.

Looking at the diagram, we see that the spatial coordinates in the $y$-direction remain unchanged, $y=y'=0$, so we leave this out of our equations further on, to keep it simple. However, since $\mathcal{E}$ is moving with respect to $\mathcal{M}$ in the $x$-direction, we do know that $x\neq x'$ for $t>0$. And since we do not know for certain that $t=t'$ for $t>0$, only that $t=t'=0$, we will have to conclude that the position of $P$ differs:

\begin{equation}
\begin{aligned}
\text{in Albert’s }\mathcal{E}\text{: }P &= (t',x'), \\
\text{in Hermann’s }\mathcal{M}\text{: }P &= (t,x).
\end{aligned} \label{eq:coordinates of P}
\end{equation}

Fortunately, accepting Einstein’s Voraussetzungen[3], we know that the speed of light, $c$, is the same for every frame of reference. Using Equation \eqref{eq:x=vt}, $x=vt$, and the fact that $v=c$, in this case, we can write for the distance travelled of photon $P$ – the yellow line in the diagram – in the coordinates of the respective frames of reference:

\begin{align}
\text{in Albert’s }\mathcal{E}\text{: }\Delta x' &= c\Delta t', \\
\text{in Hermann’s }\mathcal{M}\text{: }\Delta x &= c\Delta t.
\end{align}

As it is possible for any coordinate system to have points which lie on the negative side of the origin of a spatial dimension such as $x$ in our case, and thus for light to travel in the negative $x$-direction, we simply square both equations to always obtain a positive value.

\begin{align}
(\Delta x')^2 &= (c\Delta t')^2, \\
(\Delta x)^2 &= (c\Delta t)^2.
\end{align}

If we then rearrange this,

\begin{align}
(\Delta x')^2 - (c\Delta t')^2 &= 0, \\
(\Delta x)^2 - (c\Delta t)^2 &= 0,
\end{align}

we see that both are equal to zero, allowing us to write

\begin{equation}
(\Delta x')^2 - (c\Delta t')^2 = (\Delta x)^2 - (c\Delta t)^2.
\label{eq:interval}
\end{equation}

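This is a beautiful result, because it tells us that, whichever frame of reference you happen to be in, Albert and Hermann agree not only on $c$ but also on the quantity $(\Delta x)^2 - (c\Delta t)^2$, despite the fact that the coordinates of $P$ are not necessarily the same in every frame of reference, as we saw in \eqref{eq:coordinates of P}. Strictly speaking, we have only shown this for the photon, for which the quantity is zero; that it also holds for any other pair of events is a standard result of Special Relativity that we take for granted here. In other words, both $c$ and $(\Delta x)^2 - (c\Delta t)^2$ are said to be invariant.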
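As a quick numerical illustration of this invariance, the sketch below takes the Lorentz transformations stated in the Introduction for granted (they are only derived further on) and compares the quantity in both frames. The helper names `boost` and `interval` and the sample events are illustrative assumptions, and we work in units where $c=1$.

```python
import math


def boost(ct, x, beta):
    """Lorentz-transform (ct, x) to a frame moving at v = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)


def interval(ct, x):
    """Return the quantity (x)^2 - (ct)^2 measured from the origin."""
    return x**2 - ct**2


# Hypothetical sample events (ct, x): one on a light ray, two that are not.
events = {
    "photon": (2.0, 2.0),
    "slow particle": (5.0, 1.0),
    "distant event": (1.0, 4.0),
}

for name, (ct, x) in events.items():
    ct_p, x_p = boost(ct, x, beta=0.6)
    # Both printed values should agree up to floating-point rounding.
    print(name, interval(ct, x), interval(ct_p, x_p))
```

For the photon both values are zero; for the other two events they are non-zero but still agree between the frames.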

You might wonder, what is this invariant quantity $(\Delta x)^2 - (c\Delta t)^2$, exactly? Well, this will be discussed in another post called What is a spacetime interval? And now, we might have just told you what it is. Anyway, let us move on to deriving the Lorentz transformations, and just keep in mind that $(\Delta x)^2 - (c\Delta t)^2$ is a wonderfully invariant quantity, equal in both frames of reference. Let us move on to imaginary time.

Wick rotation and imaginary time

Number sets

Figure 3: real number line. *A segment of the real number line, the set R of all real numbers, which goes on to infinity on either side.*

As many of us should know, Figure 3 depicts (a segment of) the real number line, that is the set $\mathbb{R}$ of all real numbers. It formed the culmination of all the previous extensions of the then-known set of numbers. Starting with the natural numbers, a set usually denoted by $\mathbb{N}$, containing all positive integers, arisen from the natural act of counting, the numeric repertoire was then extended by the notion of negative integers. Instead of just 1,2,3, we could now also count to -1,-2,-3 etc. This extension is denoted by $\mathbb{Z}$. Needless to say that $\mathbb{N}\subset\mathbb{Z}$, but we just did anyway.

Of course, some people were clever, acknowledging the need for another extension: numbers which represented ratios, better known as rational numbers, such as 1/2, 1/-3, 1/4, -1/100, in other words, quotients of two integers. These numbers would sit in-between the integers in $\mathbb{Z}$. The symbol is $\mathbb{Q}$, and it is superfluous to add that $\mathbb{N}\subset\mathbb{Z}\subset\mathbb{Q}$.

Although a tablet found in Susa (Iran) in 1936, dated to the Babylonians of around 2000 BCE, already specified that

\[ \frac{3}{\pi}=\frac{57}{60}+\frac{36}{(60)^2}, \therefore \pi = \frac{25}{8}=3.125, \]

it wasn’t until 1761 that Johann Heinrich Lambert[4] found a proof that $\pi$ is irrational, meaning that it cannot be written as a ratio of integers. The same holds for many other numbers, such as $\sqrt{2}$. And so, yet again, an extension of the existing number line was needed. This was the aforementioned line representing the set $\mathbb{R}$, or, to be precise, $\mathbb{N}\subset\mathbb{Z}\subset\mathbb{Q}\subset\mathbb{R}$.
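The Babylonian arithmetic above is easy to check with exact fractions. The following is a small sketch using Python's standard `fractions` module; the variable names are our own.

```python
from fractions import Fraction

# The sexagesimal value on the tablet: 3/pi = 57/60 + 36/60^2
three_over_pi = Fraction(57, 60) + Fraction(36, 60**2)

# Solve for pi: pi = 3 / (3/pi)
pi_babylonian = 3 / three_over_pi

print(three_over_pi)         # 24/25
print(pi_babylonian)         # 25/8
print(float(pi_babylonian))  # 3.125
```

Because the result is a ratio of two integers, it can only ever be an approximation of $\pi$.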

And then, in the 16th century, people such as rivals Tartaglia and Cardano independently recognised that solutions to cubic equations sometimes required the manipulation of square roots of negative numbers, such as $\sqrt{-1}$. Later, Bombelli developed proper operations such as addition and subtraction. A whole slew of subsequent mathematicians then developed, over the following centuries, what is now known as the complex plane or Gaussian plane[5], representing the set $\mathbb{C}$, extending the real number line with an imaginary axis with multiples of the imaginary unit $i=\sqrt{-1}$. (It is, obviously, a solution to the quadratic $x^2+1=0$.) We realise mentioning that $\mathbb{N}\subset\mathbb{Z}\subset\mathbb{Q}\subset\mathbb{R}\subset\mathbb{C}$ is utterly redundant at this point.

Translation and rotation

Figure 4: Number sets. Every consecutive number set is an extension of the previous one. We can move from a simpler set to a more complex one, for instance, by simply multiplying our current position by a number only present in the more complex set. Although it seems like we are tumbling from one set to another, we are really just ‘sliding left or right’, one-dimensionally, on the number line of the more complex set. This sliding is called a translation. Note: the number of ticks in Q is much larger, but for obvious reasons of legibility, we only ticked every 1/2-ratio.

Let us have another look at the natural number line of $\mathbb{N}$. If we wanted to convert the number 1 to a number that could not exist in $\mathbb{N}$ but could exist on the integer number line of $\mathbb{Z}$, we could simply multiply the natural number 1 by a number from $\mathbb{Z}$, the negative integer $-1$. Since $1\times(-1)=-1$, we have transitioned from $\mathbb{N}$ to $\mathbb{Z}$. We ‘slid’ from 1 to $-1$, albeit in a different number set, which, mathematically, is the same as a translation by $-2$. This is easily expressed as starting from position 1, adding $-2$, and ending up at position $-1$ on the number line of, at least, $\mathbb{Z}$ (but not $\mathbb{N}$): $1+(-2)=-1$. Figure 4a aims to depict this.

We can do the same with moving from position $-1$ in $\mathbb{Z}$ to a number not in $\mathbb{N}$, nor in $\mathbb{Z}$, but at least in $\mathbb{Q}$ by simply multiplying by a fraction, such as $-1/2$, which is also a number not in $\mathbb{N}$, nor in $\mathbb{Z}$. This is, again, actually a translation, though now by adding $3/2$: $-1+3/2=1/2$. Figure 4b aims to depict this.

Similarly, transforming from position 1 in $\mathbb{Q}$ to $\mathbb{R}$, we multiply by, for instance, $\sqrt{2}$, which exists in $\mathbb{R}$ but not in $\mathbb{Q}$, and so, the result, $1\times\sqrt{2}=\sqrt{2}$ is in at least $\mathbb{R}$ but not in $\mathbb{Q}$, nor in $\mathbb{Z}$, nor in $\mathbb{N}$. The result is also a translation of $1+(\sqrt{2}-1)=\sqrt{2}$ in $\mathbb{R}$. Figure 4c aims to depict this.

Figure 5: Wick rotation of 1 and real time. Compared to the increasing complexity of the number lines in Figure 4, this one is the most complex so far. Two Wick rotations into the complex (number) plane C, where (a) the Re-axis is the real number line of the set R and the Im-axis is the imaginary unit line of the set C. Note that a complex number consists of both: a real part and an imaginary part. For instance, the complex number z is written in the form z = a + bi, with i = √-1. The real part of z is Re(z) = a, and the imaginary part Im(z) = b. And so, z = 1 + i, z = 1/2 + 3i, z = 3, are all complex numbers, where the latter has an imaginary part of Im(z) = 0, which we simply leave out as 0i = 0. A complex number is thus two-dimensional, embedded in a plane with a real axis and an imaginary axis. (b) Wick-rotating all real numbers on the time axis to the imaginary axis, transforms real time into imaginary time.

Note that, so far, the transformation of 1 or another number has involved a simple ‘sliding’ motion on the number lines, that is, one-dimensionally. Every time a new kind of number was introduced – the negative integers, ratios, and, lastly, the real numbers – a new set of numbers was created, and the number line evolved from discrete ($\mathbb{N}$) to a line continuum $\mathbb{R}$. The question is, what would be the next extension and what would it look like?

As stated earlier, in the 16th century it became clear that a new type of number was necessary to solve a slew of cubic equations. Owing to people such as Wallis, Wessel, Argand, Buée, Mourey, Warren, Français, Bellavitis, Gauss, and Euler[5][6], the idea to extend the real number line of $\mathbb{R}$ with an imaginary, perpendicular number line came to fruition. This created the so-called complex (geometric) plane, sometimes called the $z$-plane, Gauss plane or Argand plane. It is important to note that transforming a real number in $\mathbb{R}$ to a complex number in $\mathbb{C}$ involves not a translation but a rotation. Multiplying a real number in $\mathbb{R}$, say 1, by a number that only exists in $\mathbb{C}$, say $i$, is the same as geometrically rotating our position 1 on the real axis by $\pi/2$ onto a position $i$ on the imaginary axis as is depicted in Figure 5a.

What if we did this with the entire real time axis in a space-time diagram as shown in Figure 5b? Every element of real time $t$ is multiplied by $i$. In other words, every part is rotated in the complex plane to become an entire imaginary axis of time $it$. This procedure is called a Wick rotation, named after theoretical physicist Gian Carlo Wick, who described such a procedure to solve problems in quantum and statistical mechanics[7].
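The claim that multiplying by $i$ rotates by $\pi/2$ can be checked with Python's built-in complex numbers. This is only a sketch: the sample values in `times` are arbitrary, and Python writes the imaginary unit as `1j`.

```python
import cmath
import math

# Multiplying the real number 1 by i moves it onto the imaginary axis.
z = 1 + 0j
rotated = z * 1j
print(rotated)               # 1j
print(cmath.phase(rotated))  # the angle with the real axis, equal to pi/2
print(math.pi / 2)

# Wick-rotating a few (arbitrary) real time values t into imaginary time it.
times = [0.0, 1.0, 2.5]
imaginary_times = [1j * t for t in times]
print(imaginary_times)
```

Every rotated value has a real part of zero, which is exactly what it means for the whole time axis to end up on the imaginary axis.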

This seems promising and is what Henri Poincaré alluded to fifty years earlier. Before we continue, we have to do one other little thing. It has something to do with units of measurement.

Minkowski diagrams

Figure 6: The distance-time-diagram we all grew up with. The independent variable time t as the x-axis, and the dependent variable distance, x, as the y-axis. Four particles are travelling through time and (one-dimensional) space, each with its own distance function of time, that is, each with its own speed. Note that P travels the greatest distance over the same period, in other words, it is the fastest. Note that S travels exactly zero distance in that same amount of time.

We all grew up learning to read and use a type of diagram as depicted in Figure 6 during our physics classes. Time is put on the $x$-axis and distance $x$ is put on the $y$-axis. This is somewhat confusing at first, as one might have gotten accustomed to using values of $x$ on the $x$-axis during maths lessons. Of course, one learns that it is less about the names of variables and axes, rather, it is a matter of which is the independent and which is the dependent variable. The independent one, in this case, time $t$ (time flies, whether we want it to or not), is then laid out over the axis called $x$ (which has not much to do with the variable named $x$), and the dependent one, a variable which happened to be named $x$, is projected onto the $y$-axis.

In this diagram, we see four particles. The fastest, $P$, is moving away in the $x$-direction (which is up, but not necessarily up into the sky, do realise that!) covering more units of $x$ than the other three after the same time $t_1$ has passed. This is why it has a steeper slope. The slowest one is the one that is not moving at all, the stationary particle $S$. It is moving in time, which is why it exists at time $t_1$, but, spatially, it does not exist at a certain amount of units of $x$ away from the origin. In fact, it exists in exactly the same place, the origin.

Figure 7: t–x-diagram. A physicist’s diagram, where distance in space, x, is projected on the x-axis and distance in time t is projected on the y-axis. Note that the faster a particle travels, the smaller the angle of its ‘line’ through space and time with the x-axis. If the particle is stationary, it only ‘travels’ through time and the angle with the x-axis is maximised at π/2. In other words, it just goes straight up.

Well, get yourself out of the habit: turns out that professional physicists like to flip the axes when it comes to time. In other words, they project the distance variable $x$ onto the $x$-axis, while the time variable $t$ almost invariably gets to be projected onto the $y$-axis. Yes, you heard it correctly. Time goes up in a physicist’s diagram. The esteemed professor Leonard Susskind, a theoretical physicist at Stanford University, even postulated, in part jokingly, during a lecture on the principle of least action that physicists are the only type of people who do this (see, for instance, https://youtu.be/3apIZCpmdls?t=1447). And so, we flipped our diagram as you can see in Figure 7.

Speaking of units, usually, time is measured in seconds and distance in metres. Usually. Though, remember when you went to visit those new friends of your parents, and one of the first things your parents assured their hosts was that their hometown was actually not that distant, just ‘a two-hour drive’? Distance, while usually measured in kilometres between two places, is now expressed in units of time. Assuming that people legally drive – from door to door – at an average speed of $100\text{ km/h}$, the distance will be around 200 km.

Why do people like to express distance in terms of units of time sometimes? Well, in some cases, people aren’t interested in the exact amount of kilometres, but rather tend to focus on how much of our valuable time a certain activity consumes, hence, an answer in units of time makes sense.

Astrophysicists do another interesting distance-as-time conversion when it comes to distances between galaxies, for instance. They work with visible light reaching their telescopes, and other types of radiation. Moreover, the distances they work with are ridiculously large, especially when expressed in kilometres. So, they work with light-years, which sounds like a unit of time, but denotes a certain distance. One light-year is the distance light travels in one Julian year, which is $365.25$ days. Light travels at a speed of $c=299792458\text{ m s}^{-1}$ in the vacuum. To calculate the number of seconds in a Julian year, we multiply the number of seconds in one minute times the number of minutes in one hour times the number of hours in one day times the number of days in one Julian year:

\begin{align}
60\text{ s/min} &\times 60\text{ min/h} \times 24\text{ h/day} \times 365.25\text{ days} \\
&= 31557600\text{ s}.
\end{align}

Using Equation \eqref{eq:x=vt} to calculate distance $x$ light travels in one Julian year, we get

\begin{align}
x &= vt \text{, and because }v=c\text{, we write:} \\
x &= ct,\label{eq:x=ct} \\
&= 299792458\text{ m s}^{-1} \times 31557600\text{ s} \\
&= 9460730472580800\text{ m} \\
&= 9460730472580.8\text{ km}.
\end{align}

Since light travels this ridiculously large number of kilometres, it makes perfect sense for astrophysicists to use this fact to express the distance of stars and galaxies. This way, the nearest major galaxy, Andromeda, is only about $2.5$ million light-years away. This is obviously more practical than $23651826181452000000\text{ km}$.
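The light-year arithmetic can be reproduced exactly with integers. In this sketch the constant names and the helper `light_years_to_km` are our own; the Julian year of $365.25$ days is written as $36525/100$ to keep everything in whole numbers.

```python
C = 299_792_458  # speed of light in m/s (exact by definition)

# Seconds in a Julian year of 365.25 days, kept as an exact integer.
SECONDS_PER_JULIAN_YEAR = 60 * 60 * 24 * 36525 // 100

# Distance light travels in one Julian year, x = c * t, in metres.
LIGHT_YEAR_M = C * SECONDS_PER_JULIAN_YEAR


def light_years_to_km(light_years):
    """Convert a distance in light-years to kilometres."""
    return light_years * LIGHT_YEAR_M / 1000


print(SECONDS_PER_JULIAN_YEAR)  # 31557600
print(LIGHT_YEAR_M)             # 9460730472580800
print(light_years_to_km(1))     # one light-year in kilometres
```

Any distance quoted in light-years can be passed to `light_years_to_km` to see how unwieldy the number becomes in kilometres.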

So, about describing distance in terms of units of time, we learnt that

in the case of a ‘normal scale’ distance such as between two towns, expressing a spatial distance in units of time makes it easier to compare with the amount of time one wishes to spend on travelling – it becomes like comparing time with time;
in the case of larger scale distances such as between two galaxies, expressing a spatial distance in light-units of time makes it easier to handle the impractically large numbers of the original units.

Let us go back to our diagram in Figure 7 again. The units of both axes are not the same. The $x$-axis is the distance, which is expressed in spatial units, such as metres. The $y$-axis is the time, expressed in temporal units, such as seconds. It is hard to compare the two: the units are not the same. Also, as particle physicists are usually dealing with extremely fast particles, near the speed of light, it is impractical to be using the standard units of time. So, physicists have devised a solution to both problems. Number one: what if we expressed time in units of distance? So, that is the other way around: not distance in units of time, but time in units of distance.

To do that, we simply use the formula as expressed in Equation \eqref{eq:x=ct}: $x=ct$. In other words, if we multiply time $t$ with the speed of light $c$, we get a distance. A little analysis of units checks out. If we multiply the units of the speed of light with the unit of time, we get a unit of distance:

\begin{equation}
\text{m s}^{-1} \times \text{s} = \text{m s}^{-1}\text{s} = \text{m}\frac{\text{s}}{\text{s}} = \text{m}.
\end{equation}

Figure 8: ct. (a) The time axis is multiplied by c, so it is easier to compare time with space, i.e. time is expressed in units of distance. (b) We set c = 1 so that the so-called world line of any particle travelling at exactly the speed of light is always at an angle of π/4 with the x-axis. Or 45°, if you are into that sort of thing.

This does not mean we magically, qualitatively, or even hypothetically transformed the time dimension into a space dimension, even though this would be a perfect device for a cool work of science-fiction, but it does mean that we now express time in units of distance. And so, we label the $y$-axis with $ct$ as is shown in Figure 8.

Now, to tackle the second problem, where physicists work with particles whose motions approach the speed of light at distance scales smaller than an electron in the vicinity of black holes with forces greater than you would ever encounter, it is impractical to work with the ordinary distance and time units. Furthermore, they prefer to choose the units of $ct$ and $x$ such that the ‘line’ of a photon, i.e. light, travelling through space and time, is always depicted at an angle of $\pi/4$ or $45^\circ$ with the $x$-axis. To do so, they set the speed of light to 1. So, $c=1$. What you get is a diagram as shown in Figure 8(b). Particle $P$ is a photon, thus travelling at the speed of light. So, its ‘line’ is at the exact angle of $\pi/4$ with both the $x$- and $y$-axis, i.e. the $x$-axis and the $ct$-axis, respectively. All the other particles thus travel at a certain fraction of $c$, i.e. a certain fraction of 1.

All this should tell you enough to figure out how fast a particle would be going if its ‘line’ would be drawn underneath that of $P$, i.e. at an angle smaller than $\pi/4$. And even though you should also be able to figure out if this is at all possible, we will tell you now that this is not possible.

By the way, the term ‘line’, which we use to describe the path a particle takes through space and time in our diagrams, is called a ‘world line’ as Hermann Minkowski would have wanted us to. And the diagrams of Figures 7 and 8 are called Minkowski diagrams. They are also called spacetime diagrams, although there is a subtle difference: Minkowski diagrams are the subset of two-dimensional diagrams within the larger set of spacetime diagrams, which contains the 3D versions, and 4D, even.

Wick rotation revisited

Figure 9: from it to ict. (a) The Wick rotation of the real time axis ct to the imaginary time axis ict. We projected the coordinate system of Figure 8 onto ‘the floor’ to have the ct-axis then rotated to the imaginary ict-axis by multiplication by the imaginary unit i. (b) Consequently, the world line of P gets rotated onto the imaginary plane as well.

We are almost ready to derive the Lorentz transformations. The only thing we have to do, is Wick rotate the (real) time axis into the imaginary time axis, i.e. we rotate the $ct$-axis of Figure 8. So, we do as we did in the previous section Number sets: we multiply by the imaginary unit $i$ from the number set $\mathbb{C}$, thereby rotating the time axis of $\mathbb{R}$ into the complex plane $\mathbb{C}$ to become an imaginary axis of time.

Figure 9 offers a geometric representation of the whole operation. We projected our original coordinate system of Figure 8 onto ‘the floor’, so to speak. We left out particles $Q$, $R$, and $S$ to keep it legible. Wick-rotating the real time axis $ct$ by multiplying by the imaginary unit $i$ then yields the imaginary time axis $ict$. Automatically, the world line of $P$ rotates along into the complex plane. Note that the spacetime coordinates of $P$ have changed a few times in this section. They were $(x_P,t_1)$, then they became $(x_P,ct_1)$, and have ended up as $(x_P,ict_1)$. Just the way we like it.

Deriving the Lorentz transformations

Invariant world line in the complex plane

Figure 10: Wick-rotated M and E. Hermann’s M and Albert’s E frames of reference rotated at an angle θ relative to each other in the complex plane about their origin. We put a little square with sides marked I and II to aid us in our trigonometric calculations.

To end up with the Lorentz transformations as formulated in Equations \eqref{eq:Lorentz t-prime} and \eqref{eq:Lorentz x-prime} by rotating two frames of reference relative to each other in the complex plane – with an imaginary time axis – we refer to Figure 10.

In both frames, those of Hermann and Albert, a photon $P$ travels at speed $c$. By the second postulate of Einstein’s Special Relativity[3], we know that, somehow, the value for $c$, which is chosen to be 1 in our case, is the same in both frames of reference, even though one moves relative to the other, meaning the coordinates of the same event differ between the frames. In Figure 2, this relative motion is represented by $v$. In Figure 10, it is represented by an angle $\theta$.

We see that the coordinates of $P$ in $\mathcal{M}$ are $(\Delta x, ic\Delta t)$. In $\mathcal{E}$, they are $(\Delta x', ic\Delta t')$. They are related to each other by some function of the angle $\theta$. Before we find that relation, we repeat our finding regarding Equation \eqref{eq:interval} in the section Invariances: the quantity $(\Delta x)^2 - (c\Delta t)^2$ is invariant. In our case, it is the interval $OP$ that is invariant, despite the fact that $P$ has different coordinates. In other words, geometrically, both $\mathcal{M}$ and $\mathcal{E}$ agree on the length of the yellow world line as you can see in Figure 10. We should proceed to show this.

Let us first write down the expressions for the invariant yellow world line $OP$ in both frames of reference:

\begin{align}
\text{Hermann, standing in }\mathcal{M}\text{, says: }(OP)^2 &= (\Delta x)^2 + (ic\Delta t)^2, \\
\text{Albert, standing in }\mathcal{E}\text{, says: }(OP)^2 &= (\Delta x')^2 + (ic\Delta t')^2.
\end{align}

And, since both expressions calculate the same invariant quantity, obviously, we can write:

\begin{equation}
(\Delta x)^2 + (ic\Delta t)^2 = (\Delta x')^2 + (ic\Delta t')^2,
\end{equation}

which simplifies to

\begin{equation}
(\Delta x)^2 - c^2(\Delta t)^2 = (\Delta x')^2 - c^2(\Delta t')^2.\label{eq:interval in the complex plane}
\end{equation}

This is the result we wanted. Whether it is with imaginary time or with real time, the quantity $(\Delta x)^2 - c^2(\Delta t)^2$ remains invariant. (Recall that $i^2=(\sqrt{-1})^2=-1$.) Even in the complex plane, $\mathcal{M}$ and $\mathcal{E}$ agree on the magnitude of this quantity.
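The simplification rests entirely on $i^2=-1$, which a short sketch with Python's complex numbers makes visible. The function name `squared_length` and the sample values are illustrative assumptions, in units where $c=1$.

```python
def squared_length(dx, dt, c=1.0):
    """Pythagorean sum (dx)^2 + (i c dt)^2 with an imaginary time coordinate."""
    ict = 1j * c * dt
    return dx**2 + ict**2


dx, dt = 3.0, 5.0  # arbitrary sample separation in space and time

with_imaginary_time = squared_length(dx, dt)
with_real_time = dx**2 - (1.0 * dt) ** 2

# The imaginary part vanishes and the real parts agree: 9 - 25 = -16.
print(with_imaginary_time)
print(with_real_time)
```

The ordinary Pythagorean sum in the complex plane therefore reproduces the minus sign of the interval without us having to put it in by hand.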

Coordinates in terms of the other coordinates

Let us now express the coordinates of $P$ in $\mathcal{E}$, i.e. $(\Delta x',ic\Delta t')$, in terms of angle $\theta$ and the coordinates of $P$ in $\mathcal{M}$, i.e. $(\Delta x,ic\Delta t)$. Firstly, we deduce an expression for $\Delta x'$ using Figure 10:

\begin{align}
\Delta x' &= \Delta x\cos\theta + \text{I}, \\
\text{I} &= ic\Delta t\sin\theta, \\
\therefore \Delta x' &= \Delta x\cos\theta + ic\Delta t\sin\theta.\label{eq:delta x prime}
\end{align}

Secondly, we deduce an expression for $ic\Delta t'$:

\begin{align}
ic\Delta t' &= ic\Delta t\cos\theta - \text{II}, \\
\text{II} &= \Delta x\sin\theta, \\
\therefore ic\Delta t' &= ic\Delta t\cos\theta - \Delta x\sin\theta.\label{eq:icdelta t prime}
\end{align}

Lastly, as we want to find the relation between $\theta$ in the complex plane and $v$ in real spacetime, we forget $P$ for a moment and now write the expression for Albert himself, sitting in $O'$ of his frame $\mathcal{E}$ in terms of the coordinates of Hermann’s frame $\mathcal{M}$. In other words, how does Hermann see Albert move? Since Albert is not moving in his own frame $\mathcal{E}$, as we said earlier, after a certain amount of time $ic\Delta t$, his $\Delta x'=0$. So, by Equation \eqref{eq:delta x prime}, we write

\begin{equation}
\Delta x' = \Delta x\cos\theta + ic\Delta t\sin\theta = 0.
\end{equation}

Working this further, we get

\begin{align}
ic\Delta t\sin\theta &= -\Delta x\cos\theta, \\
\frac{\sin\theta}{\cos\theta} &= -\frac{\Delta x}{ic\Delta t}, \\
\tan\theta &= -\frac{1}{ic}\frac{\Delta x}{\Delta t}, \\
\tan\theta &= -\frac{1}{ic}v, \\
\tan\theta &= -\frac{v}{ic}.
\end{align}

To remove the imaginary unit – being a surd – from the denominator, we multiply the right-hand side by $i/i$, yielding:

\begin{align}
\tan\theta &= -\frac{i}{i}\frac{v}{ic}, \\
\tan\theta &= -i\frac{v}{-c}, \\
\therefore \tan\theta &= \frac{iv}{c}.\label{eq:tan \theta}
\end{align}
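It may help to see what kind of angle this is. The sketch below, assuming an example speed of $v=0.6c$, uses `cmath.atan` to solve $\tan\theta=iv/c$ for $\theta$. The angle turns out to be purely imaginary, equal to $i\,\mathrm{artanh}(v/c)$; that identity is a standard one which we add here for illustration and which the derivation itself does not need.

```python
import cmath
import math

beta = 0.6  # assumed example speed, v/c

# Solve tan(theta) = i v / c for theta.
theta = cmath.atan(1j * beta)

print(theta)             # real part is zero: the angle is purely imaginary
print(math.atanh(beta))  # matches the imaginary part of theta
print(cmath.tan(theta))  # recovers i * beta up to rounding
```

So the ‘rotation’ between the two frames is a rotation by an imaginary angle, which is why it cannot be drawn as an ordinary rotation on paper.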

To recapitulate, we have now obtained Equations \eqref{eq:delta x prime}, \eqref{eq:icdelta t prime}, which express the coordinates of $P$ in $\mathcal{E}$ in terms of angle $\theta$ and the coordinates of $\mathcal{M}$. Lastly, we obtained relation \eqref{eq:tan \theta} between angle $\theta$ and speed $v$ of Albert’s frame $\mathcal{E}$ as seen by Hermann in his frame $\mathcal{M}$. So, to restate, we obtained the following transformations:

\begin{aligned}
\Delta x’ &= \Delta x\cos\theta + ic\Delta t\sin\theta,&\quad\eqref{eq:delta x prime} \\
ic\Delta t’ &= ic\Delta t\cos\theta - \Delta x\sin\theta,&\quad\eqref{eq:icdelta t prime} \\
\tan\theta &= \frac{iv}{c}.&\quad\eqref{eq:tan \theta}
\end{aligned}

Figure 11: Triangle tan θ. *The geometric representation of Equation 9: an imaginary triangle with an imaginary slope tan θ, where Γ is the hypotenuse. Note that sin θ = (iv/c)/Γ and cos θ = 1/Γ.*

The Lorentz transformations

Note that, algebraically, it is possible to write Equation \eqref{eq:tan \theta} as

\begin{equation}
\tan\theta = \frac{iv/c}{1},
\end{equation}

which, geometrically, looks like Figure 11. Note that $\sin\theta=(iv/c)/\Gamma$ and $\cos\theta=1/\Gamma$, so all we have to do now is figure out what $\Gamma$ is. Using, again, the Pythagorean theorem:

\begin{align}
\Gamma^2 &= 1^2 + \left(\frac{iv}{c}\right)^2, \\
&= 1 + \frac{-v^2}{c^2}, \\
\therefore \Gamma &= \sqrt{1-\frac{v^2}{c^2}}.
\end{align}

We can now write:

\begin{align}
\sin\theta &= \frac{iv/c}{\sqrt{1-v^2/c^2}}, \\
\cos\theta &= \frac{1}{\sqrt{1-v^2/c^2}}.
\end{align}
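The closed forms for $\sin\theta$ and $\cos\theta$ read off the triangle can be compared with a direct evaluation at the complex angle $\theta$. This is a minimal sketch, assuming the illustrative value $v=0.6c$; the variable names are our own.

```python
import cmath
import math

beta = 0.6                         # illustrative v/c (an assumption for this example)
Gamma = math.sqrt(1 - beta**2)     # hypotenuse of the triangle, sqrt(1 - v^2/c^2)
theta = cmath.atan(1j * beta)      # angle with tan(theta) = i*v/c

# Closed forms read off the triangle
sin_closed = (1j * beta) / Gamma
cos_closed = 1 / Gamma

# Both comparisons should agree up to rounding error
print(cmath.isclose(cmath.sin(theta), sin_closed))
print(cmath.isclose(cmath.cos(theta), cos_closed))
```

Note that $\cos\theta$ comes out real and greater than 1, something no real angle allows.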

This is starting to look good. Substituting $\sin\theta$ and $\cos\theta$ into Equation \eqref{eq:delta x prime} yields:

\begin{align}
\Delta x’ &= \Delta x \left(\frac{1}{\sqrt{1-v^2/c^2}}\right) + ic\Delta t\left(\frac{iv/c}{\sqrt{1-v^2/c^2}}\right), \\
&= \frac{\Delta x}{\sqrt{1-v^2/c^2}} + \frac{-v\Delta t}{\sqrt{1-v^2/c^2}}, \\
\therefore \Delta x’ &= \frac{\Delta x-v\Delta t}{\sqrt{1-v^2/c^2}}.
\end{align}

Since in our configuration the differences are calculated from the origin, we can leave out the $\Delta$-sign, using just the coordinates, and so we obtain

\begin{equation}
x’ = \frac{x-vt}{\sqrt{1-v^2/c^2}},
\end{equation}

which is indeed Equation \eqref{eq:Lorentz x-prime}.

Substituting $\sin\theta$ and $\cos\theta$ into Equation \eqref{eq:icdelta t prime} yields:
\begin{align}
ic\Delta t’ &= ic\Delta t\left(\frac{1}{\sqrt{1-v^2/c^2}}\right) - \Delta x\left(\frac{iv/c}{\sqrt{1-v^2/c^2}}\right), \\
ic\Delta t’ &= \frac{ic\Delta t}{\sqrt{1-v^2/c^2}} - \frac{iv\Delta x/c}{\sqrt{1-v^2/c^2}}, \\
\Delta t’ &= \frac{\Delta t}{\sqrt{1-v^2/c^2}} - \frac{v\Delta x/c^2}{\sqrt{1-v^2/c^2}}, \\
\therefore \Delta t’ &= \frac{\Delta t-v\Delta x/c^2}{\sqrt{1-v^2/c^2}}.
\end{align}

And so, leaving out the $\Delta$-sign, using just the coordinates, we obtain
\begin{equation}
t’ = \frac{t-vx/c^2}{\sqrt{1-v^2/c^2}},
\end{equation}

which is, indeed, Equation \eqref{eq:Lorentz t-prime}.
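To see the whole argument at work numerically, the sketch below rotates the pair $(x, ict)$ by the complex angle $\theta$ and compares the outcome with the closed-form expressions for $t’$ and $x’$. The function names `rotate_event` and `lorentz`, as well as the sample event and speed, are hypothetical choices made for this illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in a vacuum, in m/s


def rotate_event(t, x, v):
    """Rotate (x, ict) by the angle theta with tan(theta) = i*v/c; returns (t', x')."""
    theta = cmath.atan(1j * v / C)
    w = 1j * C * t                                     # imaginary time coordinate ic*t
    x_prime = x * cmath.cos(theta) + w * cmath.sin(theta)
    w_prime = w * cmath.cos(theta) - x * cmath.sin(theta)
    return w_prime / (1j * C), x_prime                 # divide by ic to recover t'


def lorentz(t, x, v):
    """Closed-form Lorentz transformations for t' and x'."""
    root = math.sqrt(1 - v**2 / C**2)
    return (t - v * x / C**2) / root, (x - v * t) / root


# Illustrative event and speed (assumptions for this example)
t, x, v = 2.0, 3.0e8, 0.6 * C
print(rotate_event(t, x, v))   # complex numbers whose imaginary parts are negligible
print(lorentz(t, x, v))        # real numbers matching the real parts above
```

The rotation is carried out in complex arithmetic, yet the imaginary parts of $t’$ and $x’$ cancel, leaving the real coordinates that the Lorentz transformations predict.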

It is important to note that, while it is not unusual to leave out the $\Delta$-sign, doing so is formally incorrect: in Special Relativity there is no preferred (fixed) origin, so the transformations always relate coordinate differences.

Lastly, we reiterate that the term $1/\sqrt{1-v^2/c^2}$ is often written as $\gamma$ and is called the Lorentz factor. Also, in some texts, the term $v/c$ is replaced by the symbol $\beta$, yielding the following equivalent expressions of the Lorentz transformations:

\begin{align}
ct’ &= \gamma(ct-\beta x), \\
x’ &= \gamma(x-\beta ct), \\
y’ &= y, \\
z’ &= z.
\end{align}
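That the $\beta$, $\gamma$ form is equivalent to the expressions derived above can also be checked numerically. A minimal sketch follows; the function name `boost` and the sample values are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum, in m/s


def boost(ct, x, y, z, beta):
    """Apply the Lorentz transformations in beta/gamma form; returns (ct', x', y', z')."""
    gamma = 1 / math.sqrt(1 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct), y, z


# Illustrative event and speed (assumptions for this example)
t, x, v = 2.0, 3.0e8, 0.6 * C
ct_p, x_p, y_p, z_p = boost(C * t, x, 0.0, 0.0, v / C)

# Compare with the expressions for t' and x' written in terms of v and c
root = math.sqrt(1 - v**2 / C**2)
print(math.isclose(ct_p / C, (t - v * x / C**2) / root))
print(math.isclose(x_p, (x - v * t) / root))
```

Working with $ct$ instead of $t$ gives both coordinates the same unit of length, which is what makes the two equations look symmetric.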

Thanks to the imagination of many mathematicians and physicists before us, our ability to investigate, analyse, and calculate has become as supple and malleable as is, indeed, the fabric of the cosmos.

Featured image: arielrobin

[1] Hawking, S. (2001) The universe in a nutshell. New York: Bantam Books.
[2] Walter, S. (2014) Poincaré on clocks in motion. Amsterdam, Ne.
[3] Einstein, A. (1905) “Zur Elektrodynamik Bewegter Körper,” Annalen der Physik, 322(10), pp. 891–921. doi: 10.1002/andp.19053221004.
[4] Bailey, D. H. and Borwein, J. M. (2016) Pi: the next generation: a sourcebook on the recent history of pi and its computation. Switzerland: Springer. doi: 10.1007/978-3-319-32377-0.
[5] Cooke, R. (2005) The history of mathematics: a brief course. 2nd edn. New York, N.Y.: Wiley.
[6] Caparrini, S. (2006) “On the Common Origin of Some of the Works on the Geometrical Interpretation of Complex Numbers,” in Williams, K. (eds) Two Cultures. Birkhäuser Basel, pp. 139–151.
[7] Wick, G. C. (1954) “Properties of Bethe-Salpeter Wave Functions,” Physical Review, 96(4), pp. 1124–1134.

OpenCurve

A bit of maths and physics. For everyone.