PHYS 200: Fundamentals of Physics I

Lecture 14 - Introduction to the Four-Vector


This lecture introduces the four-vector, which unifies the space-time coordinates x, y, z and t into a single entity whose components get mixed up under Lorentz transformations. The length of this four-vector, called the space-time interval, is shown to be invariant (the same for all observers). Likewise, energy and momentum are unified into the energy-momentum four-vector.



Chapter 1: Recap—Consequences of the Lorentz Transformations [00:00:00]

Professor Ramamurti Shankar: Let me remind you that everything I did so far in class came from analyzing the Lorentz transformations. And you guys should be really on top of those two marvelous equations, because all the stuff we are doing is a consequence of them. I won’t go over how we derive them, because I’ve done it more than once. But I remind you that if you’ve got an event that occurs at (x,t) for one person, then to a person moving to the right at velocity u, the same event will have coordinates $x' = (x - ut)/\sqrt{1 - u^2/c^2}$ and $t' = (t - ux/c^2)/\sqrt{1 - u^2/c^2}$. This is it. This is the key. From this, by taking differences of two events, you can get similar equations for coordinate differences. In other words, if two events are separated in space by Δx for one person and Δx’ for another person, and likewise in time, then you get similar formulas for differences. So, differences are related the same way that coordinates themselves are. But I will write it anyway, because I will use it sometimes one way and sometimes the other way. Even this one you can think of as a formula for a difference, except that one of the events is at the origin. More generally, if you’ve got two events, not necessarily at the origin, then their spatial separation and time separation are connected in this fashion.
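For reference, the two equations and their difference versions can be written out as below. This is a reconstruction from the spoken description, assuming the usual convention that the primed observer moves to the right at velocity u; the blackboard itself is not part of the transcript.

```latex
\begin{align*}
x' &= \frac{x - ut}{\sqrt{1 - u^2/c^2}}, &
t' &= \frac{t - ux/c^2}{\sqrt{1 - u^2/c^2}}, \\
\Delta x' &= \frac{\Delta x - u\,\Delta t}{\sqrt{1 - u^2/c^2}}, &
\Delta t' &= \frac{\Delta t - u\,\Delta x/c^2}{\sqrt{1 - u^2/c^2}}.
\end{align*}
```

Setting one of the two events at the origin turns the second pair back into the first.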

So, I put this to work. I got a lot of consequences from that. You remember that? For example, I said, take a clock that you are carrying with you. Or let’s take a clock that I’m carrying with me and let’s see how it looks to you. And let’s say it goes tick and it goes tick one more time. So for me, the time difference between the two ticks will be, let’s say, one second. It’s a one-second clock. The space difference is zero because the clock is not going anywhere for me. So for you, $\Delta t' = 1\,\text{s}/\sqrt{1 - u^2/c^2}$, which is more than one second. So you will say, hey, in the time your clock ticked one second, I claim that really, say, two seconds have passed. Therefore, you will say my clocks are running slow. And it’s easy to see, from the same equations, that if you exchange the primed and unprimed ones, you simply change u to -u, and nothing changes. The clock that you think is at rest with respect to you will look slow to me.
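The clock argument can be checked numerically. What follows is a minimal sketch, not anything from the lecture itself: the function names `lorentz_factor` and `observed_tick` are made up for illustration, and the speed is given as the fraction `beta` = u/c.

```python
import math


def lorentz_factor(beta: float) -> float:
    """Return 1 / sqrt(1 - beta^2), where beta = u/c is the relative speed."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta = u/c must lie strictly between -1 and 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def observed_tick(proper_tick_s: float, beta: float) -> float:
    """Time between two ticks of a moving clock, as seen by the other observer."""
    return lorentz_factor(beta) * proper_tick_s


# At u/c = sqrt(3)/2 (about 0.866) a one-second tick stretches to about two seconds.
beta = math.sqrt(3.0) / 2.0
print(observed_tick(1.0, beta))
# Swapping the roles of the observers changes u to -u and gives the same factor.
print(observed_tick(1.0, -beta))
```

Only the square of the speed enters, which is why each observer finds the other’s clock slow.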

Then, I showed you how, when you want to measure the length of a stick which is traveling past you, you get your assistants, who are all lined up on the x axis, to measure the two ends of the rod at any particular time they like. Take the spatial difference. That’s the meaning of the length of a moving rod. You’ve got to measure both ends at the same time. So, you make sure the two measurements are done at the same time, so Δt = 0. The distance between them is what I call the length of the rod. You are moving with the rod. And as long as you’re moving with the rod, the rod has got two ends which are separated by the length L0. And one thing occurs at one end, another thing occurs at the other end. The spatial difference between them will be simply the length of the rod in your frame of reference. So, if you bring the square root to the other side, L will be $L = L_0\sqrt{1 - u^2/c^2}$, whereas Δt’ would be some Δτ in the clock’s rest frame, divided by the square root: $\Delta t' = \Delta\tau/\sqrt{1 - u^2/c^2}$. Okay. Now, the important result I showed you was the lack of simultaneity. It’s not absolute. So, I started with an unfortunate example of twins born in Los Angeles and New York, and you guys were pointing out that no normal woman could do that, except the one from “your momma” jokes. “Your momma” jokes; your momma is so big she can have a kid in New York. So, I don’t want to deal with such cosmological objects right now.

So, I will say, pick a better example. Two things happen. Something here, something there, separated by distance. People cannot agree that they were simultaneous. Now, you’ve got to say, it is very surprising that people cannot agree on the length difference between two events. For example, New York and Los Angeles; over 3000 miles apart. We think you can move in a train; you can move in a plane. You all have to agree on the separation, but you don’t. That’s just a consequence of the new relativity. In the old days, before I did relativity, I was taking just the good old x and y coordinates. And I said you can take a point, you can assign to it some coordinates x and y. Then, somebody comes along with a rotated axis, but no one seems to be troubled by the fact that the same point is now given a new set of coordinates by this new person. Or take a pair of points on this line. For that person in the rotated system, these events have the same y’ coordinate. Because you see, this is the y’ = 0 axis. They occurred at the same y’, but for me, they don’t have the same y coordinate, because this one has that y coordinate and that one has a different y coordinate. So, we are fully used to the fact that having the same x coordinate for two events is not an absolute statement; it depends on the frame of reference. Having the same y coordinates is also not absolute. But somehow, in the case of time, we used to think that time differences are absolute and spatial differences are absolute. And what we are learning is no, they are just like the x and the y. Okay.

Chapter 2: Causality Paradoxes: “Killing the Grandmother” [00:06:25]

So, I’m going to do one more thing with the Lorentz transformation. This is pretty interesting. Let’s take this equation for the time difference: $\Delta t' = (\Delta t - u\,\Delta x/c^2)/\sqrt{1 - u^2/c^2}$. So, let’s take two events. First, something happens; then, something else happens. And Δt is the separation in time between them, t2 - t1. Let’s say t2 - t1 is positive, so t2, the second event, occurred after the first event, according to me. How about according to you? Well, Δt’ doesn’t have to have the same sign as Δt, because you are subtracting this number, $u\,\Delta x/c^2$, and Δx can be whatever you like. Therefore, you can find that Δt’ could be negative for you. And you’ve got to understand, that’s a big deal. I say this happened first and that happened later. Then, you say, no, it happened the opposite way. Now, this can lead to serious logical contradictions, especially if event one is the cause of event two. This is a standard example both in special and general relativity that people talk about; event one is that somebody’s grandmother is born. And event two is that that person is born. We say the birth of the kid takes place long after the birth of the grandmother, according to me. What if, from some other point of view, the kid is born, but the grandmother is not yet born? And somebody goes and assassinates the grandmother. In fact, this is called killing the grandmother. These words are taken right out of textbooks. Why this kind of violence is inflicted on grandmothers I don’t know, but it’s a standard example. All logical contradictions involving time travel have to do with coming back and doing something.

So, here’s an example. The grandchild is born. The grandmother is not yet born, and something is done, you know, to prevent her from being born, or two seconds after she is born, she is hit by the mafia. How do you explain the grandchild? Where did the grandchild come from? Because the cause has been eliminated. Or, event one, I fire a bullet. Event two, somebody is hit. And you go to a frame of reference in which the person has been hit and I haven’t fired the bullet. So, you come and you finish me off. So now, we’ve got this person dying for no apparent reason because the cause has been eliminated. That simply cannot happen. Einstein recognized that there is a limit to how far he can push this community, and he conceded that if A can be the cause of B, you better not find an observer for whom they occur in reverse order. Because if the cause occurs after the effect, then there is some time left for somebody to prevent the cause itself from happening and we have an effect with no perceivable cause. So, we want to make sure that the sign of Δt’ cannot be reversed whenever the first event could be the cause of the second event. So, you go and ask this equation: if Δt is positive, when will Δt’ be negative? That is simple algebra; you want this term to beat that term. So, that’ll happen if $u\,\Delta x/c^2 > \Delta t$. If that happens, things are going to happen backwards, right? So, let’s modify that a little bit by moving one c to the other side, and rewrite it finally as $u/c > c\,\Delta t/\Delta x$. Let’s understand what this equation means. This equation means that if there are two events separated in time by Δt and in space by Δx, I can find an observer, at a certain speed u, so that if u divided by c exceeds this number, then for that observer, the time order of the events will be reversed. Yeah?
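The condition u/c > c Δt/Δx is easy to try out with numbers. Below is a minimal sketch, assuming Δt > 0 and Δx > 0 and SI units; the helper names `delta_t_prime` and `reversal_threshold` are illustrative and not from the lecture.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def delta_t_prime(dt: float, dx: float, u: float) -> float:
    """Time separation seen by an observer moving at velocity u (|u| < C)."""
    return (dt - u * dx / C**2) / math.sqrt(1.0 - (u / C) ** 2)


def reversal_threshold(dt: float, dx: float):
    """Value of c*dt/dx that u/c must exceed to flip the order.

    Returns None when that value is 1 or more, so no observer with u < c works.
    Assumes dt > 0 and dx > 0.
    """
    threshold = C * dt / dx
    return threshold if threshold < 1.0 else None


# Two events 1 s apart in time and 2 light-seconds apart in space.
dt, dx = 1.0, 2.0 * C
print(reversal_threshold(dt, dx))        # 0.5
print(delta_t_prime(dt, dx, 0.6 * C))    # negative: order reversed
print(delta_t_prime(dt, dx, 0.4 * C))    # positive: order preserved
# Two events 2 s apart in time and 1 light-second apart in space.
print(reversal_threshold(2.0, 1.0 * C))  # None
```

In the second case light has time to go from one event to the other, so the threshold is larger than 1 and no allowed observer sees the order reversed.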

Student: [inaudible]

Professor Ramamurti Shankar: It’s not yet clear the velocity is greater than c. It depends on what’s happening in the top and bottom. You have to ask. What is c Δt? Remember, Δt is a time between two events. c Δt is what? It’s the distance a light pulse can travel in that time. The denominator is a spatial separation between those two events. Event 1 is here, event 2 is there; c Δt is how far a light signal can travel in the time separating these two events. If c Δt looks like that, then there’s enough time for a light signal to go from here to here. In that case, c Δt would be bigger than Δx, but then the velocity that you want of your frame will be bigger than one, and that’s not possible in units of speed of light. The only time you will get a sensible value for the second frame, namely with u/c < 1 would be if c Δt < Δx; that means, here’s the first event and here’s the second event. A light signal could have only gone from there to there in the time between the events. So, it is saying that if two events are such that there was not enough time for a light signal to leave the first event and arrive in time for the second event, then we can actually find an observer with the sensible velocity for whom the order of events is reversed. Therefore, what have we learned from this? We learned from this that it should not be–if there is enough time for a light signal to go from one event to the other event, then we will not play around with the order of events. But if there is no time even for the light signal to go, the system in fact allows you to see them in reverse order.

But we have seen that for events which are causally connected, the order cannot be reversed, or should not be reversed. So, what this is saying is, it should be impossible for an event here to affect a second event by using a signal faster than light. In other words, we are going to say that if there wasn’t enough time for a light signal to go from this event to that event, then this event could not have been the cause of that event. We will demand that. Therefore, what the theory needs in order to make logical sense is that no signal should travel faster than the speed of light. Because if you had a signal that could travel faster than the speed of light, then the first event could have been the cause of the second event, because you send the signal and the signal may blow something up, even though there is not enough time for light to get there in the same time. But if you allowed such things, then you would find a frame of reference, with a perfectly sensible velocity, in which they occur backwards, and cause and effect will have been reversed.

So, the answer to the whole thing, in case you’re still struggling with this issue, is the following. What the theory of relativity demands is that it should be impossible for events to influence other events with signals traveling faster than light. Once you accept that, there is no logical contradiction in the theory, because the only time the order of events is reversed is when there is no time for a light signal to go from here to there. That means there was no way this could have been the cause of that. For example, when I fire a bullet, the cause is my firing of the bullet, the effect is whatever happens to the recipient. And the whole thing takes place at a speed equal to the speed of the bullet. In that case, c Δt is definitely bigger than Δx, because in the time it took the bullet to go from here to there, the light pulse could have also gone there and beyond. In that case, you’ll never find an observer at a speed less than that of light for whom the events will appear backwards. So, in the special theory of relativity, the way to ensure that everything is consistent is to demand that no signal travels faster than the speed of light. No energy, no signal can travel faster than that.

Chapter 3: A New Understanding of Space-Time [00:15:22]

But this leads to something very interesting. Our view of space-time is now modified, you see? In the old days, here is space and here is time. And let’s say (0,0) is where I am right now. I am at the origin, my clock says zero; I’m here right now. Any dot I pick here is in my future, because t is positive for the events. They have not happened yet. Any dot I draw here is an event in my past, because it occurred at an earlier time. So, in the Newtonian view of the world, there is something called right now, this horizontal line. And everything above it is called later or future. Everything below it is called past. But in special relativity, after you’ve taken relativity into account, let’s draw here an axis ct. So, whenever you have another event so that ct > x, there is enough time for a light signal to go from here to here. That’s the meaning of saying ct > x. This is the line ct = x. This is the region ct > x. This event is in the future of this event according to me, and also according to anybody else. In other words, Δt is positive for me, but if you go back to these equations, since c Δt > Δx, you will never find anybody who says this event occurred earlier than this event.

Another way to understand it is if I want to cause that event, suppose that is some explosion going off there. I send a signal that goes and makes that happen. That signal only has to travel slower than light to get there. Therefore, it could have caused the event. And therefore, there is no screwing around with the order of that event. That is in my future according to me and according to all observers. If you take an event here, so this whole region north of these 45 degree lines is called the “absolute future.” By “absolute,” I mean future of me not only according to me, but according to all observers. For every observer, this will occur later than this. It won’t be later by the same number of seconds, but it will be later. So, the Δt between this event and this event is positive for me, and will be positive for all people. They will all agree this occurred after this; therefore, the effect appeared before the cause. Sorry, the cause will appear before the effect.

And it is not really necessary that this event be caused by this event. It’s only necessary that it could have been caused by this event because not everything in the universe is caused by some other event. It can just simply happen. We have heard that; stuff happens. So, stuff happens here with no obvious cause; as long as this is in the future according to me, then by this definition, it’s in the future according to all people. So, this is called the future because you can affect it. For example, if I heard something terrible is going to happen at this point, I can send my guys to go over there and do something. They have to travel at a speed that is less than the speed of light so they can affect it. These events are called the “absolute past.” What that means is, any event here could have been the cause of what’s happening to me right now because from that event, a signal can be sent to arrive where I am right now at a speed less than the speed of light. This line, which is called the light cone, is a borderline case where you can communicate from here to here using a light signal. So, that’s also considered part of the future.

But what about an event here? Suppose this distance is two seconds times the velocity of light. That has not happened yet. And suppose I go there and open my envelope and say something terrible is going to happen at this point. There is nothing I can do. There’s nothing I can do. In the Newtonian days, there’s something you can do. Tell someone else to really hurry up and get there and do something. But now you cannot do that because for that someone to leave you with the instruction to do something about this requires that person to travel faster than light, and that’s not allowed. So, even though you know things can happen here in the future, and you have knowledge of the fact that someone is planning to do something there, you cannot get there. Okay? That’s a very important thing even for people who are going to law school. And all of you guys are not going into physics. But suppose you’re going to law school. You’ve got the DNA defense, right? My client’s DNA doesn’t match. Here’s another defense. If your client was accused of doing something here, and was last seen here, you can argue that my client was outside the light cone. That’s called outside the light cone defense, and it’s absolutely water-tight. It’s better than anything because if that event is outside the light cone of your client, the client cannot be held responsible, because your client will have to send a signal faster than light, and that law is unbroken.

So, what about the status of an event like this? The status of this event is, I can actually find other observers moving at a speed less than light for whom this event occurs before this event. So, the order of events can be reversed. But it will not lead to logical contradictions because we know the two events are not causally connected. So, space-time, which we used to just divide into an upper half plane and a lower half plane, the future and the past, with a tiny sliver called the present, is now divided into three regions: the absolute future that you can affect, the absolute past whose events can affect you, and this region, which doesn’t have a name. If you want, you can call it the relative future according to you, but I’ll find somebody for whom that occurs before this one. Okay? That’s also a consequence of the Lorentz transformation. So, you can see that the equations are very deceptively simple. In fact, they’re a lot easier looking than some equations with angular momentum, right? But the consequences are really stupendous. They all follow from that equation. And of course, if you invent a theory like this, you’ve got to make sure that there are no contradictions, right?
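The three regions can be spelled out as a small classifier. This is a minimal sketch in one space dimension, with the observer at the origin and events given as (ct, x) in the same length units. The function name and the label for the unnamed region are made up for illustration, and events exactly on the light cone are counted with the future or past, as the lecture does for the future.

```python
def classify_event(ct: float, x: float) -> str:
    """Classify the event (ct, x) relative to an observer at the origin."""
    if abs(ct) >= abs(x):
        # Inside or on the light cone: a signal at or below light speed connects them.
        if ct > 0:
            return "absolute future"
        if ct < 0:
            return "absolute past"
        return "here and now"  # only reached when ct == 0 and x == 0
    # Not enough time for light to cover the distance: order depends on the observer.
    return "outside the light cone"


print(classify_event(2.0, 1.0))   # absolute future
print(classify_event(-2.0, 1.0))  # absolute past
print(classify_event(1.0, 2.0))   # outside the light cone
print(classify_event(1.0, 1.0))   # absolute future (borderline, on the light cone)
```

Only events in the last category can have their time order reversed by changing observers.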

Suppose you find a theory, you’re very happy with everything; the velocity of light is coming out right. But somebody points out to you that the order of events can be reversed. And you’ve got to agree you’ll be in a panic because, how can I reverse the order of events? And the theory is so beautiful; internally, it says you can reverse the order of events only if they could not have been causally connected, where “causally connected” means a signal traveling at a speed no greater than that of light could have gone from the first event to the second event. If a signal could have gone, the theory never allows their order to be reversed. If a signal could not have gone, it does say you can find people for whom the events occurred in the reverse order. Alright. So, this is the second thing, and it’s called the light cone for the following reason. I’ve only shown you an x coordinate. If we had a y coordinate coming out of the blackboard, the surface would look like a cone. That’s why we call it the light cone. Sitting at a point in four-dimensional space-time, there’s a cone. And all points in that forward light cone are the events you can affect. All points in the backward light cone are events that can come and get you right now. And the rest, the events outside the two cones, you cannot do anything about, even though you know something is going to happen. And they cannot do anything to you. You open an envelope, and it says somebody here is planning to do something to you. You don’t have to worry, because that person cannot get to you in the time available. Because the fastest signal is the light signal, and that won’t get there. And neither will anything else. Alright.

Now, we’re going to do something which is theoretically or mathematically very pretty. So, for those of you who like mathematical elegance, this is certainly the best example I can offer you, because it is simple and also rather profound. But it’s completely by analogy. The analogy is the following. We have seen that when we rotate our axes, as I’ve shown there, x’ is not the same as x; it is related to it by this formula. And y’ is also not the same as y; it is related by this formula. Nothing is sacred. Coordinates of a point are not sacred. They just depend on who is looking at them from what orientation. But even in this world, we noticed that $x'^2 + y'^2$ is the same as $x^2 + y^2$. Namely, the length of the line connecting the origin to the point, or the distance from the origin, is unaffected by rotations. This is called an invariant. This doesn’t depend on who is looking at it. Everyone will agree on this. But they won’t agree on x and they won’t agree on y, but they will agree on this. So, it’s reasonable to ask in the relativistic case, where people cannot agree on the time coordinate or the space coordinate, whether maybe the square of the time plus the square of the space will be the same for two people. One can ask that question. So, when we look at that, we’ll find out that that’s not the case; $t^2 + x^2$ is not invariant. But even before you do that, you should shudder at the prospect of writing something like this. Right? What’s wrong with this?
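The rotation invariant can be checked with a few lines of code. This is a minimal sketch that assumes the standard formulas for axes rotated counterclockwise by an angle theta; the formulas on the blackboard are not in the transcript, so the sign convention here is an assumption, and `rotate_axes` is an illustrative name.

```python
import math


def rotate_axes(x: float, y: float, theta: float) -> tuple[float, float]:
    """Coordinates of the same point in axes rotated counterclockwise by theta."""
    x_new = x * math.cos(theta) + y * math.sin(theta)
    y_new = -x * math.sin(theta) + y * math.cos(theta)
    return x_new, y_new


x, y = 3.0, 4.0
x_new, y_new = rotate_axes(x, y, math.radians(30.0))
print(x_new, y_new)                      # neither coordinate is what it was
print(x**2 + y**2, x_new**2 + y_new**2)  # the two sums agree up to rounding
```

The individual coordinates change, but the sum of their squares does not, because cos² + sin² = 1.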

Student: [inaudible]

Chapter 4: Introducing the Fourth Dimension and Four-Vector Algebra [00:25:51]

Professor Ramamurti Shankar: Good. Everyone’s on top of the units here. So, you cannot add $t^2 + x^2$. There is no way this can be anything. So, we know how we have to fix that. We’ve got to have both coordinates in space-time measured either with lengths or with times. The standard trick is to do the following. Let’s introduce an object with two components. I’m going to call it X. The first component of that is going to be called $X_0$. The second is going to be called $X_1$. $X_1$ is just our familiar x. $X_0$ is going to be essentially time, but multiplied by c. Why do I switch from t and x to zero and one? It is just that if you want to go to more coordinates, if you want to bring back y and z, then I just mention for future use that in four dimensions you really have $X_0$, $X_1$, $X_2$ and $X_3$. That is a shorthand for $X_0$ and what you and I used to call the position r of a particle. It’s just that in one dimension, the vector r becomes a number x, and I want to call the number $X_1$ because it’s the first of the three coordinates. And if you’re doing superstrings, you can have ten coordinates, and one will be $X_0$ and the other nine will be spatial coordinates. So, we like the numerical index rather than letters of the alphabet because we run out of letters. But you don’t run out of these subscripts. So, I’m writing it purposely in many ways, because if you ever go read something or you’re involved in a lab project or the group is studying something and you want to read the paper, people refer to coordinates in space-time in many ways. Some people like to write X the following way: $X_1$, $X_2$, $X_3$, $X_4$, where $X_4$ is ct. That’s why you call that the fourth dimension. It’s still the fourth dimension, but you can either put it at the end of that family of three or at the beginning. So, it’s very common for people to use either notation. I’m just going to call it $X_0$. $X_0$ is just time, okay, multiplied by c so that it has units of length, and multiplying by c doesn’t do anything. Everybody agrees on what c is. So, we’re all multiplying our time coordinates by c. Alright.

So, let’s ask the following question. What does the Lorentz transformation look like when I write it in terms of these numbers? So, first take x’, which is x - ut divided by the famous square root. But t is not what I want; ct is what I want. So, I fudge it as follows; I write it as $x' = (x - (u/c)\,ct)/\sqrt{1 - u^2/c^2}$. This I’m going to rewrite: x is now called $X_1$, ct is called $X_0$, and I’m going to introduce a new symbol β here, so the square root downstairs becomes $\sqrt{1 - \beta^2}$. β is the universal convention for u/c. If all the velocities are measured as a fraction of the velocity of light, then β is a number between zero and one. So, the final way I want to write the Lorentz transformation for the coordinate is $X_1' = (X_1 - \beta X_0)/\sqrt{1 - \beta^2}$. So, you have to get used to this β. I mean, you’re following nicely what I was saying in terms of velocity u, but you have to get used to this β. We won’t use it a whole lot, but I put it here so you can recognize it when people refer to it in the literature, when you go read something.

What’s the transformation law for time? Remember $t' = (t - ux/c^2)/\sqrt{1 - \beta^2}$. $\beta^2$ is clearly what’s sitting downstairs. u/c is β. So, what do you think we should do? We should multiply both sides by c. Because if you multiply both sides by c, put a c there, put a c there and get rid of this one, then you find $X_0' = (X_0 - \beta X_1)/\sqrt{1 - \beta^2}$. So, let me write that here. Now, you see the relationship between the coordinates is nice and symmetric. If you write in terms of x and t, the coordinate transformation law is not quite symmetric. And that’s because one has units of length and one has units of time. But if you rescale your time to turn it into a length, then the transformation laws are very simple and symmetric between the two coordinates of space-time.
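The symmetry between the two coordinates shows up directly in code. Here is a minimal sketch of the transformation in terms of X0 = ct, X1 = x and β = u/c; the function name `boost` is illustrative.

```python
import math


def boost(x0: float, x1: float, beta: float) -> tuple[float, float]:
    """Lorentz-transform (X0, X1) = (ct, x) to a frame moving at beta = u/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    x0_new = g * (x0 - beta * x1)
    x1_new = g * (x1 - beta * x0)
    return x0_new, x1_new


# Swapping the roles of X0 and X1 on input swaps them on output.
print(boost(5.0, 3.0, 0.6))
print(boost(3.0, 5.0, 0.6))

# Transforming back with -beta recovers the original coordinates up to rounding.
x0_new, x1_new = boost(5.0, 3.0, 0.6)
print(boost(x0_new, x1_new, -0.6))
```

With x and t in their own units, the two formulas would carry different powers of c and this swap symmetry would be hidden.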

There are also the other coordinates; in the y direction, $X_2' = X_2$, and in the z direction, $X_3' = X_3$. In other words, lengths perpendicular to the motion are something you can always agree on. One can talk and talk about why that is true. Basically, what we are saying is, if you are zooming along, and you drew a line that says y = 1 and originally we agreed on it, we will have to agree on it even when I start moving relative to you, because you’ve got your own line. And there’s no reason your line should be above my line or below my line. If the lines originally agreed, then by symmetry there’s no reason why one should be higher than the other. And the reason this is required is that the two lines, which last forever, can be compared anywhere you like. Unlike the rods, which are running away from each other and cannot be compared, the line here which says y = 3 has got to be y = 3 for both of us. So, the transverse coordinates are not modified. So, that’s why I don’t talk about them too much. All the interesting action is between the one coordinate along which the motion is taking place and time. So now, let’s ask ourselves whether maybe $X_0'^2 + X_1'^2$ is the same as $X_0^2 + X_1^2$. Well, let me just tell you that I’m not going to try something that I know will not work. In other words, you might think $X_0'^2 + X_1'^2 = X_0^2 + X_1^2$. Well, that just doesn’t work. That doesn’t work because $1/\sqrt{1 - \beta^2}$ is not a cos θ, and $\beta/\sqrt{1 - \beta^2}$ is not a sin θ. You cannot make them cosine and sine of something. If you could, that’ll work. But it doesn’t work. But I’ll tell you what does work.
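One way to see why the two coefficients cannot be a cosine and a sine: the first, 1/√(1 − β²), is never smaller than 1, and the squares of the two add up to more than 1 whenever β is not zero. Their difference of squares, on the other hand, is exactly 1. Below is a minimal numerical sketch; the function name `boost_coefficients` is illustrative.

```python
import math


def boost_coefficients(beta: float) -> tuple[float, float]:
    """Return (1/sqrt(1 - beta^2), beta/sqrt(1 - beta^2)) for beta = u/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g, beta * g


for beta in (0.1, 0.5, 0.9):
    a, b = boost_coefficients(beta)
    # For a rotation the first number printed would be 1; here the second one is.
    print(beta, a * a + b * b, a * a - b * b)
```

The sum of squares grows with β, while the difference stays at 1 up to rounding, which hints at which combination of coordinates has a chance of being invariant.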

Let’s take $X_0'^2 - X_1'^2$, the square of the time coordinate minus the square of the space coordinate, with a minus sign. So, let’s try to do this in our head by squaring this one and subtracting from it the square of that one. Downstairs, I think we all agree you get $1 - \beta^2$ because the square root is squared. How about upstairs? First, you’ve got to square this guy; this guy will give you $X_0^2 + \beta^2 X_1^2 - 2\beta X_0 X_1$. From that, I should subtract the square of $X_1'$. That’ll give me $-X_1^2 - \beta^2 X_0^2 + 2\beta X_0 X_1$. Just simple algebra. And these cross terms cancel out. And you notice I got $X_0^2(1 - \beta^2) - X_1^2(1 - \beta^2)$, right? This algebra, I trust you guys can do at home. The factor $1 - \beta^2$ cancels against the one downstairs, so the result of this is going to be $X_0^2 - X_1^2$. That’s very nice. It says that even though people cannot agree on the time or space coordinate of an event, just like saying people cannot agree on what is x and what is y, they can agree on this quadratic function you form out of the coordinates. But it’s very different from ordinary rotations, where you take the sum of the squares; here you’ve got to take the difference of the squares. And that’s just the way it is. Even though time is like another coordinate that mixes with space, it’s not quite the same. Moreover, if I brought back all the transverse coordinates, then you will find $X_0'^2 - X_1'^2 - X_2'^2 - X_3'^2 = X_0^2 - X_1^2 - X_2^2 - X_3^2$. This you can do because $X_2'$ individually is equal to $X_2$, and $X_3'$ to $X_3$. So, you can put them on both sides. This is the actual four-dimensional result for people who really want to leave the x axis and move freely in space, which you’re allowed to. This is the invariant. This is the object that everyone agrees on. Allow me once again to drop this. Most of the time, I’m not going to worry about transverse coordinates.
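For those who want the algebra written out rather than done in the head, here it is step by step, using the transformation laws for $X_0'$ and $X_1'$ in terms of β = u/c. It is a written-out version of the spoken steps, not a copy of the blackboard.

```latex
\begin{align*}
X_0'^2 - X_1'^2
  &= \frac{(X_0 - \beta X_1)^2 - (X_1 - \beta X_0)^2}{1 - \beta^2} \\
  &= \frac{X_0^2 + \beta^2 X_1^2 - 2\beta X_0 X_1
           - X_1^2 - \beta^2 X_0^2 + 2\beta X_0 X_1}{1 - \beta^2} \\
  &= \frac{X_0^2 (1 - \beta^2) - X_1^2 (1 - \beta^2)}{1 - \beta^2} \\
  &= X_0^2 - X_1^2 .
\end{align*}
```

The cross terms cancel only because of the relative minus sign; with a plus sign they would add up instead.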

So, this is the analogue of the length squared of a vector, and it’s called the space-time interval. It’s not the space interval; it’s not the time interval. It’s a space-time interval between the origin and the point (x,t). And this is denoted by the symbol, there’s no universal notation, but let’s use $s^2$ for space-time separation. Notice that even though it’s called $s^2$, $s^2$ need not always be positive definite. If you take two events whose time coordinate is the same, or an event whose time coordinate is zero while its space coordinate is not zero, then $s^2$ will be negative. Whereas the usual Pythagoras length squared, $x^2 + y^2$, is positive definite, always positive, the space-time interval can be positive or negative or zero. It’s positive if $X_0^2$ can beat the sum of the squares of those guys. It’s negative if they can beat $X_0^2$, and it’s equal to zero if $X_0^2$ is equal to that sum of squares.

So, if you want, if you go back to this diagram I drew here, this is $X_1$. This is $X_0 = ct$. These are events for which $s^2$ is positive. These are also events for which $s^2$ is positive. These are events for which $s^2$ is negative. These are events for which $s^2$ is zero. So, whenever $s^2$ is positive, we have a name for that. It’s called a time-like separation. Well, it’s called a time-like separation because the time part of the separation squared is able to beat the space part of the separation. It’s got more time component than space component. Now, you can also do the following. Let me introduce some notation now. Let us agree that we will use X for what’s called a space-time four-vector. A four-vector has got four components. And the components are $X_0$, and since I’m tired of writing $X_1$, $X_2$, $X_3$, I will call those r. So, in the special theory of relativity, what you do is, you take the three components of space and add one more of time and form a vector with four components. And we will agree that the symbol for a space-time vector is an X. Now, I’ve been careless in my writing. Sometimes I use x and sometimes X, but you guys should be a little more careful. X like this, with no subscript, will stand for these four numbers. It is a four-vector; it’s a vector living in space-time. And when you go to another frame of reference, the components of the four-vector will mix with each other. Then, we’re going to define a dot product. It’s going to be a funny dot product. It’s going to be called X∙X. The usual dot product was the sum of the squares of the components. But this funny dot product will be equal to $X_0^2$ minus the length of the spatial part, squared.
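The sign of the interval can be computed directly from the components. This is a minimal sketch; the function names are illustrative. “Time-like” is the name used in the lecture, while “space-like” and “light-like” are the standard names for the other two cases and do not appear in this part of the transcript.

```python
def interval_squared(x0: float, x1: float, x2: float = 0.0, x3: float = 0.0) -> float:
    """Space-time interval s^2 = X0^2 - X1^2 - X2^2 - X3^2 from the origin."""
    return x0 * x0 - x1 * x1 - x2 * x2 - x3 * x3


def classify_interval(s2: float) -> str:
    """Name the separation according to the sign of s^2."""
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "light-like"


print(interval_squared(5.0, 3.0), classify_interval(interval_squared(5.0, 3.0)))
print(interval_squared(0.0, 2.0), classify_interval(interval_squared(0.0, 2.0)))
print(interval_squared(1.0, 1.0), classify_interval(interval_squared(1.0, 1.0)))
```

The three cases correspond to events inside the light cone, outside it, and exactly on it.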

Student: [inaudible]

Professor Ramamurti Shankar: Pardon me?

Student: Is that a small x?

Professor Ramamurti Shankar: This is an $X$. These are all small. $X$ is the name for a vector with four components. It’s a position vector in space-time. This is the same as writing $X_0^2 - X_1^2 - X_2^2 - X_3^2$. Suppose there are two events. One occurs at the space-time coordinate $X$. The second event occurs at a new point, let me call it $\bar{X}$, whose time coordinate is $\bar{X}_0$ and whose position coordinate, this is unfortunate, is the vector $\mathbf{r}$ with a bar on it. These are like two vectors in the $x,y$ plane. You and I, if we are moving relative to each other, will not agree on the components of this or the components of that. But we will agree on $X \cdot \bar{X}$. In other words, $X \cdot \bar{X}$ will be the same as $X' \cdot \bar{X}'$. What I’m saying is, if we go to a moving frame of reference, the vector $X$ goes into $X'$, whose components are $X_0'$ and some $\mathbf{r}'$. And likewise, the vector $\bar{X}$ has new components, $\bar{X}'$, which are $\bar{X}_0'$ and $\bar{\mathbf{r}}'$.

Look, it’s the same story as in two dimensions. Components of vectors are arbitrary. They vary with the frame of reference. But the dot product of vectors is the same no matter how you rotate your axes, because the dot product is the length of A times the length of B times the cosine of the angle between them. None of the three things changes when you rotate your axes. Well, this is the analogue of the dot product in space-time, because this is the guy that’s the same for everybody. You can say, “Why don’t you study this combination? It’s so nice. All pluses. I like that.” I don’t study that because that combination is not special. If I have one value for the combination, you will have a different value for the combination. There’s nothing privileged. It’s the one with the minus signs in it that has the privileged role of playing the dot product. So, space-time is not Euclidean. Euclidean space is the space in which we live, where distance squared is the sum of the squares of all the coordinates. Space-time is a pseudo-Euclidean space in which, to find the invariant length, you’ve got to square some components and subtract from them the squares of some other components. Okay.

So, finally, if you take a difference vector, in other words, take the difference of two events, then the combination according to me, namely $(c\,\Delta t)^2 - (\Delta x)^2$, will be equal to $(c\,\Delta t')^2 - (\Delta x')^2$. Or if you like, $X_0^2 - X_1^2$ will be $X_0'^2 - X_1'^2$. I’m using the notations back and forth so you get used to writing the space-time coordinates in two different ways. So, it not only works for coordinates, but for coordinate differences. You understand? Two events occur, one here now and one there later; let’s say they are separated in space by two meters and in time by 11 seconds. I’m sorry. I didn’t mean to write this. I meant to write $(\Delta X_0)^2 - (\Delta X_1)^2 = (\Delta X_0')^2 - (\Delta X_1')^2$. So, do you guys follow the meaning of this statement? Most things are relative. Distance between events: relative. Time interval between events: relative. But this combination is somehow invariant. Everybody agrees on what it is.
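The invariance claimed above can be checked numerically. What follows is a minimal sketch, not part of the lecture: it works in one space dimension, measures velocity as `beta = u/c`, and uses the Lorentz transformation from the recap. The function names (`boost`, `minkowski_dot`, `euclidean_dot`) and the sample events are illustrative assumptions.

```python
import math

def boost(x0, x1, beta):
    """Lorentz-transform an event (x0 = c*t, x1 = x) to a frame moving at beta = u/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (x0 - beta * x1), gamma * (x1 - beta * x0)

def minkowski_dot(a, b):
    """The 'funny' dot product: product of time parts minus product of space parts."""
    return a[0] * b[0] - a[1] * b[1]

def euclidean_dot(a, b):
    """The all-pluses combination, which is NOT expected to be invariant."""
    return a[0] * b[0] + a[1] * b[1]

# Two arbitrary sample events (hypothetical values, in units where x0 and x1 are lengths).
X = (5.0, 3.0)
Xbar = (2.0, 7.0)
beta = 0.6

Xp = boost(*X, beta)
Xbarp = boost(*Xbar, beta)

print("X.X       :", minkowski_dot(X, X), "->", minkowski_dot(Xp, Xp))
print("X.Xbar    :", minkowski_dot(X, Xbar), "->", minkowski_dot(Xp, Xbarp))
print("all pluses:", euclidean_dot(X, X), "->", euclidean_dot(Xp, Xp))
```

The first two printed pairs should agree up to rounding, while the all-pluses combination changes from frame to frame, which is why it has no privileged role.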

Chapter 5: The Space-Time Interval, or “Proper Time” [00:44:09]

Now, we are going to understand the space-time interval. $\Delta s^2$ is the name for a small space-time interval formed from coordinate differences. Now, we are going to apply this to the following problem. I’m going to give you a feeling for what the space-time interval means when applied to the study of a single particle. So previously, $\Delta x$ and $\Delta t$ were separations between two random, unrelated events, arbitrary events. But now I want you to consider the following situation. Here is a particle in space-time. And it moves there. This is where it is in the beginning; that’s where it is at the end. So now, I want $\Delta x$ to be the distance the particle travels. And I want $\Delta t$ to be the time in which it did this. So, these are two events in the life of a particle, two events lying on the trajectory of a particle. In the $x,t$ plane, you draw a line, and it’s a trajectory, because at every time there’s an $x$, and that describes the particle’s motion. And this is the $\Delta x$ and this is the $\Delta t$.

Let’s look at the space-time interval between the two events. $\Delta s^2$ will be $(c\,\Delta t)^2 - (\Delta x)^2$. But we’re going to rewrite this as follows: it is equal to $c^2\,\Delta t^2\,[1 - (v^2/c^2)]$, where $v$ is $\Delta x/\Delta t$. Because these are two events in the life of a particle, the distance over time is actually the velocity of the particle. I have always used $v$ for the velocity of a particle that I’m looking at, and $u$ for the velocity difference between my frame of reference and your frame of reference. So, $u$ is always the speed between frames. And $v$ is the speed of some particle I’m looking at. So, let’s take the square root of both sides, and we find $\Delta s = c\,\Delta t\,\sqrt{1 - (v^2/c^2)}$. So, the space-time interval between two events in the life of a particle is $c$ times the time difference times this factor. Now, this is supposed to be an invariant. By that I mean no matter who calculates this space-time interval, that person is going to get the same answer. So, let’s calculate the space-time interval as computed by the particle itself. What does the particle think happens? Well, the particle says, in other words, you’re riding with the particle. So, in your frame of reference, the particle’s frame, the time difference between the two events, let me call this one $\Delta\tau$. And the space difference? What’s the space difference between the two events? Yeah?

Student: [inaudible]

Professor Ramamurti Shankar: Zero. Because as far as the particle is concerned, I’m still here. If I’m moving, then from my vantage point my $x$ coordinate is wherever I chose to put it in the beginning, and it follows me. So the two events, particle sighted here and particle sighted there, have different $x$ coordinates for the person to whom the particle is moving. But if you’re co-moving with the particle, then your $x$ coordinate as seen by you does not change. Therefore, $\Delta x$ is zero. Therefore, the space-time interval would be simply $c\,\Delta\tau$, where $\tau$ is the time measured by a clock going with the particle. So, $\Delta\tau$ is the time in the particle’s frame. In other words, if the particle had its own clock, that’s the time it will say has elapsed between the two points in its trajectory. So, it’s not so hard to understand. See, I am the particle. I’m moving, looking at my watch, and I’m saying that time will pass for me, right? One second and two seconds; that’s the time according to me. If you guys see me, you will disagree with me on how far I moved and how much time it took. That’s your $\Delta x$ and your $\Delta t$. But for me, there’s only $\Delta\tau$. There’s no $\Delta x$ for me. So, the space-time interval, when you describe particle behavior, is essentially the time elapsed according to the particle. So, it’s not hard to understand why everybody agrees on that. See, you and I don’t have to agree on how much time elapsed between when the particle was here and when it was there. But if we ask how much time elapsed according to the particle, we are all asking the same question. That’s why we all get the same answer.

The space-time interval is called the proper time. So, proper time is another name for the time as measured by a clock carried with the particle. So, let’s write it this way. The space-time interval is really $c\,\Delta\tau$. And $\Delta\tau$ is also an invariant. Namely, everyone agrees on $\Delta\tau$. And what’s the relation between $\Delta t$ and $\Delta\tau$? $\Delta t$ is the time according to any old person. And $\Delta\tau$ is the time according to the particle’s clock. And you saw that the space-time interval $c\,\Delta\tau$ was equal to $c\,\Delta t\,\sqrt{1 - (v^2/c^2)}$. So, we are going to, I’m sorry, the $c$ cancels on both sides: $\Delta\tau = \Delta t\,\sqrt{1 - (v^2/c^2)}$. This is the relation. So, I will be using the fact that $d\tau/dt = \sqrt{1 - (v^2/c^2)}$, or $dt/d\tau = 1/\sqrt{1 - (v^2/c^2)}$. In other words, the time elapsed between two events in the life of a particle, as seen by an observer for whom it has a velocity $v$, compared to as seen by a clock moving with the particle, that ratio is given by this. As the particle speeds up, let’s see if it makes sense. As $v$ approaches $c$, the number $\sqrt{1 - (v^2/c^2)}$ approaches zero. That means you can say the particle has been traveling for 30 hours, but with this number almost vanishing, the particle would say, I’ve been traveling for a very short time. That’s the way it is. The particle will always think it took less time to go from here to there, compared to any other observer, because the clock runs fastest in its own rest frame. Okay.
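As a rough illustration of the relation between the observer’s time and the particle’s own time, here is a small sketch, not part of the lecture. It assumes units in which $c = 1$, so speeds are given as `beta = v/c`; the function names and the 30-hour trip are illustrative choices.

```python
import math

def proper_time(dt, beta):
    """Time elapsed on the particle's own clock, given the observer's dt and beta = v/c."""
    return dt * math.sqrt(1.0 - beta**2)

def interval_squared(c_dt, dx):
    """Space-time interval squared, (c*dt)^2 - dx^2, for one space dimension."""
    return c_dt**2 - dx**2

# An observer says the particle traveled for 30 hours at various speeds.
dt = 30.0
for beta in (0.1, 0.6, 0.9, 0.999):
    dtau = proper_time(dt, beta)
    # With c = 1, the distance covered in the observer's frame is dx = beta * dt.
    s2 = interval_squared(dt, beta * dt)
    print(f"beta = {beta}: particle's clock reads {dtau:.3f} h, "
          f"sqrt(interval^2) = {math.sqrt(s2):.3f}")
```

For every speed, the square root of the interval computed from the observer’s $\Delta t$ and $\Delta x$ matches the time on the particle’s clock, and that time shrinks toward zero as `beta` approaches 1.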

Chapter 6: Deriving the Velocity and Momentum Vectors in Space-Time [00:51:47]

Now, this is the new variable I’m going to use to develop the next step. The next step is the following: in Newtonian mechanics, particles have a position $x$ and maybe $y$ and $z$. But let me just say $x$. Let me take one more. It had $x$ and it had $y$. And these were varying with time. Then, I formed a vector $\mathbf{r}$, which is $\mathbf{i}x + \mathbf{j}y$. And the vector is mathematically defined as an entity with two components so that when you rotate the axes, the components go into $x'$ and $y'$, which are related to $x$ and $y$ by $\cos\theta$ and $\sin\theta$. That’s defined to be a vector. Now, suppose I went to you and said, okay, that’s one vector, the position vector. Can you point me to another vector? Anybody know other vectors in Newtonian mechanics? Yes sir? You don’t know any other vectors from the good old mechanics days? The only vector you’ve seen is position?

Student: Velocity is one.

Professor Ramamurti Shankar: Very good. Yeah. That’s right. You’re right. You certainly know the answers. So, you shouldn’t hesitate. So, how do you get to velocity from position?

Student: Take the derivative.

Professor Ramamurti Shankar: You take the derivative. So, you’ve got to ask yourself: why does the act of taking the derivative of a vector produce another vector? Well, what’s the derivative? You change the guy by some $\Delta$, and you divide it by the time. Now, the change in the vector is obviously a vector, because the difference of two vectors is a vector. Dividing by time is like multiplying by the reciprocal of the time. That’s like multiplying by a number. And I’ve told you multiplying a vector by a number also gives you a vector, maybe longer or shorter. Therefore, $\Delta\mathbf{r}$ is a vector because it’s the difference of $\mathbf{r}$ later minus $\mathbf{r}$ now; dividing by $\Delta t$ is the same as multiplying by 10,000 or 100,000 or 1 million. It doesn’t matter. That’s also a vector. And the limit is also a vector. Therefore, when you take a derivative of a vector, you get a vector. And once you’ve got this derivative, it becomes addictive. You can take second derivatives. As you said, you can have acceleration. Then you can take the acceleration and multiply it by a mass.

Now, that is called a scalar. The mass of a particle is a number that everyone will agree on. It has no direction. So, the product of a number and a vector is another vector, and that vector, of course, is the force. So derivatives of vectors and multiples of vectors by scalars, namely things that don’t change when you rotate your axes, are ways to generate vectors. So what I want to do is, I want to generate more vectors. The only four-vector I have is the position four-vector, which is this guy, $X$, whose components I write for you again are $X_0$, which is code name for $ct$, and $X_1$, which is code name for $x$. And you can put the other components if you like. Now, take this $X$ to be the coordinate in space-time of an object that’s moving. I want to take the derivative of that to get myself something I could call the velocity vector in relativity. But the derivative cannot be the time derivative. You guys have to understand that. Of course in space-time, the particle is moving. I can certainly tell you where it is at one time and where it is a little later. And I can take the time derivative. But the time derivative of a vector in four dimensions is not a vector, because time is like any other component, you know? It’s like taking the $y$ derivative of $x$. That doesn’t give you a vector. You’ve got to take a derivative with respect to something that does not transform, that does not change from one observer to the other. So, do you have any idea where I’m going with this? Yes?

Student: Space-time.

Professor Ramamurti Shankar: Yes. Or you can take the derivative with respect to the time as measured by the clock which is moving with the particle. So, you can take the $\tau$ derivative. In other words, let the particle move by some amount $\Delta X$, namely $\Delta X_0, \Delta X_1, \Delta X_2$, etcetera. That difference will also transform like a vector. We have seen many, many times that differences of coordinates transform like a vector. Divide it by this guy, now, $\Delta\tau$, which is the time according to a clock moving with the particle. Everybody agrees on that, because we’re not asking how much time elapsed according to you or according to me. We are going to quarrel about that indefinitely. We are asking, how much time passed according to the particle itself? So, no matter who computes the time, you will get the same answer, and that’s what you want to divide it by. So, I’m going to form a new quantity called velocity, which is the derivative of this with respect to $\tau$, $dX/d\tau$. And I’m going to use the chain rule and write it as $(dX/dt)(dt/d\tau)$. That becomes then $\frac{1}{\sqrt{1 - (v^2/c^2)}}\left(\frac{dX_0}{dt}, \frac{dX_1}{dt}, \frac{dX_2}{dt}, \ldots\right)$.

So, you find the rate of change as measured by a clock moving with the particle. And that has the virtue that what you get out of this process will also be a four-vector. By that I mean, its four components will transform when you go to a moving frame just like the four components of $X$. You remember when you took $x$ and $y$ and you rotated the axes, $x'$ is $x\cos\theta$ plus $y\sin\theta$ and so on. But if you take time derivatives, you’ll find the $x$ component of the velocity in the rotated frame is related to the $x$ and $y$ components in the old frame by the same cosines and the same sines, because the act of taking derivatives doesn’t change the way the object transforms. So, this is my new four-vector. In fact, let me put the third guy too: $dX_3/dt$. So, what I did was I took the $\tau$ derivative, for which we don’t have good intuition, and wrote it in terms of a $t$ derivative, for which we have a good intuition. Because no matter how much Einstein tells you about space and time, we think of time as different from space. So, we’ll think in terms of time derivatives. But to form a four-dimensional vector, it’s not enough to do that. That’s what I’m saying. Every term there is divided by this square root, because that’s the way of rewriting $d/d\tau$ as $d/dt$ times this factor. So now, I am ready to define what I’m going to call momentum in relativity. It’s going to be called the four-momentum. Everything is the four something. The position is the four-vector position; this is called the four-velocity. Now, I’m going to define something called the four-momentum. The four-momentum is going to be the mass of a particle when it’s sitting at rest, multiplied by this velocity. So, what is it going to be? Let me write it out. What is $dX_0/dt$? You guys remember $X_0 = ct$. Sorry?

Student: [inaudible]

Professor Ramamurti Shankar: $X_0 = ct$ and $dX_0/dt$ will be equal to $c$. You’re right. If that’s what you were saying, I agree with you. Yes. So, the first component of this vector is $m_0c/\sqrt{1 - (v^2/c^2)}$. Then the other guy, the other component, let me write simply as $m_0$ times the familiar velocity divided by $\sqrt{1 - (v^2/c^2)}$. If you wanted to keep the $x$, $y$, and $z$ velocities, you can keep them as a vector. Or, if you just wanted to live in two dimensions, one space and one time, drop the arrow. So, we have manufactured now a new beast. It’s got four components. What is it? I modeled it after the old momentum. I took the mass and I multiplied it by what’s going to pass for velocity in my new world. And I got this creature with four parts. You’ve got to understand what the four parts mean. It’s got a part that looks like an ordinary vector. And it’s got a part that has no vectors in it. That is in complete analogy with the fact that $X$ had a part that looked like a vector, the three spatial components, and a part that in the old days was called a scalar, because it didn’t transform. But of course, in the new world, everything got mixed up. We’ve got to know who this is and we have to know who this is. So, I want to show you what they are.
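One way to see that this new object really transforms like $X$ is to check it numerically. The sketch below is not from the lecture and rests on stated assumptions: one space dimension, units with $c = 1$, and the standard relativistic velocity transformation $w = (v - u)/(1 - uv/c^2)$, which is brought in from outside this passage. All function names are illustrative.

```python
import math

def gamma(beta):
    """The factor 1 / sqrt(1 - v^2/c^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def four_momentum(m0, beta_v):
    """(P0, P1) for rest mass m0 and particle speed beta_v = v/c, in units with c = 1."""
    g = gamma(beta_v)
    return m0 * g, m0 * g * beta_v

def boost(p0, p1, beta_u):
    """Apply the same Lorentz transformation used for (X0, X1) to the pair (p0, p1)."""
    g = gamma(beta_u)
    return g * (p0 - beta_u * p1), g * (p1 - beta_u * p0)

def velocity_in_moving_frame(beta_v, beta_u):
    """Assumed velocity transformation: particle speed seen from a frame moving at beta_u."""
    return (beta_v - beta_u) / (1.0 - beta_u * beta_v)

m0, beta_v, beta_u = 1.0, 0.8, 0.5

# Route 1: build the four-momentum in my frame, then Lorentz-transform its components.
transformed = boost(*four_momentum(m0, beta_v), beta_u)

# Route 2: first find the particle's velocity in your frame, then build the four-momentum there.
rebuilt = four_momentum(m0, velocity_in_moving_frame(beta_v, beta_u))

print("transformed components:", transformed)
print("rebuilt in new frame  :", rebuilt)
```

If the two routes agree up to rounding, the components mix under a change of frame exactly the way the components of $X$ do.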

So, here is my new four-vector: $m_0c/\sqrt{1 - (v^2/c^2)}$ and $m_0v/\sqrt{1 - (v^2/c^2)}$; let me drop the other components except this one in the $x$ direction. I may not worry about that now. I wrote that because I want to be able to call it a four-vector. It’s crazy to call this a four-vector. I should call it a two-vector. You guys have already seen two-vectors. Well, you have seen two-vectors in the $x,y$ plane. This is a two-vector in space-time. Or, it’s part of this big four-component object. It’s a four-vector in four-dimensional space-time. So, we have created something, and we’re trying to understand what we have created. So, let me look at this guy first. This guy looks like $m_0v$ divided by this factor. I don’t know what it stands for, but I say, well, relativistic physics should reduce to Newtonian physics when I look at slowly moving objects, because we know Newton was perfectly right when he studied slowly moving objects. If I go to a slowly moving object, so $v/c$ becomes negligible, I drop that, and that becomes $m_0v$; I know that’s the old momentum. So, this guy is just the old momentum properly corrected for relativistic theory. So, this quantity here deserves to be called momentum $p$; you can put an arrow on it, if you want, to keep all three components, or don’t put an arrow. So, of the four-vector, three components are just the momentum, but not defined in the old way. It’s not $m_0v$. It’s $m_0v$ divided by this. So, the momentum of a particle is very interesting in relativity. Even though nothing can go faster than light, that doesn’t mean the momentum has an upper limit at $m_0c$, because the momentum is not mass times velocity; it’s mass times velocity divided by this crazy factor. So, as you approach the speed of light, as $v$ approaches $c$, the denominator is very close to zero, and this number can become as big as you like.

So, why are people building bigger and bigger accelerators? If you ask them how fast a particle is moving, for everybody at Fermilab, at CERN, it’s all close to the velocity of light. It’s 99.9999 percent; the other person has a few more nines. Nothing is impressive when you look at velocity. But when you look at the momentum, it makes a big difference how much of the one you have subtracted on the bottom. If you’ve only got a difference in the 19th decimal place, well, you’ve got a huge momentum. So, particles have limited velocity in relativity, but unlimited momentum. It also means that to speed up this particle is going to take more and more force as it picks up speed. In other words, you’ll be pushing it like crazy. It won’t pick up speed, but its momentum will be going up. It’ll pick up speed, but this pickup in speed will be imperceptible because in the 19th decimal place of $v/c$, it’ll go from 99999 to something near the end. The last digit will change. But the momentum will change a lot. But I don’t want to stop before looking at this guy.
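The point about extra nines can be made concrete with a short sketch, not part of the lecture. It assumes the speed is written as $v/c = 1 - \epsilon$, so that $1 - v^2/c^2 = \epsilon(2 - \epsilon)$; computing the factor from `eps` directly is an implementation choice made here to avoid losing a 19th-decimal-place gap to floating-point rounding. The function names are illustrative.

```python
import math

def gamma_from_gap(eps):
    """1 / sqrt(1 - v^2/c^2) for v/c = 1 - eps, using 1 - (1 - eps)^2 = eps * (2 - eps)."""
    return 1.0 / math.sqrt(eps * (2.0 - eps))

def momentum_over_m0c(eps):
    """Relativistic momentum p = m0 v / sqrt(1 - v^2/c^2), expressed in units of m0 * c."""
    return (1.0 - eps) * gamma_from_gap(eps)

# Each step adds nines to v/c; the speed barely changes but the momentum keeps growing.
for eps in (1e-2, 1e-6, 1e-12, 1e-19):
    print(f"v/c = 1 - {eps:g}:  p / (m0 c) = {momentum_over_m0c(eps):.3e}")
```

The speed is capped at $c$ in every row, while the momentum grows without bound as the gap `eps` shrinks.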

Chapter 7: The New Energy-Mass Relation [01:04:40]

So, we don’t know who this is. So, we take this thing; this is the zeroth component of $P$. Just like I called it $X_0$ in the old days, I’m going to call it $P_0$. $P_0$ was $m_0c/\sqrt{1 - (v^2/c^2)}$. So, I don’t know what to make of this guy either. If I put $v = 0$, I get $m_0c$. That looks like the mass of a particle, but I have no idea. Okay, so it’s the mass of a particle. See, over there, when I put $v = 0$ on the bottom, because the top had a $v$, not $v/c$, something was left over that looked like something familiar, namely momentum. Here, if I take $v = 0$ too quickly, what’s left over is nothing reminiscent of anything in Newtonian mechanics. So, you want to go to what’s called the next order in $v/c$. You don’t want to totally ignore it. You want to keep the first non-zero term, so we write it as $m_0c\,[1 - (v^2/c^2)]^{-1/2}$. I’ve just written it with the $-\tfrac{1}{2}$ upstairs as the exponent. Then, I remind you of the good old formula: $(1 + x)^n = 1 + nx + \ldots$ if $x$ is very small. So, if you use that, you get $m_0c\,[1 + (v^2/2c^2) + \ldots]$, with higher powers of $v^2/c^2$.

So, what are these terms coming out? Well, there’s the first term, for which I have absolutely no intuition. The second term looks like $\tfrac{1}{2}m_0v^2/c$. So, for the first time, I see something familiar. I see the good old kinetic energy, but not quite, because I have this number $c$ here. So, I decided maybe I shouldn’t look at $P_0$, but I should look at $cP_0$. Maybe that looks like something more familiar. That looks like $m_0c^2 + \tfrac{1}{2}m_0v^2$ plus more and more terms depending on higher and higher powers of $v^2/c^2$. But now we know who this guy is. This is what we used to call the kinetic energy of a particle by virtue of its motion. As the particle picks up speed, the kinetic energy itself receives more corrections because the other terms with more and more powers of $v/c$ are not negligible. You’ve got to put them back in. Basically, you have to compute this object exactly. But at low velocities, this is the main term. So, this is what led Einstein, and mainly him, to realize that this quantity is talking about the energy of an object. This is certainly recognizable as energy. I can put it to good use, right? You run windmills and so on with kinetic energy, or hydroelectric power is from channeling kinetic energy of water into work. So, you figure this is also part of the energy.

So, the first conclusion of Einstein was, even a particle that’s not moving seems to have an energy, and that’s called the rest energy. If the particle is moving, then go back to the derivation and you can find $cP_0 = m_0c^2/\sqrt{1 - (v^2/c^2)}$. This is the full expression for the energy of a moving particle. Not the approximate one, but if you keep the whole thing, this is what you get. So, this says when a particle moves, it’s got an energy that looks like some velocity-dependent $mc^2$. And when the particle is at rest, it has got the energy $m_0c^2$, and that’s the origin of the big formula relating energy and mass. So, it leaves open a possibility even for a particle at rest. You think when a particle is at rest, you have squeezed everything you can get out of the particle, right? What more can it do for you? It gave all it’s got and it’s stopped. But according to Einstein, there is a lot of fun left. We can do something more if there’s a way to destroy this mass. So, his theory doesn’t tell you how you should do that. But it did point out that if you can get rid of the mass, because it was energy, some other form of energy must take its place by the law of conservation of energy.
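To see how good the low-velocity approximation is, the exact energy can be compared with the first few terms of its expansion. This is a minimal sketch, not part of the lecture, assuming units with $c = 1$ by default; the coefficients $1, \tfrac{1}{2}, \tfrac{3}{8}$ are the first binomial-series coefficients of $(1 - x)^{-1/2}$, and the function names are illustrative.

```python
import math

def energy_exact(m0, v, c=1.0):
    """Full relativistic energy c*P0 = m0 c^2 / sqrt(1 - v^2/c^2)."""
    return m0 * c**2 / math.sqrt(1.0 - (v / c)**2)

def energy_series(m0, v, c=1.0, terms=2):
    """Truncated expansion: rest energy, then kinetic energy, then the v^4 correction."""
    coeffs = [1.0, 0.5, 3.0 / 8.0]  # series coefficients of (1 - x)^(-1/2) in x = v^2/c^2
    x = (v / c)**2
    return m0 * c**2 * sum(coeffs[k] * x**k for k in range(terms))

m0 = 1.0
for v in (0.01, 0.1, 0.5):
    exact = energy_exact(m0, v)
    for terms in (1, 2, 3):
        approx = energy_series(m0, v, terms=terms)
        print(f"v/c = {v}: {terms} term(s) -> error {exact - approx:.3e}")
```

With `terms=1` only the rest energy $m_0c^2$ is kept, `terms=2` adds $\tfrac{1}{2}m_0v^2$, and `terms=3` adds $\tfrac{3}{8}m_0v^4/c^2$. The error shrinks with each added term at low speed, and grows as $v/c$ gets larger.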

So, that’s how all the nuclear reactions work. In nuclear reactions, you can take two parts, like two hydrogen atoms, hydrogen nuclei. You can fuse them into helium and you will find the helium’s mass is less than the mass of the two parts. So, there is some mass missing; that means there’s some energy missing. That’s the energy released in fusion. Or you can take a uranium nucleus and give it a tap and break it up into barium and krypton and some neutrons. And you will find the fragments together have a mass less than the parent. And the missing mass times $c^2$ is the energy of the reaction that comes out in the kinetic energy of the fast-moving particles. So, Einstein is wrongly called the father of the bomb, because this is as far as he went. He didn’t specify how to extract energy from mass, but showed very clearly that a particle at rest seems to have a term that should be called energy, because this guy certainly is. Yes?

Student: [inaudible]

Professor Ramamurti Shankar: It will be $\tfrac{3}{8}\,m_0c^2\,(v^4/c^4)$. I don’t carry more terms in my head, but frankly, you won’t need them.

Student: [inaudible]

Professor Ramamurti Shankar: There are more corrections to kinetic energy. In other words, why is this an important concept? If two particles collide, we believe energy is conserved. So, you want to add the energy before and compare it to the energy after. If you want them to perfectly match, if you stop here, your bookkeeping will not work. Before will not equal after. You’ve got to keep all this infinite number of terms to get it exactly right. But if you’re satisfied to one part in a million or something, maybe that’s as many terms as are needed. And at very low velocities, you don’t even need that, because the rest energy is never changing in a collision. Every particle has a rest mass. The kinetic energy is what we balanced when we collided. Remember, we collided blocks and we balanced the kinetic energy? Actually, the rest energy is there in every block, but it is not changing. It is canceling out in the before and after, unless the two blocks annihilated and disappeared; then you would really have something interesting. But you don’t have that in Newtonian physics. Okay.

So, I should tell you that I’ve given you guys two more problems, one of which is optional. You don’t have to do that, but it’s a very interesting problem. And I want you to think about it because it’s a problem where there are two rockets headed towards each other, both of the same length; say one meter. And this guy thinks he is at rest, and the second rocket is zooming past him like this. When the tail of this rocket passes the tip of this, he sends out a little torpedo aimed at this one. Clearly, it doesn’t do any harm. But look at it from the point of view of this person. This person says, “Well, I’m a big rocket. I’m moving like this, and when my tail met the tip of this rocket, he sent out an explosive, which certainly hit me down here.” So, the question is, “Does this upper rocket get attacked?” Does it get hit by the lower one or not? From one point of view, you miss. From the other point of view, when you expand this one and shrink this one, you definitely get hit. So, you’ve got to ask, I mean, do you get hit or don’t you get hit? You cannot have two answers to one question. So, I’ve given the problem. I’ve given you a lot of terms, and I think even though it’s optional, if you do that problem, you’ve got nothing to fear. Okay? It’ll give you good practice in how to think clearly about any number of these problems. And I’ve given you enough hints on how to do that.

[end of transcript]
