ECON 159: Game Theory
ECON 159 - Lecture 10 - Mixed Strategies in Baseball, Dating and Paying Your Taxes
Chapter 1. Mixed Strategy Equilibria: Example (Continued) [00:00:00]
Professor Ben Polak: All right, so last time we did something I think substantially harder than anything we’ve done in the class so far. We looked at mixed strategies, and in particular, we looked at mixed-strategy equilibria. There was a big idea last time. The big idea was if a player is playing a mixed strategy in equilibrium, then every pure strategy in the mix–that’s to say every pure strategy on which they place some positive weight–must also be a best response to what the other side is doing. Then we used that trick. We used it in this game here, to help us find Nash Equilibria. The way it allowed us to find the Nash Equilibria is that we knew that if, in this case, Venus Williams is mixing between left and right, it must be the case that her payoff to left is equal to that of right, and we used that to find Serena’s mix.
Conversely, since we knew that Serena is mixing again between l and r, we knew she must be indifferent between l and r and we used that to find Venus’ mix. So I want to go back to this example just for a few moments just to make one more point and then we’ll move on, but we’ll still be talking about mixed strategies throughout today. So this was the mix that we found before we changed the payoffs, we found that Venus’ equilibrium mix was .7, .3 and Serena’s equilibrium mix was .6, .4. And a reasonable question at this point would be, how do we know that’s really an equilibrium? We kind of found it but we didn’t kind of go back and check.
So what I want to do now is actually do that, do that missing step. We rushed it a bit last time because we wanted to get through all the material. Let’s actually check that in fact P* is a best response to Q*. So what I want to do is I want to check that Venus’ mix P* is a best response for Venus against Serena’s mix Q*. The way I’m going to do that is I’m going to look at payoffs that Venus gets now she knows–or rather now we know–she’s playing against Q*. So let’s look at Venus’ payoffs. I’m going to figure out her payoffs for L, her payoffs for R, and also her payoff for what she’s actually doing P*.
So Venus’ payoffs, if she chooses L against Q* then she gets–very similar to what we had on the board last week, but now I’m going to put in what Q* is explicitly–she gets 50 times .6. [This is Q* and this is 1-Q*.] So she gets 50 times .6 and 80 times 1 minus .6 which is .4, 80 times .4. We can work this out, and I worked it out at home, but if somebody has a calculator they can please check me. I think this comes to .62. Somebody should just check that. If Venus chose R–remember R here means shooting to Serena’s right, to Serena’s forehand–if she chose R then her payoffs are 90 Q*. So 90(.6) plus 20(1-Q*) so 20(.4), so 90(.6) plus 20(.4), and again I worked that out at home, and fortunately that also comes out at .62. So what’s Venus’ payoff for P*? We’ve got her payoff for both her pure strategies, so her payoff from actually choosing P* is what?
Well, P* is .7, so .7 of the time she will actually be playing L and when she plays L, she’ll get a payoff of .62, and .3 of the time she’ll be playing R, and once again, she’ll be getting a payoff of .62 and–do I have a calculator? Sorry, thank you. So P* is .7, yes, you’re absolutely right, so this is P* and 1-P*. So let’s make that clearer. I’ll show you what the equilibrium is but P* itself is .7. So when Venus plays L with probability of .7, then .7 of the time she’ll get the expected payoff of .62 and .3 of the time she’ll get a payoff again of .62 and that’s the kind of math I don’t have to do at home, that’s going to come out at .62. Again, assuming my math is correct. So all I’ve really done here is confirm what we did already last time.
We knew–we in fact chose Serena’s mix Q to make Venus indifferent between L and R. And that’s exactly what we found here, going left it’s .62, going right it gets .62 and hence P* gets .62. But I claim we can now see something a little bit else. We can now ask the question, is P* in fact the best response? Well, for it not to be a best response, for this not to be an equilibrium, there would have to be some deviation that Venus could make that would make her strictly better off. Let me repeat that. If this were not an equilibrium, there would have to be some deviation for Venus, that would make her strictly better off. By playing P* she’s getting a return of .62. So one thing she could deviate to, is playing L all the time. If she deviates to playing L all the time, her payoff is still .62 so she’s not strictly better off. That’s not a strictly profitable deviation. Another thing she could deviate to, is she could deviate to playing R. If she deviates to playing R, her payoff will be .62. Once again, she’s not strictly better off: she’s the same as she was before, so that’s not a strictly profitable deviation.
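[The check described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the lecture. It assumes the payoffs quoted above (50, 80, 90, 20) are percentages, so that 62 here corresponds to the .62 on the board; the function and variable names are hypothetical.]

```python
# Venus' payoffs as quoted in the lecture, read as percentages (assumption).
# Keys are (Venus' choice, Serena's choice).
VENUS_PAYOFF = {("L", "l"): 50, ("L", "r"): 80, ("R", "l"): 90, ("R", "r"): 20}

Q_STAR = 0.6  # Serena's equilibrium probability of choosing l
P_STAR = 0.7  # Venus' equilibrium probability of choosing L


def venus_pure_payoff(choice, q):
    """Expected payoff to Venus from a pure strategy when Serena plays l with probability q."""
    return VENUS_PAYOFF[(choice, "l")] * q + VENUS_PAYOFF[(choice, "r")] * (1 - q)


def venus_mixed_payoff(p, q):
    """Expected payoff to Venus from playing L with probability p and R with probability 1 - p."""
    return p * venus_pure_payoff("L", q) + (1 - p) * venus_pure_payoff("R", q)


payoff_L = venus_pure_payoff("L", Q_STAR)
payoff_R = venus_pure_payoff("R", Q_STAR)
payoff_mix = venus_mixed_payoff(P_STAR, Q_STAR)

# All three numbers should agree (up to floating-point rounding).
print(payoff_L, payoff_R, payoff_mix)
```

[Against Q*, both pure strategies and the mix P* give the same expected payoff, so neither pure-strategy deviation is strictly profitable.]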
So what have I shown so far? I’ve shown that P* is as good as playing L, and P* is as good as playing R. In fact that’s how we constructed it. So deviating to L is not a strictly profitable deviation and deviating to R is not a strictly profitable deviation. But at this point, somebody might ask and say, okay, you’ve shown me that there’s no way to deviate to a pure strategy in a strictly profitable way, but how about deviating to another mixed strategy? So, so far we’ve shown–we’ve shown just up here–we can see that Venus has no strictly profitable pure-strategy deviation. She has no strictly profitable pure-strategy deviation because each of her pure strategies yields the same payoff as did her mixed strategy, yields the same as P*. But how do we know that she doesn’t have a mixed strategy that would be strictly better? How do we know that? Anybody? No hands going up; oh, there was a hand up, good.
Student: Any mix between left and right will still yield .62.
Professor Ben Polak: Good, so any mix that Venus deviates to, will be a mix between L and R, and any mix between L and R will be a mix between .62 and .62 and hence will yield .62. So we’re going to use again, this fact we developed last week. The fact we developed last week was that any mixed strategy yields a payoff that is a weighted average of the pure strategy payoffs, the payoffs to the pure strategies in the mix. Any mixed strategy yields a payoff that is a weighted average of the payoff to the pure strategies in the mix. That was our key fact last week. So here if we’ve shown that there’s no pure-strategy deviation that’s strictly profitable, then there can’t be any mixed strategy deviation that’s strictly profitable.
Why? Because the mixed strategy deviations must yield payoffs that lie among the pure strategy deviations. So this is a great fact for us. What’s the lesson here? The lesson is we only ever have to check for strictly profitable pure-strategy deviations. That’s a good job. Why? Because if we had to check for mixed strategy deviations one by one, we’d be here all night, because there’s an infinite number of possible mixed-strategy deviations. But there aren’t so many pure strategy deviations we have to check. Let’s just repeat the idea. Suppose there isn’t any pure-strategy deviation that’s profitable, then there can’t be any mixed strategy deviation that’s profitable, because the highest expected return you could ever get from a mixed strategy, is one of the pure strategies in the mix, and you’ve already checked that none of those are profitable.
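[The weighted-average fact can be illustrated numerically. The sketch below is an illustration only, not a proof. It scans a grid of two-strategy mixes and compares the best mixed payoff with the best pure payoff. The payoff pair (70, 40) is a made-up example, and all names are hypothetical.]

```python
def mixed_payoff(weights, pure_payoffs):
    """Payoff to a mix: the weighted average of the pure-strategy payoffs."""
    return sum(w * u for w, u in zip(weights, pure_payoffs))


def best_mixed_on_grid(pure_payoffs, steps=100):
    """Highest payoff among mixes (k/steps, 1 - k/steps) over two pure strategies."""
    return max(
        mixed_payoff((k / steps, 1 - k / steps), pure_payoffs)
        for k in range(steps + 1)
    )


# Equal pure payoffs, as in the Venus example: every mix yields the same payoff.
equal_case = best_mixed_on_grid((62, 62))

# Unequal pure payoffs (hypothetical numbers): no mix beats the better pure strategy.
unequal_case = best_mixed_on_grid((70, 40))

print(equal_case, unequal_case, max(70, 40))
```

[Because a weighted average can never exceed its largest term, checking the pure strategies is enough; the grid scan only confirms this on sample points.]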
So this simple idea, the simple idea we developed last time, not only helps us to find Nash Equilibria, but also to check for Nash Equilibria. Now a lot of people I gathered from feedback from sections were left pretty confused last time. It’s a hard idea. Actually I looked at the tape over the weekend, I could see where it could be confusing. But it’s actually, I think what’s really confusing here–it wasn’t so much–I think it wasn’t so much that I could have been clearer though I’m sure I could have been. It’s that this is really a hard idea, this idea of mixed strategies. So we’re going to work on it again today, but I think one of the ideas that gets people confused, is the following idea.
They say, look we found Venus’ equilibrium mix by choosing a P and a 1-P to make Serena indifferent. We found Serena’s equilibrium mix by finding a Q and a 1-Q to make Venus indifferent and a natural question you hear people ask then is, why is Venus “trying to make Serena indifferent?” Why is Serena “trying to make Venus indifferent?” That’s not really the point here. It isn’t that Venus is trying to make Serena indifferent. It’s that in equilibrium, she is going to make Serena indifferent. It isn’t her goal in life to make Serena indifferent between l and r, and it isn’t Serena’s goal in life to make Venus indifferent between L and R, but in equilibrium it ends up that they make each other indifferent. The way that we can see that is that if Venus puts–we said last time it’s repeated–if Venus puts too much weight, more than .7 on L, then Serena just cheats to the left all the time, and that can’t possibly be an equilibrium. And if Venus puts too much weight on R, then Serena cheats to the right all the time and that can’t be an equilibrium. So it has to be that what Venus is doing is going to make Serena exactly indifferent and vice versa.
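[The argument about "too much weight" can be sketched in code. Serena's payoffs are not stated in this part of the lecture; the sketch assumes they are 100 minus Venus' payoffs (each point is won by one player or the other), which is consistent with the equilibrium mix of .7 quoted earlier. The names are hypothetical.]

```python
# Venus' payoffs as quoted in the lecture; keys are (Venus' choice, Serena's choice).
VENUS_PAYOFF = {("L", "l"): 50, ("L", "r"): 80, ("R", "l"): 90, ("R", "r"): 20}

# ASSUMPTION: Serena wins whatever share of points Venus does not.
SERENA_PAYOFF = {key: 100 - value for key, value in VENUS_PAYOFF.items()}


def serena_payoff(side, p):
    """Serena's expected payoff from leaning to `side` when Venus plays L with probability p."""
    return SERENA_PAYOFF[("L", side)] * p + SERENA_PAYOFF[("R", side)] * (1 - p)


def serena_best_response(p, tol=1e-9):
    """Return 'l', 'r', or 'indifferent' for a given probability p that Venus plays L."""
    left, right = serena_payoff("l", p), serena_payoff("r", p)
    if abs(left - right) < tol:
        return "indifferent"
    return "l" if left > right else "r"


for p in (0.6, 0.7, 0.8):
    print(p, serena_best_response(p))
```

[Under that assumption, any weight on L above .7 sends Serena to l every time, any weight below .7 sends her to r every time, and only at .7 is she willing to mix.]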
Chapter 2. Mixed Strategy Equilibria: Other Examples in Sports [00:12:49]
Now let’s see that idea in some other applications. Let’s talk about this a bit before we move on. So it turns out that some very natural applications for mixed-strategy equilibria arise in games, in sport. So let’s talk about a few now. Can anybody suggest some other places where we see randomization or at least mixed strategy in equilibria in sporting events? Let me actually grab the mike myself. Anybody here play football for example, and we’re talking American football now, the gridiron game, not the civilized type. Anyone play? Yes, so some of you play football. So where is the mixing involved in playing football? Where in equilibrium would we expect to see mixed strategies? There’s somebody down there can we go on and get them. So shout it out.
Student: Running game and passing game.
Professor Ben Polak: All right, the running game and the passing game. So a very simple idea whether to run or whether to pass when you have the ball is likely to end up as a mixed-strategy equilibrium. The defense is also randomizing between, for example, rushing the passer or playing a run defense. Is that right–this is not exactly a game I know a lot about, but I’m hoping I’m getting close enough here. It couldn’t possibly be a pure-strategy equilibrium, other than very extreme parts of the game, like at the end of the game perhaps, but for most of the game, it’s very unlikely to end up as a pure-strategy equilibrium. Much more likely that the offense is mixing between passing and running, and for that matter between going to the left, going to the right and going to the center, and the defense is also mixing between–over its types of defense. So we see that–for those people who were watching yesterday–we see that in football games.
Where do we see it else in sport, some other sports? I can’t have a room full of non-sports fans. How many of you ever watch any sports? Let’s raise some hands here–some of you do. So this is baseball playoff season. How many of you have been watching the baseball playoffs? Raise your hands if you’ve been watching the baseball playoffs. I’ll let you off, I know you should have been doing my homework. How many of you have been watching the playoffs instead? How many watched the Yankees game last night? Quite a few of you. So they haven’t been very exciting yet but we’re hoping that it’s going to get more exciting. So when you’re watching baseball what kind of things do you see where you just know that there must be mixed strategies involved? There must be randomization involved. Now I’ve got a few more hands out. Good, so you sir.
Student: Choosing how to pitch the ball.
Professor Ben Polak: Choosing how to pitch the ball. Enlarge a little bit more, say a bit more.
Student: Fast ball versus slider, versus change up, all sorts of different things.
Professor Ben Polak: All right, so there’s different ways of throwing the ball, and there’s going to be randomization from the pitcher, or at least it’s going to look like there’s randomization by the pitcher over whether to throw a fast ball or a curve ball or whatever. How is the hitter randomizing there? How is the hitter randomizing? Is the hitter randomizing at all? What’s the hitter doing while this is going on? Anybody? Yeah.
Student: He’s choosing whether to swing or not to swing.
Professor Ben Polak: Okay, he’s choosing whether to swing or not to swing, although presumably he can do that just after the ball’s thrown. So you sometimes hear the commentator say that that hitter was looking for a fast ball, is that right? Or looking for a curve ball. The hitter is trying to anticipate the pitch, is that right? This is not a game I played a lot of either–I played a little bit. You’re trying to anticipate where the ball is going to be thrown. So the type of ball you throw in baseball and the way in which the pitch being anticipated by the hitter, is likely to be a mixed strategy. What else is likely to be a mixed strategy in baseball? What else? Anybody here on the Yale baseball team? Okay, I’ve got one volunteer here. So what else, stand up for a second. Let’s have a Yale baseball team member, what’s your name?
Professor Ben Polak: Where do you play?
Student: I’m a pitcher.
Professor Ben Polak: You’re a pitcher, okay. So he’s not going to get on base now, so he’s not going to answer this. Suppose you did get on base, pitchers don’t often get on base. Let’s assume that happens, what might you randomize? There you are, you’re standing on base, what might you randomize about?
Student: Whether to steal second or not.
Professor Ben Polak: Right, whether to steal or not, whether to try and steal or not. Stay up a second. So the decision whether to try and steal or not is likely to end up being random. If you’re the pitcher, what can you do in response to that?
Student: You can either choose to try to pick them off or not.
Professor Ben Polak: What else? So one thing you can try and pick him off. What else?
Student: You can be quicker to the plate.
Professor Ben Polak: Quicker to the plate, what else?
Student: You can pitch out.
Professor Ben Polak: You can pitch out, what else? At least those three things, right?
Professor Ben Polak: At least those three things okay, thank you. I have an expert here, I’m glad I had an expert. So in this case we can see there’s randomization going on from the runner whether he attempts to steal the base or not, and by the pitcher on whether he throws the pitch out or whether he tries to throw, to get to the plate faster. So we see this in sport. We don’t see it well anticipated by sports commentators. Let me put this down a second. So in baseball, for example, you’ll sometimes see quite sophisticated statistical analyses of baseball in which somebody will have looked at base stealers across the major leagues and they’ll look at all the instances in which a player was on first base and in the position where you think they might steal, and they’ll look at what happened on every attempt to steal, whether they were in fact caught stealing or not, and they’ll try and measure the value of these things and they’ll see, the conclusion they’ll come to is something like this.
They’ll conclude that whether the guy stole or not, whether the guy attempted to steal or not, sorry, or whether he just sat on first base doesn’t seem to make much difference, they’ll say. They’ll say that the payoff for even great base stealers from attempting to steal, when you take into account the pick offs, versus just staying put, turns out to be roughly equal in terms of the impact on the game, and then they’ll draw–these analysts will then draw the following conclusion. They’ll say, oh look, speed or the ability to steal bases is therefore overrated in baseball.
How have they made a mistake? What’s the mistake they made there? So the premise was, let’s give them the premise, the premise was that when a base stealer is attempting to steal or not the expected return in terms of outcome of the game is roughly equal, whether they attempt to steal or don’t attempt to steal. The conclusion is, therefore stealing doesn’t seem such a big deal. What’s the mistake they’ve made? Yeah, let me borrow it again, sorry.
Student: The pitcher has to react differently in pitching when he knows that there’s a fast guy on base.
Professor Ben Polak: Good, so our pitcher has to react differently. Let’s talk to our pitcher again, so one thing our pitcher said was he wants to get to the plate faster. What does that mean getting to the plate faster? –Shout out so people can hear you.
Student: It means just getting the ball to the catcher as fast as possible so he has the best chance to throw out the runner.
Professor Ben Polak: Right, so you’re going to pitch from, you’re not going to do that funny windup thing, you’re not, thank you, you’re going to pitch from the stretch, I knew there was a term there somewhere. I’m learning American by being here. And you’re more likely to throw a fast ball, there’s some advantage in throwing a fast ball rather than a curve ball. Both actions of which, both having to move more towards fast balls and pitching to the stretch are actually costly for the pitcher. But we’ll get there in a second, let’s just back up a second, so that was good, that’s right. But let’s just back up a second. The premise of these commentators was what?
It was that the return to stealing, attempting to steal, seems to be roughly a wash. It seems to be that the expected return when this great base runner attempts to steal a base is roughly the same as the return when they don’t attempt to steal the base. But I claim we knew that was going to be the case. We didn’t have to go and look at the data. Why did we know that was going to be the case? How did we know that we were bound to find a return in that analysis that finds those things roughly equal? Yeah.
Student: If he is randomizing that means that the returns will be equal. If they weren’t equal he would just do one or the other all the time.
Professor Ben Polak: Good, excellent. Since we’re in a mixed strategy equilibrium, since he’s randomizing, it must be the case that the returns are equal. That’s the big idea here, that’s the thing we learned last time. If the player, and these are professional baseball players doing this, they’ve been very well trained, a lot of money has been spent on getting the tactics right. There’s people sitting there who are paid to get the tactics right. If it was the case that the return to base stealing wasn’t roughly equal when you attempt to steal or didn’t attempt to steal, then you shouldn’t be randomizing. Since you are randomizing it must be the case that the returns are roughly equal. So that’s the first thing to observe and the second thing to observe is what we just pointed out.
In fact, the value of having a fast base stealer on the team doesn’t show up in the expected return on the occasions on which he attempts to steal, or which he does not attempt to steal. It shows up where? It shows up in the fact that the pitching team changes their behavior to make it harder for this guy to steal by going faster to the plate, or throwing more fast balls. Where will that show up in the statistics? If you’re just a statistician like me, you just look at the data, where will that show up? I mean suppose I can’t keep track of every single pitch, I can’t actually observe all these fast balls, where will I see the effect of all these extra fast balls in pitching and from the stretch, in the data? Somebody?
It’s going to show up in the batting average of the guy who’s hitting behind the base stealer. The guy hitting behind the base stealer is going to have a higher batting average because he’s going to get more pitches which are fast balls to hit, and more pitches out of the stretch. So if you ignore that effect, you’re going to be in trouble. But we know, if we analyze this properly using Game Theory, we know we’re in a mixed strategy equilibrium. We know, in fact, the pitching team must be reacting to it. We know there must be a cost in doing that, and the cost turns up in the hitter behind. So when you’re watching the playoffs in the last– now I’m giving you permission to watch a bit of TV at night, after you’ve done my homework assignment, but before anyone else’s homework assignment–you can have a look at these baseball games and have a go at being a little bit better than the commentators who are working on them.
Chapter 3. Mixed Strategy Equilibria Interpretation 1: Literal Randomization [00:23:41]
So one application for mixed strategies is in sports, but not the only application. Let’s just talk about another application, a slightly more scary application. So after 9/11 there was a lot of talk in the U.S. about the placement of baggage checking machines at airports. Actually there’s still quite a lot of talk, but there was a lot of talk then about the placement of machines to search the luggage that goes onboard. The hand luggage was being searched anyway, but to search luggage going into the hold. It was pointed out at the time, this has changed since, there weren’t actually enough machines in the U.S., on the day after 9/11, to search every single bag that went into the hold. You’d hear discussions of the following type. You’d hear these experts on Nightline or whatever and they’d say: look there’s no point trying to do this, because if we put all our baggage searching machines at Logan Airport in Boston, for example, then the terrorists will simply move their attack to O’Hare and if we put them at O’Hare, then they’ll move their attack to Logan.
If we have enough to do both Logan and O’Hare then they’ll move their attack to some third airport. So there was a sense of doom in the air. It was kind of a depressing time anyway. There was a sense of doom in the air saying that if you put your baggage searching machines somewhere, all you do is cause the attempted terrorists, terrorists attempting to blow up the planes, to go elsewhere. And you hear the same things today about searching individuals as they go on the plane. For example, you’ll hear a discussion that says, if we only search men traveling alone, let’s say, then you’ll quickly end up with all the people carrying bombs being couples or women. Again, there’s this sense of doom, this sense that says it’s hopeless. Whatever we do we’re just going to force the terrorists to do something else but we won’t have gained anything.
So once again that’s wrong. What’s wrong about that one? What should we be doing in that setting? Let me come down again. What should they have done–in fact they did do–with those luggage/baggage searching machines when they were in short supply after 9/11? What do they do with searching people as they got on planes? What do they do? Well here’s what they didn’t do, they didn’t just put them at certain airports and announce they’re just at these airports. That would have been a crazy thing to do. That would have been hopeless–not entirely hopeless–but not wise. What should they have done? What did they do? Anybody want to guess? Yeah.
Student: In name they randomized who they were checking.
Professor Ben Polak: Right, so when they’re checking passengers, they’re going to randomly check passengers. When they’re checking, when they think about the baggage machines, a sensible thing to do is to put a big metal box at every single airport and say: we’re not going to tell you which of these boxes actually have baggage checking machines, which effectively is randomizing. From the point of view of the terrorists, they’re not going to know where the baggage checks are going on. That’s worth doing. It doesn’t–It isn’t going to perfectly eliminate, well unfortunately it isn’t probably going to perfectly eliminate all terrorist attacks, but it does make it harder for the terrorists. So randomization there–whether it’s literally randomizing over who is checked, or whether it’s “as it were” randomizing, by concealing where in fact you have placed those machines–can be very effective.
The hard thing, both in sports and in these military examples, is really mimicking randomization. It’s very hard for us as humans to do it, and there’s a famous story about a military commander, actually an English military commander during an insurgent war in, I think it was Malaysia after World War II, where again he had to worry about randomizing which convoys to protect. He figured out that randomizing was the right thing to do to try and protect these convoys as well as he could with small numbers of troops. And the way in which he randomized was he literally randomized. Every morning he put a bit of paper in his hand and he had somebody, had one of his sergeants, pick which hand the paper was in. So we do actually see these random strategies used. The reason we have to literally randomize is that it’s very difficult to mimic randomization otherwise, unless you’re a professional sports player.
Chapter 4. Mixed Strategy Equilibria Interpretation 2: Players’ Beliefs about Each Other’s Actions [00:28:17]
Okay, but it turns out that mixed strategy equilibria, and mixed strategies in general, are relevant beyond just these contexts in which you think of people literally randomizing. I want to look at a different context now. So I want to go back to a game we started a few weeks ago. This isn’t the same game. It’s a sequel. It’s a follow up in our exciting adventure of our dating couple in the classroom. Who were our dating couple, do we still have them here? That was the guy, who is the–yeah, there they are. They’re even sitting closer. What a success here. Can we get the camera on them a second? Stand up a second, thank you. Your name was?
Professor Ben Polak: David. And look at this. Is this romantic or what? David and your name is?
Professor Ben Polak: Nina and David, okay. I think we pretty much figured out last time that Nina’s Player I and David’s Player II, is that right? As we remember last time, I’ll pick on you in a second, you can sit down a second. So we figured out last time that they were going to try and go on a date and they had arranged to go to the movies. They picked out two, in fact three movies, but two that remained viable, and the problem was being typical Economics majors who are, are you both Economics majors, I think we figured that out? They are, look at that, so being typical Economics majors who are just hopeless at dating they had forgotten to tell each other which movie they’re going to.
So that, I don’t know if that worked out well or not, but now that life has moved on, they’re going to try it again, but this time taking advantage of fall in New England, rather than go to a movie, they’ve decided on some new activities. So they might either go apple picking or they might go to the Yale Rep and see a play. And so apple picking has its advantages: the fall weather, its local flavor, it has certain undertones about the Garden of Eden or something. I don’t know if you can use the term flavor, local or otherwise, for American apples but never mind. And the Yale Rep, Yale Rep is a good thing to do in New Haven, go to a play, I think it’s Richard II is showing now, is that it? Probably not a great “date play” but Economists are trying to show they have culture, so there it goes. And let’s assume the payoffs are like this.
Much as they were before, by which we mean that Nina wants to meet David but she would, given the choice she would rather meet David in the apple fields. And David who’s a dark personality, likes the sort of darker side of Shakespeare. And he also wants to meet Nina but he would rather meet at the Yale Rep. If that’s backwards I apologize to their preferences. But once again, because they’re still incompetent Economics majors, they’ve again forgotten to tell each other where they’re going. So let’s analyze this game again, we’ve figured out this was a coordination game last time or several weeks ago. And we know in this game, we know what the pure strategy Nash Equilibria are, so no prizes to be able to spot them. One of the Nash Equilibria in pure strategies, let’s put this in pure strategies, so one of the pure strategy Nash Equilibria is for them both to go apple picking and meet up in Bishop’s Orchard or whatever, and another pure strategy equilibrium is for them both to choose the Rep.
We’d figured out that if they were able to communicate, there’s really a pretty good chance of them managing to coordinate at one of these equilibria but we suspect, I think, that this is not all that’s going on here. It looks quite likely that come your next Saturday afternoon, when we send these guys out on their date, they’re going to fail to meet, it’s at least plausible. To test the plausibility of that, let’s ask them, have you been, have you managed to meet on a date yet? No, haven’t managed to meet on a date. See, so I’m proving the point that in fact they haven’t managed to at least coordinate an equilibrium yet. So it seems at least plausible that they’re going to fail to coordinate. It’s plausible they’re going to fail to coordinate. We’d like to sort of capture that idea, and the way we’re going to capture that idea is–let’s see if there’s another equilibrium in this game.
Well, there certainly isn’t another pure strategy equilibrium in this game is there? We know that. So if there’s another equilibrium it better be mixed. So let’s try and find a mixed Nash Equilibrium in this game, and remember this game is called Battle of the Sexes, it’s a famous game. This is Battle of the Sexes revisited. So how are we going to go about finding this mixed Nash Equilibrium in the game? We’ll interpret it later but let’s just work on finding it. So, in particular, I’m going to postulate the idea that Nina is going to mix P, 1 - P and David is going to mix Q, 1 - Q. So how do we go about finding David’s equilibrium mix Q, 1 - Q? What’s our trick from last week? Should be able to cold call at this point, but let’s not have to. How am I going to find Q, the equilibrium Q? Somebody?
Thank you, they can use Venus’ payoffs, good. So to find–it isn’t Venus’ payoffs–it’s Nina’s payoffs. Fair enough, sorry. So to find the Nash Equilibrium Q, to find the mix that David’s using we use Nina’s payoffs. So let’s do that. So, in particular, for Nina, if she goes apple picking then her payoff is 2 with probability Q if she meets David and 0 otherwise. If she goes to the Rep then her payoff is 1 if she meets David, sorry, need to be careful, let’s do it again. If she goes to the Rep her payoff is 0 if David goes apple picking with probability Q, and her payoff is 1 if she meets David at the Rep, which happens with probability 1 - Q, is that correct? So this is her payoff from apple picking and this is her payoff from seeing Richard II. And what do we know if Nina is indeed mixing, what do we know about these two payoffs? They must be equal. If Nina is in fact mixing, then these two things must be equal. And that means: what we’re saying is 2Q equals 1(1-Q) or Q equals 2/3, I guess it is. No it’s 1/3 sorry. Is that right? Q is 1/3.
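[The indifference condition 2Q = 1(1-Q) can be solved mechanically. The sketch below is a minimal illustration using exact fractions; the helper name `indifference_prob` and its argument order are hypothetical, and the payoffs are Nina's as stated above.]

```python
from fractions import Fraction


def indifference_prob(a, b, c, d):
    """Solve a*q + b*(1 - q) == c*q + d*(1 - q) for q.

    a, b: the mixing player's opponent's payoffs from her first pure strategy
          when the mixer plays his first / second option.
    c, d: the same for her second pure strategy.
    """
    denominator = (a - b) - (c - d)
    if denominator == 0:
        raise ValueError("payoff lines are parallel; no unique indifference point")
    return Fraction(d - b, denominator)


# Nina's payoffs: apple picking gives 2 if David is there (prob. Q), 0 otherwise;
# the Rep gives 0 if David goes apple picking, 1 if he is at the Rep.
Q = indifference_prob(2, 0, 0, 1)
print(Q, 1 - Q)
```

[With Nina's payoffs this gives Q = 1/3, matching the corrected value on the board.]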
Okay, so our guess is that if there’s a mixed strategy equilibrium it must be the case that David is assigning a probability 1/3 to going apple picking, which means he’s assigning probability 2/3 to his more favored activity which is going to see Richard II. What about, I’m going to pull these both down, okay, how do we find Nina’s mix? So to find the Nash Equilibrium P, to find Nina’s mix what do we do? What’s the trick? Somebody? Use David’s payoffs. So David’s payoffs, if he goes apple picking then he gets a payoff of 1 if he meets Nina there and 0 otherwise and if he goes to the Rep he gets a payoff of 0 if Nina’s gone apple picking, and he gets a payoff of 2 if he meets Nina at the Rep.
Once again, if David is indifferent it must be that these are equal. So if these are– if David is in fact mixing between apple picking and going to the Rep–it must be that these two are equal and if we set this out carefully we’ll get, let’s just see, we’ll get 1(P) equals 2(1-P), which is P equals 2/3 and 1-P equals 1/3. So here we have Nina assigning 2/3 to going apple picking, which in fact is her more favored thing and 1/3 to going to the Rep. Okay, so we just used the same trick as last time, let’s check that this is in fact an equilibrium. So, in particular, let’s check that it is in fact an equilibrium for Nina to choose 2/3, 1/3. Let’s check. So check that P equals 2/3 is in fact the best response for Nina. Let’s go back to Nina’s payoffs. For Nina, if she chose to go apple picking, her payoff now is 2 times Q but Q is equal to 1/3 plus 0(1-Q) and if she chooses to go to the Rep then her payoff is 0 with probability 1/3 and 1 with probability now 2/3.
All I’ve done is I’ve taken the lines I had before and substituted in now what we know must be the correct Q and 1-Q and this gives her a payoff of 2/3 in either case. If she chooses P, her payoff to P will be 2/3 of the time she’ll get the payoff from apple picking which is 2/3 and 1/3 of the time she’ll get the payoff from going to the Rep which is 2/3 for a total of 2/3. So Nina’s payoff from either of her pure strategies is 2/3. Her payoff from our claimed equilibrium mixed strategy is 2/3, so neither of her possible pure strategy deviations was profitable. They didn’t lose her anything either, but they weren’t profitable, and by the lesson we started the class with, that means there cannot be any strictly profitable mixed deviation either, so indeed, for Nina, P is a best response to Q. We can do the same for David but let’s not bother, it’s symmetric.
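The indifference trick and the check above can be sketched in a few lines of Python. This is only an illustration under the payoffs stated in the lecture; the helper name `indifference_mix` and the variable names are assumptions introduced here, not anything from the course.

```python
from fractions import Fraction

def indifference_mix(a, b, c, d):
    """Probability x on the opponent's first action that equalizes
    a*x + b*(1 - x) (own first action) and c*x + d*(1 - x) (own second action)."""
    return Fraction(d - b, a - b - c + d)

# Nina's payoffs: apple picking gives 2 if David picks apples, 0 otherwise;
# the Rep gives 0 if David picks apples, 1 otherwise.
q = indifference_mix(2, 0, 0, 1)  # David's probability of apple picking

# David's payoffs: apple picking gives 1 if Nina picks apples, 0 otherwise;
# the Rep gives 0 if Nina picks apples, 2 otherwise.
p = indifference_mix(1, 0, 0, 2)  # Nina's probability of apple picking

# Nina's expected payoffs against David's mix q.
nina_apple = 2 * q + 0 * (1 - q)
nina_rep = 0 * q + 1 * (1 - q)
nina_mix = p * nina_apple + (1 - p) * nina_rep

print(q, p)                            # 1/3 2/3
print(nina_apple, nina_rep, nina_mix)  # 2/3 2/3 2/3
```

Since both pure strategies and the mix all give Nina 2/3, no pure deviation is strictly profitable, which is the check carried out on the board.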
So in this game we found another equilibrium. The other equilibrium, the new equilibrium is Nina mixed 2/3, 1/3 and David mixed 1/3, 2/3 and we also know the payoff from this equilibrium. The payoff from this equilibrium, for both players, was 2/3. There are three equilibria in this game. They managed to meet at apple picking in which case the payoffs are 2 and 1. They managed to meet at the Rep, that’s the second pure strategy equilibrium, in which case the payoffs are 1 and 2, or they mixed, both of them mixed in this way, and their payoffs are 2/3, 2/3. Why is the payoff so bad in this mixed strategy equilibrium? Does everyone agree, this is a pretty lousy payoff? The other equilibrium payoffs the worst you got was 1 and you sometimes got 2, but now here you are playing a different equilibrium and at this different equilibrium you’re only getting 2/3. Why are you only getting–what happened? Why have these payoffs got pushed down so far? What’s happening to our poor hapless couple? Or not hapless I don’t know. What’s happening to our couple?
Student: Sometimes they don’t meet.
Professor Ben Polak: Yeah, they’re failing to meet. The reason, what’s forcing these payoffs down is they’re not meeting very often. How often are they actually meeting? How often are they meeting? Let’s have a look. Let’s go back to the previous board. Here it is. So they meet when they end up in this box or this box, is that right? So what’s the probability of them ending in those boxes? Well ending up in this box is probability 2/3 times 1/3 and ending up in this box is probability 1/3 times 2/3, is that right? You end up meeting apple picking, the 2/3 of the time when Nina goes there times the 1/3 of the time when David goes there. And you end up meeting at the Rep the 1/3 of the time Nina goes there times the 2/3 of the time that David goes there. So this is the total probability of meeting and it’s equal to 4/9, is that right? So 4/9 of the time they’re meeting, but 5/9 of the time–more than half the time–they’re screwing up and failing to meet. This is why I call them a hapless dating couple.
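The meeting probability can be reproduced with a short calculation. This is a sketch assuming the equilibrium mixes 2/3, 1/3 for Nina and 1/3, 2/3 for David from the lecture; the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(2, 3)  # Nina's probability of apple picking
q = Fraction(1, 3)  # David's probability of apple picking

meet_apple = p * q            # both go apple picking
meet_rep = (1 - p) * (1 - q)  # both go to the Rep
meet = meet_apple + meet_rep
miss = 1 - meet

# Nina gets 2 at apple picking, 1 at the Rep, and 0 if they miss each other.
nina_expected = 2 * meet_apple + 1 * meet_rep

print(meet, miss, nina_expected)  # 4/9 5/9 2/3
```

The expected payoff of 2/3 matches the equilibrium payoff found earlier, which confirms that the low payoff comes from the 5/9 chance of failing to meet.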
So this is a very bad equilibrium, but it captures something which is true about the game. What is surely true about this game is that if they just played this game, they wouldn’t meet all the time. In fact what we’re arguing here is they’d meet less than half of the time. But certainly this idea that we’re given from the pure strategy equilibria, that they would magically always manage to meet seems very unlikely, so this does seem to add a little bit of realism to this analysis of the game. However, it leads to a bit of an interpretation problem. You might ask the question why on Earth are they randomizing in this way. Why are they doing this? It’s bad for everybody. Why are they doing this? This leads us to think about a second interpretation for what we think mixed strategy equilibria are. Rather than thinking of them literally as randomizing, it’s probably better in this case to think about the following idea.
We need to think about David’s mixture as being a statement about what Nina believes David’s going to do. David may not be literally randomizing. But his mixture Q, 1–Q, we could think of as Nina’s belief about what David’s going to do. Conversely, Nina may not literally be randomizing. But her P, 1 - P, we could think of as David’s belief about what Nina’s going to do. And what we’ve done is we’ve found the beliefs such that these players are exactly indifferent over what they do. We found the beliefs for David over what Nina’s going to do, such that David doesn’t really quite know what to do. And we found the beliefs that Nina holds about what David’s going to do such that Nina doesn’t quite know what to do. That make sense? So it’s probably better here to think about this not as people literally randomizing but these mixed strategies being a statement about what people believe in equilibrium.
Chapter 5. Mixed Strategy Equilibria Interpretation 3: Prediction of Split on Two or More Courses of Action in a Large Population [00:47:04]
We’ll come back and look at this game some more later on, so our couple I’m afraid are not quite out of the woods yet. But I want to spend the rest of today looking at yet another interpretation of mixed strategy equilibria. So, so far we have two, we have people are literally randomizing. We have thinking of these as expressions about what people believe in equilibrium rather than what they’re literally doing. And now I’m going to give you a third interpretation. So for now we can get rid of the Venus and Serena game. So to motivate this third idea I want to think about tax audits. So none of you here have ever, probably ever, had to fill out a tax form, except for the fact that there seems to be a lot of parents in the room today, is it parents weekend, is that what’s going on? So where are the parents in the room? Wave your arms in the air if you’re a parent here. So at least these guys at probably some point in their life filled out a tax form.
So come tax day, the parents in the room face a choice, and the choice is are they going to honestly fill out their taxes, or are they going to cheat? I’m not going to ask them what they, well maybe I will, but for now I won’t ask them what they did. So they can choose one of two things. They can choose to pay their taxes honestly–we’ll call that H–or to cheat. This is the tax payer, the parent. And at the same time the audit office, the auditor, has to make a choice, and the auditor’s choice is whether to audit you or not and it’s not literally true because literally the auditor can wait until your tax return comes in and then decide whether to audit you. But for now let’s think of these choices being made simultaneously, and we’ll see why that makes it more interesting. So let me put down some payoffs here and then I’ll explain them. So 2, 0, 4, -10, 4, 0 and 0, 4. So how do we interpret this? Let’s look at the auditor’s payoffs first of all.
So the auditor is very happy not having to audit your parents and having your parents pay taxes, so we’ll give that a payoff of 4. It’ll turn out, in this game, we’ve decided in the payoffs, that the auditor is equally happy if she actually audits your parents in the year that they cheated. We’ll say that makes the auditor equally happy. Now the auditor is not so happy if she audits your parents when they’re honest because audits are costly. The auditor is really unhappy if she fails to audit when the parents cheated. Let’s look at the–I keep wanting to call them parents–I should stop calling them parents, let’s call them taxpayers. So for the taxpayers, what are their payoffs? Well we’ll normalize things, so if they’re honest we’ll give them a payoff of 0. That means they correctly fill in their tax form and pay what they’re supposed to pay, but if they can conceal some of their income, they pretend to have whatever it is, a third child, then they might be in trouble if they’re audited. If they’re audited they’re going to have to pay a big fine, maybe even go to jail, so that’s -10. Of course if they’re not audited they get to keep a chunk of money so we’ll call that 4.
Everyone understand the basic idea of this game? In reality, we could add more complications, we could think of different ways to cheat on your taxes, but I don’t want to give tutorials on how to cheat on your taxes here. So it’s not going to take long staring at this game to figure out that there are no pure strategy equilibria in this game. Let’s just do that, so from the taxpayer’s point of view, if they’re going to be audited, then they’d rather pay their taxes than not, and if they’re not going to be audited then according to these payoffs they’d rather cheat. From the auditor’s point of view, if they knew everyone was going to pay taxes, then they wouldn’t bother auditing and if they knew everyone was going to cheat, then they’d of course audit. So you can quickly see that there’s no box in which the best responses coincide, there’s no pure strategy Nash Equilibria.
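The claim that no box has coinciding best responses can be checked mechanically. The sketch below assumes the payoffs as read out in the lecture, with the auditor as the row player and the taxpayer as the column player; the dictionary layout and the function name `is_pure_nash` are illustrative choices.

```python
# payoffs[(auditor action, taxpayer action)] = (auditor payoff, taxpayer payoff)
payoffs = {
    ("audit", "honest"): (2, 0),
    ("audit", "cheat"): (4, -10),
    ("no audit", "honest"): (4, 0),
    ("no audit", "cheat"): (0, 4),
}
auditor_actions = ["audit", "no audit"]
taxpayer_actions = ["honest", "cheat"]

def is_pure_nash(a, t):
    """True if neither player can gain by a unilateral pure deviation."""
    auditor_ok = all(payoffs[(a, t)][0] >= payoffs[(other, t)][0]
                     for other in auditor_actions)
    taxpayer_ok = all(payoffs[(a, t)][1] >= payoffs[(a, other)][1]
                      for other in taxpayer_actions)
    return auditor_ok and taxpayer_ok

pure_equilibria = [(a, t) for a in auditor_actions for t in taxpayer_actions
                   if is_pure_nash(a, t)]
print(pure_equilibria)  # []
```

The empty list says that in every box at least one player wants to deviate, so any equilibrium has to be mixed.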
For those people who are thinking this is seeming other worldly, you will have to pay taxes in a couple of years, and trust me your parents are paying taxes now. So what we want to do here is we’re going to solve out and find a mixed strategy equilibrium, but we’re going to give it a different interpretation to the equilibria we found so far. But the basic, initial exercise is what? We’re going to find–we’re going to try and find the equilibrium here. So to find the Nash Equilibrium here we know it’s going to be mixed. So to find the probability with which taxpayers pay their taxes–and let me already start getting ahead of myself and just say to find the proportion of taxpayers who are going to pay their taxes–what do we do? What must be true of that equilibrium proportion Q of taxpayers who pay their taxes? How am I going to find that Q? Shout it out somebody. Yeah look at the auditor’s payoffs.
So from the auditor’s point of view, if the auditor audits, their payoff is 2Q plus 4(1-Q) and if they don’t audit their payoff is 4Q plus 0(1-Q). Everyone see how I do this, this is 2Q plus 4(1-Q) and this is 4Q plus 0(1-Q). And if indeed the auditor is mixing, then these must be equal. And if they’re equal, let’s just do a little bit of algebra here and we’ll find that 2Q equals 4(1-Q) so Q equals 2/3, is that right? So our claim is to make the auditor exactly indifferent between whether to audit or not, it must be the case that 2/3 of the parents of the kids in the room, are going to be paying their taxes honestly, which means 1/3 aren’t, which is kind of worrying, but never mind. Let’s have a look at the taxpayer. To find, sorry. We found the taxpayer, we found the proportion of taxpayers who are paying their taxes, now I want to find out the probability of being audited.
How do I figure out the equilibrium probability of being audited in this model? How do I work out the equilibrium probability of being audited? Shout it out. So for the equilibrium probability of being audited we’re going to use P and 1-P, so P is going to be the probability of being audited, how do I find P? Yeah, I’m going to look at the taxpayer’s payoffs. So from the taxpayer’s point of view, if the taxpayer pays their taxes, their payoff is just 0, and if they cheat their payoff is -10P plus 4(1-P). And if indeed the taxpayers are mixing–or in other words, we are saying that not all taxpayers are cheating and not all taxpayers are honestly paying their taxes–then these must be equal. So if these are equal I’m going to get 4P equals 14–no it didn’t– I’m going to get 4 equals 14P, let’s try again, 4 equals 14P, that was a bit worrying, 4 equals 14P, which is the same as saying P equals 2/7. If somebody can just check my algebra I think that’s right.
So my claim is that the equilibrium here is for 2/3 of the taxpayers to pay their taxes and for the audits, the auditor, to audit 2/7 of the time. Now we could go back in here and we could check, I could do what I did before, I could plug the Ps and Qs in here and check that in fact this is an equilibrium, but trust me that I’ve done that, trust me that it’s okay. So here we have an equilibrium, let’s just write down what it is. From the auditor’s point of view it is that they audit 2/7 of the time, or 2/7 of the population, and from the taxpayers’ point of view, it’s that they pay their taxes honestly 2/3 of the time and not otherwise. Now without focusing too much on these exact numbers for a second, I want to focus first for a minute on how do we interpret this mixed strategy equilibrium.
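The check that is skipped here can be sketched as follows. It plugs Q = 2/3 and P = 2/7 back into the payoff expressions from the lecture and confirms that each side is indifferent; the variable names are illustrative.

```python
from fractions import Fraction

q = Fraction(2, 3)  # proportion of taxpayers who pay honestly
p = Fraction(2, 7)  # probability of being audited

# Auditor's expected payoffs against the taxpayers' mix q.
audit = 2 * q + 4 * (1 - q)
no_audit = 4 * q + 0 * (1 - q)

# Taxpayer's expected payoffs against the auditor's mix p.
honest = Fraction(0)
cheat = -10 * p + 4 * (1 - p)

print(audit, no_audit)  # 8/3 8/3
print(honest, cheat)    # 0 0
```

Each player has only two pure strategies and is indifferent between them, so there is no strictly profitable pure deviation for either side.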
So from the point of view of the auditor we’re really back where we were before with the base stealer or the person who’s searching baggage at the airport. We could think of the auditor literally as randomizing. In fact, there’s some truth to that. It actually is the case, by law, that the auditors literally have to randomize. So this 2/7, 5/7 this has the same interpretation as we had before. This is really a randomization. But this 2/3, 1/3 has a different interpretation and a potentially exciting interpretation. It isn’t that we think that your parents get to tax day, work out what their taxes would be and then toss a coin. They may be doing that, I’m looking at the parents and I don’t think that’s what they’re doing. The interpretation here is that the parents, some parents are paying their taxes and some parents aren’t paying their taxes. There’s a lot of parents out there, a lot of potential taxpayers, and in the population, in equilibrium, if these numbers were true, 2/3 of parents would be paying their taxes and 1/3 would be cheating.
So this is a randomization by a player, and this is a mixture in the population. The new interpretation here is, we could think of the mixed strategy not as players randomizing, but as a mix in a large population of which some people are doing one thing and the other group are doing the other. It’s a proportion of people paying taxes. So I don’t know if this 2/3, 1/3 is an accurate number for the U.S. It’s probably not very far off actually. For Italy I’m ashamed to say the number of people who pay taxes is more like 40%, maybe even lower now, and there are countries I think where it gets as high as 90%. I think the U.S. rate when they end up auditing is a little higher than this but not much. So again, we’re going to think of this not as randomization but as a prediction of the proportion of American taxpayers who are going to pay their taxes.
Chapter 6. Mixed Strategy Equilibria: Policy Applications [00:59:52]
Now, I want to use this example in the time we have left, to actually think about a policy experiment. So let’s put this up somewhere we can see it. Let’s think about a new tax policy. So suppose that Congress gets fed up with all these newspaper reports about how 2/3 of Americans don’t pay their taxes or whatever the true proportion is, I think it’s actually a little higher than that but never mind. They get fed up with all these reports and they say, this isn’t fair, we should make people pay their taxes so we’re going to change the law and instead of paying–instead of being in jail for ten years, or the equivalent of a fine of -10 if you’re caught cheating, we’re going to raise the fine or the time in jail so that it’s now -20. So the policy experiment is let’s raise the fine–the fine for cheating–to -20 and the aim of this policy is to try to deter cheating, right? It seems a plausible thing for a government to want to do.
Let’s redraw the matrix, so here’s the game - 2, 0, 4, -20, 4, 0, 0, 4 audit, not audit and pay honestly or cheating. So here are our new payoffs and let’s ask the question, with this new fine in place, now we’ve raised the fine for being caught not paying your taxes, in the long run once things have worked their way back into equilibrium again, after a few years, do we expect American taxpaying compliance to go up or to go down, or what do we expect? What do we think is going to happen? So who thinks it’s going to go up? Who thinks it’s going to go down? Who thinks it’s going to stay the same? Who’s abstaining here? I notice the parents are abstaining. You’re not really meant to abstain. You have to vote here. Well how are we going to figure this out? How are we going to figure out what’s going to happen to compliance? What happens to tax compliance? Tax compliance, remember that was our P–no it wasn’t, sorry, it was our Q. The only way we’re going to figure this out is to work out, so let’s work out the new Q in equilibrium.
Let’s do this, so to find out the new Q in equilibrium, once again, we’re going to have to look at the auditor’s payoffs, and the auditor’s payoffs if they audit, they’re going to get 2Q plus 4(1-Q), and if they don’t audit they’re going to get 4Q plus 0(1-Q), and if the auditor is indifferent, if they’re mixing, it must still be the case that these are equal. And now I want to ask you a question, where have you seen that equation before? Yeah, it’s still there right, I didn’t delete it. It’s the same equation that sits up there. Is that right? From the auditor’s point of view, given the payoffs to the auditors nothing has changed, so the tax compliance rate that makes the auditor exactly indifferent between auditing your parents and not auditing your parents, is still exactly the same as it was before at 2/3. In equilibrium, tax compliance hasn’t changed at all.
Let me say that again, the policy was we’re going to double the fines for being caught cheating and in equilibrium it made absolutely no difference whatsoever to the equilibrium tax compliance rate. Now why did it make no difference? Well let’s have a techie answer and then a better, a more intuitive answer. The techie answer is this, what determines the equilibrium tax compliance rate, what determines the equilibrium mix for the column player is what? Is the row’s payoffs. What determines the equilibrium mix for the column player are the row’s payoffs–row player’s payoffs. We didn’t change the row player’s payoffs, so we’re not going to change the equilibrium mix for the column player. Say again, we changed one of the payoffs for the column player but the column player’s equilibrium mix depends on the row player’s payoffs and we haven’t changed the row player’s payoffs, so we won’t change the equilibrium compliance rate, the equilibrium mix by the column player.
What will have changed here? What will have changed in the new equilibrium? So we’ve pretty much established that people are cheating as much as they were before in equilibrium. Rahul, can I get Henry here?
Student: Probability has changed.
Professor Ben Polak: Say again.
Student: The probability of audit would have changed.
Professor Ben Polak: The probability of audit will have changed. What’s going to change is not the Q but the P, the probability with which you’re audited is going to change in this model. Let’s just check it, to find the new P, I need to look at the taxpayer’s payoffs and the taxpayer’s payoffs are now 0, –sorry, if they pay their taxes honestly then they get 0, and if they cheat they get -20 with probability P and 4 with probability 1-P. If they’re mixing, if some of them are paying and some of them are not, this must be the same, and I’m being more careful than I was last time I hope, this gives me 24 P is equal to 4 or P equals 1/6. So the audit rate has gone down from 2/7 to 1/6. I’m guessing that probably wasn’t the goal of the policy although it isn’t necessarily a bad thing. There is some benefit for society here, because audits are costly, both to do for the auditor and they’re unpleasant to be audited, so the fact that we’ve managed to lower the audit rate from 2/7 to 1/6 is a good thing, but we didn’t manage to raise the compliance rate.
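The policy experiment can be replayed numerically. The sketch below treats the fine as a parameter and solves the two indifference conditions from the lecture; the function names `audit_rate` and `compliance_rate` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction

def audit_rate(fine, gain=4):
    """Audit probability P solving 0 = -fine*P + gain*(1 - P)."""
    return Fraction(gain, gain + fine)

def compliance_rate():
    """Compliance Q solving 2Q + 4(1 - Q) = 4Q; the fine does not appear."""
    return Fraction(4, 6)

for fine in (10, 20):
    print(fine, audit_rate(fine), compliance_rate())
# 10 2/7 2/3
# 20 1/6 2/3
```

Doubling the fine moves only the audit rate, from 2/7 to 1/6, because the fine enters the taxpayer's payoffs and those pin down the auditor's mix, not the taxpayers' own.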
So I don’t want to take this model too literally, because it’s just a toy model, but nevertheless, let’s try and draw out some lessons from this model. So here what we did was we changed the payoff to cheating, we made it worse. But a different kind of change is we could have changed, sorry, we changed the payoff negatively to being caught cheating. But a different change we could have done is we could have left the -10 in place and we could have raised the payoff to cheating and not getting caught. We could have left this -10 in place and changed this 4 let’s say to a 6 or an 8. We’ve increased the benefits to cheating if you’re not caught. What would that have done in equilibrium? So I claim, once again, that would have done nothing in equilibrium to the probability of people paying their taxes, but that would have done what to the audit rate? The audit rate would have gone up, the equilibrium audit rate would have gone up.
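To see the size of this effect, here is a small sketch that keeps the fine at 10 and varies the payoff to cheating without being caught. The values 4, 6 and 8 are the ones mentioned in the lecture; the function name `audit_rate` is an illustrative assumption.

```python
from fractions import Fraction

def audit_rate(gain, fine=10):
    """Audit probability P solving 0 = -fine*P + gain*(1 - P)."""
    return Fraction(gain, gain + fine)

for gain in (4, 6, 8):
    print(gain, audit_rate(gain))
# 4 2/7
# 6 3/8
# 8 4/9
```

The audit rate climbs from 2/7 to 3/8 to 4/9 as the gain from undetected cheating grows, while the compliance rate is untouched because the auditor's payoffs are the same in all three cases.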
Let’s tell that story a second. So rich people, people who are well paid, have a little bit more to gain from cheating on their taxes if they’re not caught, there’s more money at stake. So my colleagues who are finance professors in the business school have more money on their tax returns than I do, so in principle, they gain more if they cheat. Does that mean that they cheat more than me in equilibrium? No, it doesn’t mean that they cheat more than me in equilibrium. What does it mean? It means they get audited more often. In equilibrium, richer people aren’t necessarily going to cheat more, but they are going to get audited more, and that’s true. The federal audit rates are designed so they audit the rich more than they audit the poor. Again, it’s not because they think the rich are inherently less honest, or the poor are inherently more honest, or anything like that, it’s simply that the gains to cheating and not getting caught are bigger if you’re rich, so you need to audit more to push back into equilibrium.
Now, suppose we did in fact want to use the policy of raising fines to push down, to push up the compliance rate, to push down cheating. How would we change the law? Suppose we want to raise the fines for cheating, we don’t like people cheating so we raise the fines, but we’re worried about this result that didn’t push up compliance rates, how could we change the law or change the incentives in the game so that it actually would change compliance rates? What could we do? Yeah.
Student: If we changed the payoffs of auditing to 4, from 2 to 4.
Professor Ben Polak: Good, if we want to change the compliance rates we should change the payoffs to the auditor. The problem with the way the auditor is paid here is that the auditor is paid more if they manage to catch people, but audits are costly. The problem with that is when you raise the fine on the other side, all that happens is the auditors audit less often in equilibrium. So if you want to get a higher compliance rate, one thing you could do is change the payoffs to the auditor to make auditing less costly for them, or make catching people nicer for them, give them a reward, or you could simply take it out of Game Theory altogether. You could enforce, you could have congressional law that sets the audit rates outside of equilibrium, and that’s been much discussed in Congress over the last five years. Somebody setting audit rates, as it were, exogenously by Congress. Why might that not be a great idea? Leaving aside Economic Theory for a second, leaving aside Game Theory, why might it not be a great idea to have Congress set the audit rates rather than some office?
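As a rough illustration of why the auditor's payoffs are the right lever, the sketch below varies the auditor's payoff from auditing an honest taxpayer (2 in the original game) and re-solves the auditor's indifference condition. The intermediate value 3 is a hypothetical number chosen here, 4 is the student's suggestion, and the function name `compliance_rate` is an assumption.

```python
from fractions import Fraction

def compliance_rate(audit_honest):
    """Compliance Q solving audit_honest*Q + 4(1 - Q) = 4Q + 0(1 - Q)."""
    return Fraction(4, 1) / (8 - Fraction(audit_honest))

for audit_honest in (2, 3, 4):
    print(audit_honest, compliance_rate(audit_honest))
# 2 2/3
# 3 4/5
# 4 1
```

Making audits of honest taxpayers less costly raises the compliance rate at which the auditor is indifferent. At a payoff of 4 the indifference condition holds only at Q = 1, so this simple formula is best read as a sketch for payoffs between 2 and 4.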
Student: Most members in Congress have a lot of money so they’re going to lower the audit rates so that they don’t get audited.
Professor Ben Polak: So the lady in the front is saying that a lot of Congressmen are rather rich, so maybe they have particular incentives here. I don’t want to take a particular political stance here, but it could be whatever side of the political spectrum you guys sit on, it could be that you might not trust Congress to get this right. You might think there are going to be political considerations going on in Congress other than just having an efficient tax system. Okay, so what do I want to draw out as lessons here? The big lessons from this class are there are three different ways to think about randomization in equilibrium or out of equilibrium. One is it’s genuinely randomization, another is it could be something about people’s beliefs, and a third way and a very important way is it could be telling us something about the proportion of people who are doing something in society, in this case the proportion of people who are paying tax.
A second important lesson I want to draw out here, beyond just finding equilibria, two other things we drew out today, one lesson was when you’re checking equilibria, checking mixed strategy equilibria, you only have to check for pure strategy deviations. Be careful, you have to check for all possible pure strategy deviations, not just the pure strategies that were involved in the mix. If the guy has seven strategies and is only mixing on two, you have to remember to check the other five. The third lesson I want to draw out today is because of the way equilibria works, mixed strategy equilibria work, if I change the column player’s payoffs it changes the row player’s equilibrium mix, and if I change the row player’s payoffs, it changes the column player’s equilibrium mix. Next time, we’re going to pick up this idea that mixed strategies can be about proportions of people playing things and take it to a totally different setting, namely evolution. So on Wednesday we’ll start talking about evolution.
[end of transcript]