Being Superrational


Game theory tends to view people as “rational” decision makers.  This means we (the economists) expect people to act in their own self-interest, selfishly choosing what benefits them the most.  

Interestingly, there are situations in game/decision theory (the two aren’t actually the same thing, but the distinction isn’t too important here) where acting like a traditional rational agent, as prescribed by economists, leaves you worse off than if you had acted “irrationally”.

Which brings up the question: “If being rational means optimizing for yourself, but irrational agents can do a better job than you, are you actually being rational?”

Newcomb’s Problem is probably the most colorful example of this type of situation.  But we don’t need to go so far as to imagine all-powerful aliens and weird setups with boxes and simulations.  The good old Prisoner’s Dilemma works well enough to demonstrate some of textbook rationality’s shortcomings.


As a quick recap, you can see the outcomes by following the matrix.  The numbers correspond to the player’s payoff, and each person wants to maximize their payoff.  Lastly, there is no communication between the two players.

Now imagine you’re an economist, facing this dilemma, against another economist.  You want to do the best for yourself, and it’s obvious that choosing to Defect is the better move.  If your opponent cooperates, you get more than if you had cooperated.  If they also defect, you still get more than if you had cooperated.  It looks like you can’t lose!

In fact, because you know the other player is also a selfish economist, you can be sure they’ll be choosing Defect.  So you choose Defect as well (which obviously trumps picking Cooperate), and pat yourself on the back for choosing the rational option.

Except that Defect-Defect nets you only 1 point each, which is objectively worse than Cooperate-Cooperate, which nets you 3 points each.
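To make the economist’s argument concrete, here is a minimal Python sketch.  The 3 and 1 payoffs are the ones quoted above; the 5 and 0 for the lopsided outcomes are assumed conventional values, not necessarily the ones in the matrix, and the helper names are made up for illustration.

```python
C, D = "Cooperate", "Defect"

# Payoffs as (my points, opponent's points).
# (3, 3) and (1, 1) come from the text; 5 and 0 are ASSUMED conventional values.
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def my_payoff(mine, theirs):
    """Points I receive when I play `mine` and my opponent plays `theirs`."""
    return PAYOFFS[(mine, theirs)][0]


def best_reply(theirs):
    """The move that maximizes my payoff if the opponent's move is held fixed."""
    return max((C, D), key=lambda mine: my_payoff(mine, theirs))


# Defect is the best reply whatever the opponent does...
print(best_reply(C), best_reply(D))        # Defect Defect
# ...yet two defectors each score less than two cooperators.
print(my_payoff(D, D), my_payoff(C, C))    # 1 3
```

Under these assumed numbers, Defect wins each column-by-column comparison, which is exactly why both economists land on the worse diagonal outcome.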

“But,” you exclaim, “I’m a textbook economist!  I can’t do better than that!  I’m following prescribed rational behavior!  How could I have done better?”

Alas, if you’re an economist as described, you may be doomed to fare less than optimally on such problems where your textbook rationality is inadequate.  

Now let’s imagine an Economist v2, who acts just as selfishly as our normal economist.  On Prisoner’s Dilemma-type games, though, we define our Economist v2 to Cooperate if and only if they are facing another Economist v2.  Otherwise, they Defect.  Against copies of itself, our Economist v2 fares better than the traditional economist does against other traditional economists: Cooperate-Cooperate is the resultant outcome.

What our Economist v2 does that the standard economist does not is factor its opponent’s decision-making algorithm into its own calculations.  A regular economist knows their opponent is an economist and acts no differently.  This type of recursive thinking is the basic concept underlying superrationality.
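Here is one rough way to sketch the two kinds of economist in Python.  Each agent is a function that gets to look at its opponent before choosing a move; that setup, the function names, and the 5/0 payoffs for the lopsided outcomes are all assumptions made for illustration.

```python
C, D = "Cooperate", "Defect"

# (3, 3) and (1, 1) come from the text; 5 and 0 are ASSUMED conventional values.
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def economist(opponent):
    """Textbook economist: defects no matter who the opponent is."""
    return D


def economist_v2(opponent):
    """Economist v2: cooperates if and only if the opponent is also an Economist v2."""
    return C if opponent is economist_v2 else D


def play(a, b):
    """Each agent inspects the other, then both payoffs are looked up."""
    return PAYOFFS[(a(b), b(a))]


print(play(economist, economist))          # (1, 1)
print(play(economist_v2, economist_v2))    # (3, 3)
print(play(economist_v2, economist))       # (1, 1)
```

Note that the Economist v2 is never exploited by the plain economist in this sketch: it only cooperates when it knows the cooperation will be returned.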

Superrationality is a term coined by Douglas Hofstadter (whom I reference fairly often).  The basic idea is that superrational agents consider the thought processes of other superrational opponents (who will in turn factor in their own thoughts) when making decisions.

(Sort of like Kant’s categorical imperative.)

Imagine you’re playing the Prisoner’s Dilemma against someone else who’s about the same intelligence as you.  You want to win, so at first you think Defect is the better choice.  But if your opponent is just as smart as you are, they’ve probably also thought that Defect is the best choice.  Rats!

“If only there was a way to Cooperate…” you think.  Then you realize that if you’re trying to figure out how to do better than Defect-Defect, so is your opponent.  So if you decide to Cooperate, they will most likely do the same… and they’re probably thinking the same thing right now!

What if they realize this and still try to double-cross you by picking Defect?  Well, if you’ve thought of this, they’ve probably thought of it too: your opponent is as scared of you defecting as you are of them.  At this point, it’s clear that second-guessing the other person leads to infinite regress.  Yet, they also see that.  Which means that if you both want to exit this loop, there has to be some fixed choice.

And Cooperate-Cooperate is definitely better than Defect-Defect.  So even though you two aren’t running the exact same brain, understanding that you will most likely still come to the same conclusions, by virtue of your similar intelligence, means you have to choose between the two fixed-points of Cooperate-Cooperate or Defect-Defect.

So you choose Cooperate, and your opponent does the same.

This is the basic idea behind superrationality.  Even though there is no causal connection (which is what the Causal Decision Theorist economists look at), there is still a correlation between what’s happening in your brain and your opponent’s brain.  And this very fact is recognized by both brains.
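The contrast between the two styles of reasoning can be sketched in a few lines of Python.  This is only an illustration under a strong assumption, namely that both players really do end up making the same move, so only the symmetric outcomes are on the table; the function names and the 5/0 payoffs are assumed, not taken from anywhere in particular.

```python
C, D = "Cooperate", "Defect"
MOVES = (C, D)

# (3, 3) and (1, 1) come from the text; 5 and 0 are ASSUMED conventional values.
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def dominant_move(payoffs):
    """Textbook reasoning: find a move that is a best reply to every opponent move."""
    for mine in MOVES:
        if all(payoffs[(mine, theirs)][0] >= payoffs[(other, theirs)][0]
               for theirs in MOVES for other in MOVES):
            return mine
    return None  # no dominant move exists


def superrational_choice(payoffs):
    """Superrational reasoning, ASSUMING the opponent mirrors my move:
    only (m, m) outcomes are reachable, so pick the best one of those."""
    return max(MOVES, key=lambda m: payoffs[(m, m)][0])


print(dominant_move(PAYOFFS))          # Defect
print(superrational_choice(PAYOFFS))   # Cooperate
```

The whole disagreement lives in which outcomes each function treats as reachable: the first compares moves with the opponent’s move held fixed, while the second assumes the two moves are correlated.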

Lots of other people have explored these ideas, and the above is a general rehash of other works.  Some selected works are linked below, if you want to read more about superrationality, or decision theory in general:

  1. An introduction to Causal Decision Theory, Evidential Decision Theory, and Causal Decision Theory.
  2. Hofstadter’s Original Discussion of Superrationality.
  3. People Smarter Than Me Talking About This On Stack Exchange
  4. Someone Else Smart Talking About Superrationality


  1. I find the idea of superrationality appealing… but I just don’t feel it works in this case.
    I agree that second-guessing your opponent leads to an infinite regress, but except in the case of bots with the same source code, I don’t think you necessarily reach a symmetric fixed point.
    I think there’s enough random variation that you might flip-flop and end up cooperating while your opponent thinks of the same things you do, decides you’re a superrationalist, figures that you’re going to cooperate, and then defects.

    Thanks for writing this up so clearly.
    It mangles my brain.


    • Hey Julian!

      Thanks for sharing your thoughts!

      To be honest, I’ve also thought about an opponent gambling on your superrationality to “exploit” you in this way.

      It’s not a direct counter, but would you find it plausible that:

      Once you predict that your opponent is choosing to cooperate because they expect you to do the same, your brain has just generated a good reason to Defect, conditioning on the opponent’s reasoning skills.

      It would be reasonable to expect them to have done the same (also consider Defection).

      Now consider that even thinking of Defecting presents a memetic hazard: once you realize you can exploit the other person by defecting, you end up in this second-guessing game in the first place.

      In contrast, if two people had just seen this and Defecting had not crossed their mind, they’re obviously doing better than two rational agents.

      Given this, does it seem rational to act like you had never considered Defect (even after having actually considered it?)

      Also, reciprocative Cooperation is the key here; there’s no point Cooperating w/ Defect-bot.

      This isn’t 100% sorted in my mind yet, but let me know if this makes things a little clearer.

      • Thinking of Defection presents a memetic hazard…yeah, exactly!
        I do think it’s good to act like you never considered defect.
        I think it’s kind of similar to the way in strategy games it can be useful to _not_ be seen as someone solely interested in maximizing their utility.


  2. Hi Owen,

    This is pretty much what Yudkowsky discussed in HPMOR ch 33 (around a third into the page). The difference was that the dilemma posited in the chapter is against an identical copy of you, but I suppose the arguments would still work out if you replace “identical copy” with “someone who thinks identically”. In retrospect, HPMOR taught me a lot about game theory, and made me end up accepting TDT more than CDT.


    • Hey CJ!

      Thanks for linking the HPMOR chapter! I recalled reading it while typing this up, but I couldn’t recall which chapter it was exactly.

      Your point about identical players is also neat, I think, because it mimics the Clique-bots found in PD scenarios w/ open source code (MIRI has some papers about this).

      But you’re right; symmetry on some level is required for this to work out (I think).

