Definition

The prisoner’s dilemma is a game in which two players each independently choose to cooperate or defect. The payoff structure makes defection the individually rational choice regardless of what the other player does — yet when both players defect, both end up worse off than if both had cooperated.

It is the most studied game in game theory and appears in hundreds of real-world contexts, from arms races to environmental agreements, from business competition to biological evolution.

The Payoff Matrix

Payoffs are listed as (you, opponent):

                    Opponent Cooperates    Opponent Defects
  You Cooperate          3, 3                  0, 5
  You Defect             5, 0                  1, 1

Why defection dominates: If your opponent cooperates, you get 5 (defect) vs. 3 (cooperate) — defect is better. If your opponent defects, you get 1 (defect) vs. 0 (cooperate) — defect is still better. Defection is the dominant strategy: better no matter what the opponent does.

Yet mutual defection yields (1,1), while mutual cooperation yields (3,3). Both players acting rationally produces an outcome worse for both than if they had both acted “irrationally.”
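The dominance argument above can be checked mechanically. A minimal Python sketch (the payoff dictionary and helper names are illustrative, not from any library):

```python
# One-shot prisoner's dilemma payoffs, keyed by (your move, opponent's move).
# Each value is (your payoff, opponent's payoff); "C" = cooperate, "D" = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def my_payoff(me: str, opponent: str) -> int:
    return PAYOFFS[(me, opponent)][0]

# Defection strictly dominates: whatever the opponent does,
# defecting pays strictly more than cooperating.
for opp in ("C", "D"):
    assert my_payoff("D", opp) > my_payoff("C", opp)

# Yet both players prefer mutual cooperation (3, 3) to mutual defection (1, 1).
cc, dd = PAYOFFS[("C", "C")], PAYOFFS[("D", "D")]
assert cc[0] > dd[0] and cc[1] > dd[1]
```

The two assertions encode the paradox directly: each row comparison favors defection, while the diagonal comparison favors cooperation.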

Why It Matters

The prisoner’s dilemma is not an artificial puzzle — it is the underlying structure of an enormous range of real-world conflicts:

  • Arms races: Both the US and USSR would have been better off not developing nuclear arsenals, but each was individually better off building them regardless of what the other did
  • Climate change: Each country is better off not reducing emissions (avoiding the cost) regardless of what others do — yet collective inaction produces catastrophic outcomes
  • Price competition: Two competing firms would both profit from maintaining high prices, but each gains by undercutting the other
  • Common pool resources: Each individual is better off taking more fish/water/bandwidth, but collective overuse depletes the resource
  • Biological cooperation: Impalas grooming each other, fish cleaning sharks — each individual incurs a cost that benefits the other

One-Shot vs. Repeated

In a one-shot game (played once, never again), defection is the only rational strategy. There is no future relationship to protect, so immediate self-interest dominates.

In a repeated game (same players, multiple rounds), cooperation can become rational because:

  • Your current defection influences your opponent’s future behavior
  • The prospect of future mutual cooperation has value worth protecting
  • The “shadow of the future” (the weight players place on future payoffs) disciplines present defection

The repeated game is where tit-for-tat and other cooperative strategies emerge. See source—prisoners-dilemma-axelrods-tournament.
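The shift from one-shot to repeated play can be seen in a small simulation. A sketch in Python (the strategy functions and payoff values are illustrative, chosen to match the matrix above):

```python
# Standard prisoner's dilemma payoffs as (player A, player B).
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated prisoner's dilemma; return cumulative (score_a, score_b).

    Each strategy is a function of the opponent's move history."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_b)   # each player sees only the opponent's past moves
        b = strategy_b(hist_a)
        pa, pb = PAYOFFS[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

def tit_for_tat(opp_history):
    # Cooperate on the first move, then mirror the opponent's last move.
    return opp_history[-1] if opp_history else "C"

def always_defect(opp_history):
    return "D"

# Two tit-for-tat players cooperate every round: 10 rounds * 3 = 30 each.
print(play(tit_for_tat, tit_for_tat))    # (30, 30)
# Against always-defect, TFT is exploited once, then retaliates:
# round 1 gives (0, 5); rounds 2-10 give (1, 1) each, so (9, 14).
print(play(tit_for_tat, always_defect))  # (9, 14)
```

Note that tit-for-tat loses slightly to the exploiter in a head-to-head match, but in a population of mixed strategies the mutual-cooperation payoffs it earns against other conditional cooperators dominate, which is the central finding of Axelrod’s tournament.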

The Nash Equilibrium Problem

The mutual defection outcome is the Nash equilibrium of the one-shot game: given what the other player is doing, neither player can improve by changing their own strategy. This is the formal definition of a “stable” outcome — yet it is collectively suboptimal.

The prisoner’s dilemma is the canonical example of why Nash equilibria are not necessarily efficient or desirable. Individual rationality and collective rationality diverge.
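The equilibrium claim can be verified by brute force over the four cells of the matrix. A hedged Python sketch (the `is_nash` helper is mine, written for this 2x2 game only):

```python
from itertools import product

# Payoffs as (player A, player B), matching the matrix above.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
MOVES = ("C", "D")

def is_nash(a, b):
    """True if neither player gains by unilaterally switching moves."""
    pa, pb = PAYOFFS[(a, b)]
    a_stays = all(PAYOFFS[(alt, b)][0] <= pa for alt in MOVES)
    b_stays = all(PAYOFFS[(a, alt)][1] <= pb for alt in MOVES)
    return a_stays and b_stays

equilibria = [cell for cell in product(MOVES, MOVES) if is_nash(*cell)]
print(equilibria)  # [('D', 'D')] -- the only pure-strategy Nash equilibrium

# Yet (C, C) Pareto-dominates it: both players do strictly better there.
assert all(c > d for c, d in zip(PAYOFFS[("C", "C")], PAYOFFS[("D", "D")]))
```

The search confirms the divergence in one line: the unique stable cell, (D, D), is exactly the cell both players would jointly abandon if they could commit to cooperation.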

Connections

tit-for-tat

The best-known solution to the repeated prisoner’s dilemma. TFT (and its variants) achieves mutual cooperation through the implicit threat of retaliation, making defection unprofitable over the long run.

inclusive-institutions

Institutions — rule of law, property rights, enforceable contracts — solve the prisoner’s dilemma at scale by changing the payoff structure. They make defection costly (legal punishment) and cooperation stable. See source—why-nations-fail.

invisible-hand

adam-smith’s invisible hand applies in markets where transactions are voluntary and competitive — the non-zero-sum situation where both parties gain from trade. The prisoner’s dilemma is the dark side: the cases where individual rationality produces collective harm that markets cannot self-correct.

lollapalooza-effect

When multiple parties are simultaneously locked in prisoner’s dilemmas (arms race + trade war + territorial competition), the cascade of mutual defection produces lollapalooza-scale damage.

zero-sum-thinking

Zero-sum beliefs are the cognitive error that makes the prisoner’s dilemma feel unsolvable. Players who incorrectly believe the situation is purely zero-sum (I can only win if you lose) cannot access the cooperative equilibrium — they are cognitively locked into the defection outcome. Research by Bohnet & Chilazi (2025) shows zero-sum beliefs increase even when the actual situation is positive-sum; the belief itself is the obstacle. See source—zero-sum-fairness.

base-rate-neglect

In real-world prisoner’s dilemma situations, people often over-weight their current read of the opponent’s intentions and under-weight the base rate of the relationship — leading to echo effects of mutual retaliation that neither party initially wanted.

naval-ravikant explicitly connects long-term business success to iterated prisoner’s dilemma logic:

“If you look at prisoner’s dilemma type games, a solution to prisoner’s dilemma is tit-for-tat… that only works in an iterated prisoner’s dilemma. If you’re in Silicon Valley, people are doing business with each other, they know each other, they trust each other. Then they do right by each other because they know this person will be around for the next game.”

Naval’s career advice follows from this: play long-term games with long-term people. All returns in life come from compound interest — and compound interest requires iteration. When someone exploits you in a negotiation, the solution is to convert the single-move game to a multi-move game by introducing reputation, referrals, and repeat business. See source—how-to-get-rich.

See Also