The Prisoner’s Dilemma and Compassion…

Imagine the following scenario: two prisoners are held in solitary confinement and have no way of communicating with one another. Each is told, individually, that there are three options: (1) if you betray the other prisoner, you will go free and he will be imprisoned for three years; unless (2) he also betrays you, in which case you will both be imprisoned for two years; or (3) if both of you stay silent (i.e., cooperate), both of you will get a reduced, one-year sentence.


This thought experiment is called the ‘Prisoner’s Dilemma’ and it’s a classic problem in game theory, or the study of strategic decision-making. In this scenario, assuming that the prisoner’s actions have no repercussions on his reputation (e.g., the other prisoner isn’t able to take revenge), the rational response is to betray the other prisoner. If he doesn’t betray you, you go free rather than serving one year, and if he does, you serve two years rather than three. Whatever he does, betraying leaves you better off than staying silent. The catch is that when both prisoners follow this logic, each serves two years, which is worse for both than the one year they would have served by staying silent. This logic is applied widely in economics for predicting how individuals and organisations will act in given scenarios.
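For readers who like to see the arithmetic, here is a minimal Python sketch of the one-off dilemma, using the sentences from the scenario above. The names `SENTENCE`, `best_response`, and `is_dominant` are my own illustrative choices, and the sketch assumes each prisoner cares only about minimising his own sentence.

```python
# Sentences in years, indexed by (my_move, other_move); lower is better.
# The values come from the scenario described in the text.
SENTENCE = {
    ("betray", "betray"): 2,
    ("betray", "silent"): 0,
    ("silent", "betray"): 3,
    ("silent", "silent"): 1,
}

MOVES = ("betray", "silent")


def best_response(other_move):
    """Return the move that minimises my sentence, given the other's move."""
    return min(MOVES, key=lambda mine: SENTENCE[(mine, other_move)])


def is_dominant(move):
    """True if `move` is the best response to every move the other can make."""
    return all(best_response(other) == move for other in MOVES)


if __name__ == "__main__":
    for other in MOVES:
        print(f"If the other prisoner plays {other!r}, "
              f"my best response is {best_response(other)!r}")
    print("Betrayal is dominant:", is_dominant("betray"))
```

Betrayal comes out as the best response in both cases, even though mutual betrayal (two years each) is worse for both prisoners than mutual silence (one year each).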

Now imagine an alternate version. In this situation, the scenario is ‘iterated’ or, in other words, repeated over and over, and both individuals can be held accountable for their earlier actions (e.g., if one betrays the other, that betrayal can be punished in the future). If the number of iterations is known by both ‘prisoners’ in advance, the rational prediction is still betrayal in every round: neither has a reason to cooperate in the final round, so neither has a reason to cooperate in the round before it, and so on back to the first. (Mutual betrayal here is a Nash equilibrium, named after the late John Nash. I’ll loosely call it a ‘zero-sum outcome’ in what follows, although strictly speaking the Prisoner’s Dilemma is not a zero-sum game.) Again, many economic theories are underpinned by this assumption. Nevertheless, if the number of iterations is random, and neither party knows how many times the dilemma will be repeated, an interesting change emerges: both parties can start cooperating even though they have no way of communicating. Not only do they cooperate overall, they will forgive previous transgressions, and will continue to cooperate over time. It is worth noting that the random-N version is relatively illustrative of real-life interactions between individuals.
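The iterated version can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the strategy names (`always_betray`, `tit_for_tat`), the `play` and `random_rounds` helpers, and the 0.9 chance of another round are all hypothetical choices, not anything taken from the research literature. It shows how the sentences add up over repeated rounds; it is not a proof of either result described above.

```python
import random

# Sentences in years, indexed by (my_move, other_move); lower is better.
SENTENCE = {
    ("betray", "betray"): 2,
    ("betray", "silent"): 0,
    ("silent", "betray"): 3,
    ("silent", "silent"): 1,
}


def always_betray(own_history, other_history):
    """Betray in every round, whatever has happened before."""
    return "betray"


def tit_for_tat(own_history, other_history):
    """Stay silent first, then copy the other player's previous move."""
    return other_history[-1] if other_history else "silent"


def play(strategy_a, strategy_b, rounds):
    """Play `rounds` iterations and return the total years served by (a, b)."""
    hist_a, hist_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        years_a += SENTENCE[(move_a, move_b)]
        years_b += SENTENCE[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return years_a, years_b


def random_rounds(rng, continue_prob=0.9):
    """Draw a game length: after each round, play another with `continue_prob`."""
    rounds = 1
    while rng.random() < continue_prob:
        rounds += 1
    return rounds


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so that a run is repeatable
    rounds = random_rounds(rng)
    print("rounds played:", rounds)
    print("tit-for-tat vs tit-for-tat:", play(tit_for_tat, tit_for_tat, rounds))
    print("always-betray vs tit-for-tat:", play(always_betray, tit_for_tat, rounds))
```

With a fixed 10 rounds, `play(tit_for_tat, tit_for_tat, 10)` gives `(10, 10)`, while `play(always_betray, tit_for_tat, 10)` gives `(18, 21)`. Against a partner who retaliates, the betrayer gains in the first round and then pays for it in every round afterwards, so he ends up serving 18 years where cooperation would have cost him 10.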

Understandably, cooperative behaviour in the random-N iterated version is particularly interesting to researchers, largely because it is not what the one-off logic predicts. In fact, when modelled extensively, with a whole load of different interaction strategies, it turns out that altruistic behaviours are more successful in a Prisoner’s Dilemma scenario. In particular, four behaviours predict the best outcome: being nice (i.e., I won’t betray the other until he betrays me), limited retaliation (i.e., if he betrays me, I will betray him back to show him that I don’t approve of his behaviour), forgiveness (i.e., I’ll forgive his betrayal so long as he doesn’t keep betraying me), and non-enviousness (i.e., I’m not doing this to do better than him). Fascinatingly, when applied to interaction between most animals (including humans but excluding economists) in the real world, these rules tend to apply. Cooperation appears to be built into most animals, even if they’re incapable of directly observing the consequences of their actions.
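To give a flavour of how such modelling works, here is a toy round-robin tournament in Python. It is a sketch under my own assumptions, not a reconstruction of any published tournament: the four strategies, the 200-round game length, and the choice to have each strategy also play a copy of itself are all illustrative, and the names `STRATEGIES`, `play`, and `tournament` are hypothetical.

```python
from itertools import combinations_with_replacement

# Sentences in years, indexed by (my_move, other_move); lower is better.
SENTENCE = {
    ("betray", "betray"): 2,
    ("betray", "silent"): 0,
    ("silent", "betray"): 3,
    ("silent", "silent"): 1,
}


def always_betray(own_history, other_history):
    return "betray"


def always_silent(own_history, other_history):
    return "silent"


def tit_for_tat(own_history, other_history):
    """Nice, retaliating, and forgiving: copy the other's previous move."""
    return other_history[-1] if other_history else "silent"


def grudger(own_history, other_history):
    """Nice and retaliating but unforgiving: betray forever once betrayed."""
    return "betray" if "betray" in other_history else "silent"


STRATEGIES = {
    "always_betray": always_betray,
    "always_silent": always_silent,
    "tit_for_tat": tit_for_tat,
    "grudger": grudger,
}


def play(strategy_a, strategy_b, rounds):
    """Play `rounds` iterations and return the total years served by (a, b)."""
    hist_a, hist_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        years_a += SENTENCE[(move_a, move_b)]
        years_b += SENTENCE[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return years_a, years_b


def tournament(strategies, rounds=200):
    """Every strategy plays every other, plus a copy of itself; sum the years."""
    totals = {name: 0 for name in strategies}
    for name_a, name_b in combinations_with_replacement(strategies, 2):
        years_a, years_b = play(strategies[name_a], strategies[name_b], rounds)
        totals[name_a] += years_a
        if name_a != name_b:  # count a self-play match only once
            totals[name_b] += years_b
    return totals


if __name__ == "__main__":
    results = tournament(STRATEGIES)
    for name, years in sorted(results.items(), key=lambda item: item[1]):
        print(f"{name:15s} {years} years")
```

In this small pool, the two ‘nice but retaliating’ strategies (`tit_for_tat` and `grudger`) tie for the lowest total at 1001 years, ahead of `always_betray` (1196) and `always_silent` (1200). Two caveats: the ranking depends heavily on which strategies are in the pool, and this particular pool is too simple to separate forgiveness from grudge-holding, since nobody here betrays just once.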

Consequently, evolutionary theorists have used the iterated Prisoner’s Dilemma to explain the evolution of the altruistic behaviours observed in pretty much all species. Thus, initially selfish behaviours (in this case, of ‘reducing my sentence’) can rapidly evolve into cooperative strategies, with certain characteristics coming out on top, including being nice and forgiving one another for past transgressions. In humans, it’s possible that the end product of this system is compassion.

If you’ve read many of my blogs, you might have come across my various articles on compassion (see here, here, and here). As a psychologist, I’m fascinated by human compassion because it represents our highly evolved ability, not only to cooperate, but to do so in a way that allows us to model another person’s experiences strongly, and to adapt our behaviour based on our interpretations of his or her experience. In other words, compassion is the ability to put ourselves ‘in another person’s shoes’ and to modify our behaviour, in a non-selfish way, as a consequence. Even more fascinating is the idea that this complex behaviour might have evolved from simple Prisoner’s Dilemma iterations. That is, initially selfish (survival-based) actions can be shaped, over time and through complex interactions, to an end point as complex as compassion, in which the human brain devotes large amounts of processing power to being able to model another’s experience. Moreover, thanks to specific brain cells called ‘mirror neurons’, we’re actually able to feel a version of another person’s experiences, including emotions and pain. The distress or sadness you feel when witnessing another person’s discomfort is the result of the firing of your mirror neurons in the parts of your brain that process those experiences – you are literally feeling another person’s suffering (albeit to a lesser degree). Of course, the ability to feel another person’s pain is a big plus when it comes to acting compassionately; it’s hard to harm someone when you’re likely to feel a portion of that harm.

It’s worth pointing out the downside here. Our ‘compassion centres’ are located in our left prefrontal lobes. When we’re angry, upset, or otherwise physiologically aroused (especially in any ‘danger’ situation), our limbic systems (read here) take over and direct resources (in the form of blood sugar) away from our prefrontal cortex, reducing its functionality. So when we’re angry, our ability to be compassionate is substantially degraded.

Game theory makes it clear that cooperation makes sense in any system without a determined end point. That is, if our ongoing survival is determined by our current behaviours, in that those behaviours can affect us at a later date, it’s rational to select behaviours (like kindness) that minimise the chance of future risk (such as retribution). In humans, this has quite obviously worked overall – we’ve progressed societally, culturally, and technologically by, for the most part, selecting a cooperative strategy. It’s quite plausible, therefore, that to attain a high level of societal complexity, we needed to evolve more complex abilities to model the behaviour of other humans, as well as to be able to directly experience (in a limited fashion) another’s state of mind (even though that can be eroded through the perception of danger).

So, to put it another way, human compassion could well be the end point of a complex survival system that selects for deep cooperation between humans who exist in an unknown number of Prisoner’s Dilemma iterations. I’m certainly not the first person who’s made this assertion (read “Apex”, Ramez Naam’s great fictional interpretation of compassion and the iterated Prisoner’s Dilemma).

Now for the shitty bit. In normal human interactions, the N of iterations is unknown and unpredictable, meaning that it’s in our interest to cooperate. We have, however, created some extremely artificial systems in which the N is clearly known, and a zero-sum outcome (i.e., non-cooperation) is a lot more likely. Three-year electoral terms, tax years, and quarterly profit sheets are all potentially zero-sum iterations, because the consequences for the ‘player’ are minimised when there is little chance that punishment will occur. An incumbent government can effectively act non-cooperatively if it believes that its actions won’t result in negative consequences – and this occurs regularly if its ‘betrayal’ actions aren’t punished directly, rapidly, and meaningfully (whilst losing the next election is certainly a form of punishment, it becomes moot if the government either doesn’t expect to win, or is extremely confident of winning). Similarly, when the banking industry isn’t penalised for reckless risk (again ‘betrayal’ behaviour in the Prisoner’s Dilemma), the strategy becomes zero-sum – it’s more ‘rational’ to keep betraying. Likewise, when companies are beholden only to their shareholders, and are expected to maximise profit in a given time frame (known N), cooperative behaviour is seen as counterproductive, no matter what the cost to agents outside of the organisation and its shareholders.

In other words, despite having evolved an astounding ability to cooperate with other humans, backed up by dedicated brain centres that allow us to experience directly the negative consequences of non-cooperative behaviour, we’ve also managed to ‘game the system’, by creating artificial zero-sum interaction conditions. Disturbingly, this in itself is delusional. No real-world system is devoid of consequence, because no real-world system exists in isolation. A company’s immoral actions might result in increased share prices over a given timeframe, but these actions are simply not sustainable as the time span increases. Likewise, when governments (for example) deny climate change and act to ignore spiralling carbon levels, they might succeed in creating greater mining revenue temporarily, but are deluding themselves about the value of their actions. The delusion comes from the notion that they are not responsible for their actions beyond a given timeframe. If our current government takes actions that doom a future generation, they do so under the mistaken belief that they are not culpable for those actions. And, sadly, they’re not – or at least they’re not held accountable, which amounts to the same thing.

We don’t need a drastic regime change or a revolution to fix this type of ridiculous (but understandable) thinking. We need compassion. A while back, I wrote a blog titled “Why we need compassionate politics” – more than ever, we need to allow for compassionate action in large systems by removing the deluded notion that any system can exist in a known N series of iterations. That is, we need to recognise that our actions, individually, societally, and systemically, result in consequences outside of the perceived boundaries of that system. Importantly, this requires the removal of ways of thinking that are reductionist, including both religion (possibly the most reductionist form of thinking that exists – read here), and economic theories that are based on limitless resources or expansion. And a great start would be to hold those who make decisions that have the power to do harm accountable for those decisions.

This is where, as individuals, we have some (albeit limited) power to act and to evoke change. Rather than waiting for an election to ‘punish’ a non-cooperative government, take direct action by writing to or talking with your MP, especially regarding governmental ethics and the need for consequent legislation (e.g., restrictions on pollution, environmental degradation, human and animal rights violations, etc.).

2 Replies to “The Prisoner’s Dilemma and Compassion…”

  1. Sadly one-upping and outsmarting your counterparts in business and politics is openly lauded and appears to be something to which we should aspire. Understanding how cooperative behaviors yield more optimal results in the longer (undefined) term and are more reflective of a more evolved human brain needs to be the focus of a new marketing campaign.
