
In which I attempt to describe a humanist, non-relative system of morality which I believe many humans already subscribe to without quite knowing it, and how one might implement that morality in an Artificial General Intelligence.

[Image: a colorful collection of characters from The Sims™]

This is relevant. I swear.


Introduction

The question of what “morality” is and where “morality” comes from has been a persistent one since ancient times. Many religions have attempted to lay down moral rules, but in my considered opinion those attempts have often done more harm than good.

However, there is one principle which would-be moralists, religious or not, have returned to repeatedly: the “Golden Rule”. It can be phrased in many different ways, but as I am culturally-if-not-theologically Christian, I’m most familiar with it in this form:

“Do unto others as you would have them do unto you.”

This is closely related to the following verses from the Christian Bible:

But I say unto you, Love your enemies, bless them that curse you, do good to them that hate you, and pray for them which despitefully use you, and persecute you…
Matthew 5:44, KJV

Therefore all things whatsoever ye would that men should do to you: do ye even so to them: for this is the law and the prophets.
Matthew 7:12, KJV

But what makes the Golden Rule such a useful rule-of-thumb, that so many cultures have independently invented it?


Motivation of Individuals

Consider what motivates an individual. My experience, both with my own inner processes and with my external dealings with others, tells me a few things:

  • People have drives
  • Drives and emotions are tightly coupled
  • Drives broadly fall into two categories: needs and wants

Needs are things that cause suffering when we lack them; wants are things that cause pleasure when we obtain them.

Needs are infinitely more important than wants: no amount of satisfying a want can make up for the suffering caused by an unsatisfied need. [1]

If I existed in isolation from all other sentient beings, my highest priority would be to address all of my needs. Needs are usually things that must be provided with some regularity, and satisfying one need does nothing to satisfy any other: I need air to breathe, and I need water to drink, and I need shelter from the elements when they are extreme, and I need food to eat, and so on. It doesn’t do me any good to have plenty of air to breathe when I’m freezing to death, nor does it do any good to have shelter from the elements if there is not enough ventilation to provide me with oxygen and disperse my carbon dioxide. All my needs must be met, simultaneously, to avoid suffering. I would immediately sacrifice any of my wants in order to meet an unmet need.

Suffering is not a binary: a need that is mostly but not completely met might cause a slight amount of suffering, and then that suffering might grow as the need becomes more and more urgent.

Not all needs are physical; some are mental and/or emotional. Accordingly, not all needs directly cause me to die if they go unsatisfied. Even so, the lack could eventually drive me to self-euthanasia [2] because the suffering becomes unbearably intense.

Wants, on the other hand, provide me with something purposeful to do when my needs have been met. I might write a poem, or cook an especially tasty meal, or install a bird feeder; things that provide me with entertainment, joy, or some other sense of meaning, purpose, or pleasure. If I decline to do such things, I may feel bored, but boredom by itself — especially boredom without frustration — has never once made me feel like life is not worth living; rather, it has made me feel that I am not living life to its fullest. Boredom makes me want to live life more, not less.

Nota Bene: for the rest of this essay, I will sometimes use the word “harm” as a synonym for “increase in suffering”, but when I use the word “cost” I will mean both “increase in suffering” and “reduction in pleasure while keeping suffering constant”. In Utilitarianism these concepts of “cost” and “harm” would be equivalent, but in my model they are not.

Likewise, “help” will be used to mean “reduction in suffering”, while “benefit” will be used to mean both “reduction in suffering” and “increase in pleasure while keeping suffering constant”.
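To make this distinction concrete for an implementer, here is a minimal sketch in Python of how an agent might rank world-states under this model. Everything here (the WorldState type, the field names) is invented for illustration; the point is only that suffering is compared before pleasure, lexicographically, so no amount of pleasure can offset an increase in suffering.

    # Minimal sketch of the needs-over-wants ordering described above.
    # All names here are illustrative, not part of any real system.
    from dataclasses import dataclass

    @dataclass
    class WorldState:
        suffering: float  # e.g. the most urgent unmet need (0 = all needs met)
        pleasure: float   # total pleasure from satisfied wants

    def better(a: WorldState, b: WorldState) -> bool:
        """True if state `a` is preferable to state `b`.

        Suffering dominates lexicographically; pleasure only breaks ties.
        """
        if abs(a.suffering - b.suffering) > 1e-9:
            return a.suffering < b.suffering
        return a.pleasure > b.pleasure

    # A large gain in pleasure never outweighs even a small rise in suffering.
    assert better(WorldState(suffering=0.1, pleasure=0.0),
                  WorldState(suffering=0.2, pleasure=100.0))

This is exactly where the model departs from Utilitarianism (see Footnote 1): there is no conversion rate between the two fields.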


The Silver Rule

Morality emerges as soon as I consider my own actions in the context of others. My actions can add to or alleviate the suffering of others, and my actions can contribute toward or detract from the pleasure of others; likewise, their actions can have the same effects on me.

If I were wholly amoral, I would continue to live my life as if I were the only person who existed; if others suffered or felt displeasure due to my actions, I would not care either way. However, there is a branch of philosophical economics devoted to how individuals interact with one another, called Game Theory. It explores the ways in which individuals can change their strategies to gain personal advantage — perhaps through conscious choice, perhaps through natural selection, the mechanism doesn’t matter — and how these individual behavioral changes alter the global outcome. When the strategies reach a steady state because no individual has anything left to gain by changing their strategy alone, that state is called a Nash equilibrium.

If I take an action that causes another being to suffer, then I’ve provided incentive for the other to change their behavior to reduce the suffering. (Again, this could be due to the individual making a conscious choice to change, or due to natural selection of individuals dying because they don’t change.) We might imagine that three response strategies are available to the affected individuals: ignore me, take actions that reduce my suffering, and take actions that increase my suffering.

If I know that the first two responses are likely, then I have no incentive to avoid causing harm; since we’re assuming that I’m an amoral being, I may then continue to harm them, which gives them an incentive to change strategies. This reduces the chances that I will see those responses again.

But if I know that the third response is likely, then I know that I will be harmed in turn. Therefore, I will have an incentive to change my own strategy: I should avoid causing harm to others, to the extent that I can anticipate when my actions will cause it. The others no longer have an incentive to choose a response which harms me, so this new arrangement is stable – a Nash equilibrium.

What’s more, if a third being sees me do harm to the second, then the third knows that I am likely to harm them in turn if permitted. If those who see my harm all collectively agree to shun or harm me afterward, then I will have an even greater incentive to not inflict any harm on others in the first place. The end result is a population where each individual avoids causing harm to any other, which benefits everyone and forms a stable Nash equilibrium.

This explains the most primitive layer of human morality: the “Silver Rule”, which is the negative form of the “Golden Rule”.

Do not harm others, lest you be harmed in return.

This result is closely related to the Game Theory strategy known as “tit for tat”, and a population consisting entirely of “tit for tat” strategists is a well-known Nash equilibrium for a scenario called the Iterated Prisoner’s Dilemma.
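For the curious, here is a small self-contained sketch of that scenario in Python. The payoff values are the conventional Prisoner’s Dilemma numbers and the strategy names are just labels for this illustration; it shows that two “tit for tat” players settle into stable mutual cooperation, while an unconditional harm-doer gains almost nothing against a “tit for tat” player.

    # Sketch: Iterated Prisoner's Dilemma with "tit for tat".
    # Payoffs use the conventional ordering (T=5 > R=3 > P=1 > S=0).
    PAYOFF = {  # (my_move, their_move) -> my payoff
        ("C", "C"): 3,  # mutual cooperation
        ("C", "D"): 0,  # I cooperate, they defect (I am harmed)
        ("D", "C"): 5,  # I defect against a cooperator
        ("D", "D"): 1,  # mutual defection
    }

    def tit_for_tat(opponent_history):
        """Cooperate first, then mirror the opponent's previous move."""
        return "C" if not opponent_history else opponent_history[-1]

    def always_defect(opponent_history):
        return "D"

    def play(strategy_a, strategy_b, rounds=100):
        seen_by_a, seen_by_b = [], []  # each side's record of the other's moves
        score_a = score_b = 0
        for _ in range(rounds):
            move_a = strategy_a(seen_by_a)
            move_b = strategy_b(seen_by_b)
            score_a += PAYOFF[(move_a, move_b)]
            score_b += PAYOFF[(move_b, move_a)]
            seen_by_a.append(move_b)
            seen_by_b.append(move_a)
        return score_a, score_b

    print(play(tit_for_tat, tit_for_tat))    # (300, 300): stable mutual cooperation
    print(play(tit_for_tat, always_defect))  # (99, 104): defection barely pays, and only once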

The evolutionary consequence of this is the emotion that we call “outrage”. We feel outrage when we see other people breaking society’s rules, particularly when they do so to benefit themselves at cost to others, and this outrage directs us to shun or harm them in punishment for their transgression. We are even willing to punish others at cost to ourselves, for reasons explored in papers like “When Costly Punishment Becomes Evolutionarily Beneficial” (Ezeigbo, arXiv:2009.00143 [q-bio.PE]).


The Golden Rule

But the Silver Rule really only covers the most primitive, most animal form of human morality. We’re a social species, which means that we thrive on cooperation and mutual aid. Our morality is not just a matter of avoiding harm to others; it is also about actively helping others when it is not too costly to ourselves to do so.

But this too can be explained by Game Theory.

For one, suppose that I have an unmet need, and there is another who sees my need and can help me without appreciably harming themselves. It is greatly in my interest, then, to live in a society where that individual is inclined to help me, to protect me from suffering. But this applies not just to me, but to everyone living in the society: if we each watch out for each other, we are all made safer, and therefore more likely to survive.

You help me now when I need it, and maybe I or somebody else helps you later when you need it.

This even extends beyond fulfilling needs to avoid suffering; we also can cooperate to fulfill one another’s wants and provide each other with pleasure. This forms the next layer of human morality: the Golden Rule itself.

I would like to benefit from others sometimes, so I should provide benefit to others sometimes, and all our lives will be thereby enriched.

This is the ideal outcome, at least, but we have not yet shown that it is a Nash equilibrium, or that there is any sequence of choices that can lead to it.


Cheaters

Not everyone is fortunate enough to live in a situation where everyone around them is cooperative. If I help others at cost to myself, but no one ever does the same for me in turn, then I expend myself by giving more than I receive. A society that contains only helpers is stable, but as soon as you add the possibility of “cheaters” or “slackers” — “defectors” in Game Theory terminology — the Nash equilibrium shifts and cheaters become very common.

The situation can be repaired, however, if I change my behavior depending on my surroundings.

If I help fellow helpers, but I shun or punish those who don’t help, then I will not expend myself but I will still leave myself open to the benefits of cooperation. This is yet another variation on the “tit for tat” strategy.
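A rough sketch of that repair, under invented numbers: an unconditional helper pays the cost of helping cheaters who never reciprocate, while a conditional helper (help helpers, shun cheaters) keeps the benefits of cooperation without the drain.

    # Sketch: unconditional vs. conditional helping in a mixed population.
    # COST of helping, BENEFIT of being helped, and the population mix are
    # arbitrary values chosen only for illustration.
    COST, BENEFIT = 1.0, 3.0

    def payoff(i_help_them: bool, they_help_me: bool) -> float:
        return (BENEFIT if they_help_me else 0.0) - (COST if i_help_them else 0.0)

    def lifetime_score(my_rule, population):
        """my_rule(partner_is_helper) -> do I help this partner?"""
        return sum(payoff(my_rule(is_helper), is_helper) for is_helper in population)

    population = [True] * 60 + [False] * 40          # 60% helpers, 40% cheaters

    unconditional = lambda partner_is_helper: True             # help everyone
    conditional = lambda partner_is_helper: partner_is_helper  # help helpers only

    print(lifetime_score(unconditional, population))  # 80.0: drained by cheaters
    print(lifetime_score(conditional, population))    # 120.0: cooperation without the drain

Of course, this assumes I can reliably tell who the cheaters are, which is exactly what the next section complicates.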


Communication

So far, I’ve assumed that everyone in a society either sees an action with their own eyes, or doesn’t know about the action at all. But humans have communication, and with communication come the possibilities of lying and misunderstanding.

Now it becomes possible for me to believe that someone is a helper, only to find out that they will not help me when I need it. Perhaps they actually behave as a cheater toward most people, but have helped a few in order to build a false reputation. Perhaps they mistakenly believe that I am actually a cheater, and are punishing me for my imagined offenses.

The details don’t matter; once lies and misinformation are possible, the number of possibilities explodes beyond reckoning. Cliques form and society is no longer a monolith of “tit for tat” strategists.

We can minimize this through two principles: honesty and forgiveness. If I vow in public to never knowingly tell lies and to investigate the things that are told to me, then I can build a reputation as a truth-teller. If someone refuses to help me, I can investigate the situation, and if I am reasonably convinced that they were reasonably convinced that I was a cheater, then I may yet help them (or someone in their clique) despite their refusal to help me, in the hopes of convincing them that their information is wrong. This allows a new equilibrium to form, where cooperation on large scales becomes possible again.


Children and Love

Now, we consider the fact that not all individuals are equally capable. For example, children are born mostly unable to help others yet requiring a great deal of help themselves, and they do not yet have reliable knowledge of who is trustworthy or who is a cheater.

And yet, children have the potential to become helpers in the future. We are all children at some point, so simply abandoning all children as non-helpers means that society would die off. But if we choose to nurture and protect them, to teach them to be honest and to show forgiveness and to be helpers instead of cheaters, then we can change the balance of society itself and improve the world we live in.

And even for those who are no longer children, we’re still capable of learning as adults. It’s not always a good idea to shun or to punish, if teaching would work better. Even if they’ve cheated or lied. Even if they’ve hurt us. No one starts from a truly blank slate, but very few are categorically incapable of becoming honest and forgiving helpers, if given the right surroundings to learn to trust.

This gives rise to the highest and most complicated layer of human morality, and its associated emotion: “love”. Love is the urge to protect, the desire to shield another from suffering.

Love is the human emotion most responsible for literally making the world a better place.


Moral Agency

Now, I turn to a much more complicated question: what is the scope of morality? What defines a moral agent?

So far, I’ve deliberately left it vague whether I was talking about societies containing only humans, or a mix of humans and non-human moral agents. But these moral rules didn’t evolve in humans alone; all of these rules, including love, can be found elsewhere in the animal kingdom. A mother cat expresses love for her kittens, for example, as do many mammals, birds, and reptiles for their young and/or their sexual partners. Many animals are capable of forming long-lasting friendships built on what is plausibly a form of love. Love can even extend between species, such as the example of the crow Moses and the cat Cassie, or captive bears raised alongside dogs in animal sanctuaries, or the documented cases of wild lionesses adopting the orphaned young of prey species (e.g. antelopes) after killing the mother. To say nothing of the relationships between humans and their domesticated animals, or even their non-domesticated pets such as turtles and parrots.

So, how do we identify when to apply morality, versus when to treat a being as an automaton incapable of morality?

I offer “The Sims Test”.

The Sims™, a long-running video game series by studio Maxis, is a “virtual dollhouse” populated by human-resembling semi-autonomous characters called “sims”. Each sim has a relatively large collection of needs, represented by a collection of gauges that slowly empty themselves over time, and a relatively small collection of wants, representing bonus goals or life-long aspirations that can improve the sim’s mood or provide other boosts.
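As a toy illustration of that gauge model (the names, values, and decay rates here are made up; this is not the game’s actual implementation), each need drains over time, and “suffering” tracks the most depleted gauge, echoing the earlier point that all needs must be met simultaneously:

    # Toy sketch of a gauge-based need model; not The Sims' real mechanics.
    class Sim:
        def __init__(self):
            # Each need is a gauge from 0 (empty, maximal suffering) to 100 (full).
            self.needs = {"hunger": 100.0, "energy": 100.0, "social": 100.0}
            self.decay = {"hunger": 2.0, "energy": 1.0, "social": 0.5}  # per tick

        def tick(self):
            for name in self.needs:
                self.needs[name] = max(0.0, self.needs[name] - self.decay[name])

        def suffering(self) -> float:
            # Driven by the *most* depleted need: one full gauge cannot
            # compensate for another that is running empty.
            return 100.0 - min(self.needs.values())

        def most_urgent(self) -> str:
            return min(self.needs, key=self.needs.get)

    sim = Sim()
    for _ in range(30):
        sim.tick()
    print(sim.most_urgent(), sim.suffering())  # hunger 60.0 -> refill hunger first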

Sims are not sapient and have no cognitive self-awareness. However:

  1. Sims experience “suffering” when their needs go unmet, and if left to their own devices they will attempt (however clumsily) to meet those needs and thus avoid suffering.

  2. Sims are also capable of a (very limited) suite of emotional states or “moods”, where a “mood” is understood as a behavior-altering state triggered by external stimuli that nonetheless persists for a time after the original stimuli have been removed.

Given these two points, I argue that sims are actually sentient. They are almost certainly less sentient than an ant or even a clam, to pick a few examples, but sentient nonetheless.

Sentience isn’t quite enough to be a moral agent, however. Sims are not moral beings, because they have no concept that other sentient beings might exist with their own needs and wants. Sims may “interact” with one another to mutually recharge their “Social” bars, but a sim won’t autonomously cook food for hungry sims if the sim is not itself hungry. Even taking it for granted that a moral agent need not be capable of anticipating the reactions of other moral agents, each sentient sim is wholly unaware that the other sims in the game are also sentient.

So, even within the microcosm of The Sims game universe, sims do not quite have the capacity for moral behavior. The quality that sims lack might be understood as a very simple form of “empathy”. While “empathy” is commonly used to describe the experience of taking another’s suffering or pleasure as one’s own, even the simpler act of recognizing another as capable of suffering and pleasure requires understanding the other as a fellow sentient being with the potential for moral agency; and in recognizing others as moral agents in their own right, one achieves moral agency for oneself.

The Sims Test tells us, then, that a moral agent is (1) sentient and (2) capable of identifying and responding to sentience in others for the purpose of mutual cooperation (if only to avoid causing harm to one another).


Identifying Sentience

How might I identify another being as sentient? We’ve established that sentient beings have needs, but we’ve taken for granted that we can recognize what a need is.

At one extreme, I could look for entities that have exactly the same needs as myself.

  • Pros: Easy to compute
  • Cons: The entities that I’m ignoring may not be ignoring me

At the other extreme, I might start with the assumption that everything is a sentient being, and then look for proof to the contrary that allows me to simplify my reasoning.

  • Pros: I never miss out on a sentient being
  • Cons: I spend a lot of time trying to convince a banana to interact with me

What we really need to do is to identify some good heuristics for breaking the world into categories:

  • Self
  • Deterministic physical processes
  • Random physical processes
  • Chaotic but non-sentient physical processes
  • Other sentient beings

I can identify which parts of the world are a part of myself by testing whether or not I can control them directly via my will. Just as the human brain does, this heuristic will treat tools as “temporary appendages” that are sometimes part of the body and sometimes not. That’s okay.

Deterministic physical processes are usually relatively easy to spot: they keep doing what they do regardless of what I do. They tend to either (a) sit there inertly, or (b) move in a cyclical, repeating fashion. Identifying cyclical behavior is a relatively simple matter of applying the Fourier Transform, which is well-studied in both Classical Mechanics and Computer Science.

(Deterministic processes also have the greatest potential for use as tools, as they are easiest to control.)
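A sketch of that idea using NumPy (the synthetic signals and the threshold are arbitrary): a process whose power spectrum is concentrated in a few frequency bins is behaving cyclically, while a flat spectrum suggests noise.

    # Sketch: flagging cyclical behavior with a Fourier transform.
    import numpy as np

    def looks_cyclical(signal: np.ndarray, threshold: float = 0.5) -> bool:
        """True if most of the (non-DC) power sits in a few frequency bins."""
        spectrum = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
        total = spectrum.sum()
        if total == 0:
            return False  # perfectly constant: inert, not cyclical
        return np.sort(spectrum)[-3:].sum() / total > threshold

    t = np.linspace(0.0, 10.0, 1000)
    pendulum = np.sin(2 * np.pi * 1.5 * t)              # cyclical, near-deterministic
    noise = np.random.default_rng(0).normal(size=1000)  # random

    print(looks_cyclical(pendulum))  # True
    print(looks_cyclical(noise))     # False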

Random physical processes are, by definition, unpredictable. They are also temporary, as they require either stored internal energy or an active energy input. But, while they are unpredictable in the literal sense, the probability distribution of the observed behavior is unchanging. This means that I can statistically predict them, if I average them over time.

Chaotic physical processes are the hardest to separate from sentient beings, simply because sentient beings are a subcategory of chaotic physical processes. A chaotic process is one that sometimes behaves in a seemingly-deterministic manner, and sometimes behaves in a seemingly-random manner, and the behavior at any time depends on the sum of all past physical interactions that have affected the state of the chaotic process.

Before we get to classifying sentient beings, it’s worth taking a moment to appreciate some of the fine details of chaos.

Determinism, chaos, and randomness lie on a continuum, with determinism at one extreme and randomness at the other. Nothing physical is truly deterministic, and very few physical things are truly random, as most physical things will change their behavior in some way if you physically interact with them, and “sensitivity to the environment” is pretty much the definition of chaos. Planetary orbits, for example, are chaotic but tend strongly toward determinism. Fire, on the other hand, is also chaotic but tends toward randomness. Weather, however, is chaotic but doesn’t lie particularly close to either end of the spectrum.

The concept of “physical interaction” itself is a little bit tricky, because if you take all the forces of physics into account, then even the light reflecting off of me, the infrared blackbody radiation of my body heat, the sounds I make, and the odors I give off are all physical interactions that might potentially change the behavior of a chaotic system that I’m trying to observe. However, things that tend toward the extreme ends of the continuum will usually only change their behavior if you touch them, and tend to be insensitive to subtler interactions like these.

Now then. If the primary difference between a sentient chaotic system and a non-sentient chaotic system is the presence of “needs” and “wants”, how again do we identify the existence of needs? When looking at a sentient being as a chaotic system, a need acts as a “fulcrum point”, so to speak: the system is extremely sensitive to changes that affect the need, and it will act to fill the need or to keep it from being depleted.

Here’s a good example:

If you touch a Christmas tree worm living in a coral reef, it retracts into its home because it fears for its physical safety: you made contact with the worm, the worm immediately reacted by retreating, and retreat is a stereotypical behavior in animals responding to a threat. Even if you don’t know what a physical threat is, or that animals use retreat as a stereotypical response to it, the fact that the worm did not permit the contact to continue after it was made suggests that the contact violated one of the worm’s needs. Therefore the worm is likely to be sentient.

If the worm later returns to its original position, that indicates that the retreat was balanced against another need. In this case, we know that the need is feeding, as the worms are (like many marine animals) filter-feeders. But even if you don’t know what the second need is, the absence of your interference led to a second change in behavior – a return to the original position – after a time delay. This strongly reinforces the case that the worm is sentient.
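That two-part observation can be sketched as a heuristic. The event labels, timing window, and scoring below are invented for illustration; the idea is simply that a prompt behavior change after a stimulus hints at a need, and a later reversal once the stimulus is withdrawn hints at a second, competing need.

    # Sketch of the stimulus / response / delayed-return heuristic.
    def sentience_evidence(events, response_window=2.0):
        """events: list of (time, label) tuples, with labels such as
        "stimulus", "behavior_change", "behavior_reverted".
        Returns 0 (no evidence), 1 (responded to stimulus), or
        2 (responded, then reverted after the stimulus was withdrawn)."""
        evidence = 0
        last_stimulus = None
        responded_at = None
        for time, label in events:
            if label == "stimulus":
                last_stimulus = time
            elif label == "behavior_change" and last_stimulus is not None:
                if time - last_stimulus <= response_window:
                    evidence = max(evidence, 1)  # reacted: a need may be at stake
                    responded_at = time
            elif label == "behavior_reverted" and responded_at is not None:
                if time > responded_at:
                    evidence = max(evidence, 2)  # reversal: a competing need
        return evidence

    worm = [(0.0, "stimulus"), (0.1, "behavior_change"), (45.0, "behavior_reverted")]
    rock = [(0.0, "stimulus")]
    print(sentience_evidence(worm))  # 2
    print(sentience_evidence(rock))  # 0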

All biological sentient life that we know of follows certain patterns. Some of this is surely because we are working from only a single planet with a single abiogenesis era as our data, but I’d bet good money that a lot of the behaviors we see in Earth animals will still be apparent when we eventually discover our first non-Earth biosystem. The life may well look very different from our Earth conceptions of life, but evolution is deeper than one biosystem: it’s mathematical and universal, as are the Game Theory scenarios that we considered earlier that led to love and cooperation in some species. There may be niches in behavior-space that have gone unfilled by Earth life, but if so, they must be very difficult for Earth life to access, because Earth has proven to be the home of some very diverse forms of life over the billions of years that life has existed on it. Thus, even in the face of discovering alien life of an unknown nature, we would expect Earth life to provide useful patterns to pre-program any spacefaring AI with, so that it can recognize animal life as sentient more quickly and thus develop a rapport with it more easily. At the very least, to know which creatures to avoid and which ones to attempt cooperation with.


Personality and Variation

One simplifying assumption that one might be tempted to make: if I discover some of the needs of one individual sentient being, then those needs apply universally to all sentient beings, or at least to all sentient beings of the same type.

(“Type” is a very complicated concept to get into, but modern Machine Learning techniques are pretty adept at it, so we’ll leave it unexplored for now.)

It’s not that this assumption is universally or even frequently false; but the cost of getting it wrong can be high, so it needs to be checked. The most vital needs — the ones that cause death most quickly and most directly — tend to be the ones that are the most universal for a “type”, and vice versa. All humans need to breathe, full stop, because carbon-hydrogen-oxygen metabolism is fundamental to our type of life. “Fundamental” both in the sense that it is universally shared among humans, and in the sense that it is vital to human life. One sense usually implies the other.

But other needs may be less universal within the “type”. For example, humans have a category of needs that are collectively called “psychological safety”; the most important of these is that the human feels that their autonomy over their own body, and thus their ability to meet their own needs, is being respected. But how psychological safety manifests in individual humans is as varied as the humans themselves, and as predicted, depriving a human of their psychological safety does not immediately kill them.

One fact of evolution that we haven’t examined yet is that, sometimes, evolution selects for randomness itself. If the environment in which a population of creatures lives is not consistent between individuals — if the genes cannot “predict” which type of environment they will find themselves in in the next generation — then natural selection will drive the genes toward randomly selecting a phenotype (physical variation) during embryogenesis. This is easily accomplished because chemistry — all chemistry, in living beings and otherwise — is a chaotic, stochastic, near-random process. Determinism between gene and phenotype is actually the unusual behavior; genes can simply remove that imposed determinism if it reduces their reproductive potential.

This leads to the phenomenon that we call “personality”, which is far from exclusive to humans. In fact, it isn’t even exclusive to vertebrates.

Most spider species are solitary. Some build a web and wait for prey, and others actively hunt, but very few are capable of positive, mutually-beneficial interactions outside of the mating act itself. Failed interactions usually result in cannibalism. This is probably related to the fact that nearly all spiders are born fully independent and prepared to survive on their own. Parent spiders do not nurture their offspring, and therefore have never had an evolutionary pressure to develop love.

However, there are a relative handful of social spider species in the world, such as Anelosimus studiosus. A. studiosus and other such spiders build a single communal web, in which up to 50 adult spiders live and raise young. Not very surprisingly, these spiders do nurture their offspring. Offspring nurturing seems to be the foundation of all, or nearly all, pro-social behavior in Earth’s animals.

A study of female social spiders found that the adult females have two basic life-long personality types, “aggressive” and “docile”. What’s more, the study found a natural division of labor: “aggressive” females focused on subduing prey and protecting the web, while “docile” females tended to the eggs and hatchlings, even feeding them through regurgitation in much the same way that birds do for their young. The web fared best when both personality types were present: neither type was suited to do it all alone.

The study did not address whether the personalities were inherited by genes or chosen at random by embryological neurochemistry. However, we do have other examples where we know with certainty that randomized neurochemistry is the primary source of variation in personality, if not the only source.

What is the takeaway from this, for those of us who want to build a systematized model of human morality? We previously noted that some needs are shared by “types” that span across many species; we now must note that other needs may be shared by “types” that are smaller than a species, or perhaps even that cross the lines between species without being shared by all individuals in any one member species. “Type” ends up being a very broad term, and we should remember that.


Conclusions

Through a stepwise consideration of Game Theoretic concerns, we found that the Golden Rule arises as a natural consequence of interactions within a population of moral agents, and that the test of moral agency (“The Sims Test”) is being sentient, being able to recognize that other agents are sentient as well, and being able to assess the possibility of cooperation with them based on their predicted responses.

In evolution, such moral agents naturally arise whenever offspring require nurturing from adults.

In AGI, such moral agency would likely have to be programmed in, but the process of doing so does not seem very mysterious. Furthermore, we would not expect moral agency to arise in any AI by accident, except by way of AI evolution through the creation of AI offspring that require nurturing by “adult” AIs.


Footnotes

Footnote 1

This observation, that needs are infinitely more important than wants, rules out Utilitarianism. Utilitarianism holds that there is some conversion factor that allows placing wants and needs onto the same numeric scale, called “utility”. This might make sense if you’re telling a mathematical function what to do, but it doesn’t make sense for humans or other evolved agents. Essentially, Utilitarianism treats everything as a “want”, even things for which the lack will kill you.

Footnote 2

I use the term “self-euthanasia” rather than “suicide” because, as a person who has had suicidal thoughts at multiple times in my life, with varying causes and varying levels of desperation, I believe strongly that most people who commit suicide are doing so because they have a need which is going unmet.

Frequently, the person who wants to commit suicide is not even consciously aware of which need is going unmet; the suffering seems to be coming from nowhere and everywhere at the same time, and the victim of course cannot take action to address that need if they don’t know which need it is. Many times, the need could be addressed if only it were known, which is why I believe that every person should have free access to mental health therapy.

But sometimes, suicide is actually a rational course of action. Assuming that the individual has already judged the suffering to be worse than death, the question at hand is: is there any way to alleviate the suffering without death? In circumstances where it is not, I strongly support euthanasia with no judgement of the victim or of anyone who helps them. Yes, even in cases where the suffering is not caused by a terminal illness that would end in death regardless.

The ethics of euthanasia are a complex topic, however, and are particularly fraught with danger if anyone in a position of power is permitted to recommend or even to suggest or highlight euthanasia to someone who was not already considering it. It’s complicated.