On Stuart Russell's "Safer Rules of Robotics", and Beyond

January 18, 2018

An email came in recently where the sender referenced both Isaac Asimov’s famous Three Laws of Robotics and Stuart Russell’s updated “3 Principles for Creating Safer AI.”

The sender of the email then asked me the following questions:

  1. Is it generally agreed in the field that Asimov’s three rules are in fact downright dangerous? (he implies the rules are dangerous, but never states it explicitly)
  2. Is his program just a restatement in different terms (or a minor tweaking) of something common in the field?
  3. Is implementation of his program at all realistic?
  4. Do the majority of researchers in the field agree on the importance of something like his program?

Straight up: I’m not in the AI field. I mostly work in hydrology and oceanography, and not even in a scientific capacity. I tell people that I’m mostly just a guy who knows how to use tools, and that my tools are programming languages and databases. There are much smarter people than me in any of these fields.

Now that we’ve got past that disclaimer, I’ll share my thoughts on the matter anyway.

The Rules of Asimov and Russell

Asimov himself knew that his three rules were imperfect. The sender of the email says that he never said so explicitly, but Asimov’s stories were basically always some version of these rules gone awry.

For thoroughness’ sake, they are:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

I mean, a middle schooler can see the problems with these. Russell does a good job of pointing out the issues in his talk. He states that a simple task such as “fetching the coffee” would lead a machine to “protect its own existence” in order to fulfill its duty, and that a machine refusing to let a human do something as simple as shut it off could be catastrophic.

I agree with Russell and many contemporaries about one fundamental axiom: People are not going to stop innovating, and therefore we are not going to, as Russell says, “stop doing AI.” It’s in our nature - we tinker and prod and pry and solve problems. We’re not going to just willingly stop doing that.

To mitigate the risk, Russell puts his rules forward:

  1. The robot’s only objective is to maximize the realization of human values.
  2. The robot is initially uncertain about what those values are.
  3. Human behavior provides information about human values.

In his talk, Russell quickly tackles the immediate concern that arises with number 3: that humans can behave horribly. He says that eventually the robots will have the compendium of human knowledge, as well as the sum total of human experience, to draw from, and will therefore produce a positive and altruistic behavior set instead of something nasty and harmful.

My Thoughts

Again: I’m not an expert, but I love thinking and talking about this stuff.

We Don’t Really Know What the Gestalt of Human Behavior Is

Research needs to be done to define what the average human ethical and moral state is, especially if the idea is to imprint it upon our computer systems. There’s no such thing as moral relativism anymore, whether we like it or not. Soon there are going to be robots and underlying systems several orders of magnitude more powerful than not only our individual selves, but our collective human self as a global whole.

It’s easy to believe that people are good - Wikipedia is fairly accurate, we have fewer and fewer car accidents every year, the buildings get built and stay up, and so on and so on. However, war is still a human constant, as are greed and corruption on literal industrial scales.

Do we get to say, “Well, just ignore the war and greed parts, dear robot. Simply focus on helping us achieve our values”? Who said those values were good on average, or even good in the first place?

Robots, In Our Own Image, are Going to Have Emotion and Spirituality

Another major issue that I see is that these rules sidestep the idea that machines will eventually have believable emotional or spiritual experiences, properties that would be necessary in fields like health care and companionship. Your robot dog can’t fulfill its purpose if it doesn’t love you or, put another way, if you don’t believe that it loves you.

Although the idea of the machine expression of emotion is somewhat Kurzweilian, it carries another profound implication when it comes to morality: Evil is carried out not despite, but because of, our morals, ethics, and emotions.

That isn’t to say that all emotions are bad - if emotions create some of our worst problems, then they are also certainly the cause of some of our greatest achievements, even if the manifestations of those emotions are stubbornness and delusion.


Thanks for reading this far into these ramblings. I’ll simply conclude with three major questions:

  1. Is amorality an artifact of intelligence?
  2. If emotion is removed, can you achieve altruism without resentment?
  3. How does one research and calculate what the average set of human values is?

I’d love to hear your thoughts!

Kyle Mathews

This is the website of Mark Robert Henderson. He lives in Cape Ann, works in Cambridge, and plays with distributed apps and tech philosophy online.

Mark's social media presence is slowly and deliberately withering away, so the best way to reach him is via e-mail.