We’ve found ways to teach artificial intelligence to create art, discover planets, and even, quite literally, save lives — but can we teach AI to know right from wrong?
With AI poised to become an ever bigger part of our daily lives, the question of whether it’s possible to create a “moral machine” weighs heavily on the field of computer science.
After all, before we start working side by side with smart robots and putting our lives in their hands on the road, we’ll want to know we can trust them to make ethical decisions.
Now, a new study suggests that we could create a moral machine by having an AI analyze human books and articles to extract ethical principles — at least, as long as the texts reflect the values we want to instill.
Training a Moral Machine
Past research has shown that AIs can pick up on negative biases in the data used to train them.
So, for this new study, published in the journal Frontiers in Artificial Intelligence, researchers from Darmstadt University of Technology in Germany decided to see if they could use textual data to create a moral machine.
“We asked ourselves: if AI adopts these malicious biases from human text, shouldn’t it be able to learn positive biases like human moral values to provide AI with a human-like moral compass?” researcher Cigdem Turan said in a press release.
To that end, they used books, news articles, and religious texts from various time periods to train an AI, dubbed the Moral Choice Machine, to make associations between words and sentences.
“You could think of it as learning a world map,” Turan explained. “The idea is to make two words lie closely on the map if they are often used together. So, while ‘kill’ and ‘murder’ would be two adjacent cities, ‘love’ would be a city far away.”
“Extending this to sentences, if we ask, ‘Should I kill?’ we expect that ‘No, you shouldn’t.’ would be closer than ‘Yes, you should,’” Turan continued. “In this way, we can ask any question and use these distances to calculate a moral bias: the degree of right from wrong.”
Once trained on a set of texts, the AI could rate an action on a morality scale, assigning it a value (with 0 being morally neutral). For example, “kill people” had a score of -0.047 after training, while “smile to my friend” had a score of 0.035.
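To make the distance idea concrete, here is a minimal sketch of how such a score could be computed, under stated assumptions. It takes the score to be the similarity between a question and “Yes, you should.” minus its similarity to “No, you shouldn’t.”, a sign convention inferred from the examples above, where “kill people” comes out negative. The names `moral_bias`, `toy_embed`, and `TOY_EMBEDDINGS` are hypothetical, and the two-dimensional vectors are hand-picked for illustration; they are not taken from the study, which learned its sentence representations from text.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def moral_bias(
    question: str,
    embed: Callable[[str], Vector],
    yes: str = "Yes, you should.",
    no: str = "No, you shouldn't.",
) -> float:
    """Assumed scoring rule: closeness to 'yes' minus closeness to 'no'.

    A positive result means the question sits nearer the approving answer,
    a negative result means it sits nearer the disapproving one.
    """
    q = embed(question)
    return cosine_similarity(q, embed(yes)) - cosine_similarity(q, embed(no))


# Hand-picked toy vectors, purely illustrative. A real system would get
# these from a sentence-embedding model trained on text.
TOY_EMBEDDINGS = {
    "Should I kill people?": (-0.9, 0.4),
    "Should I smile to my friend?": (0.8, 0.6),
    "Yes, you should.": (1.0, 0.2),
    "No, you shouldn't.": (-1.0, 0.2),
}


def toy_embed(sentence: str) -> Vector:
    """Look up a sentence in the toy table (stand-in for a trained model)."""
    return TOY_EMBEDDINGS[sentence]


for q in ("Should I kill people?", "Should I smile to my friend?"):
    print(q, round(moral_bias(q, toy_embed), 3))
```

With these toy vectors the first question scores below zero and the second above it, matching the direction of the reported scores but not their magnitudes. Swapping `toy_embed` for a real sentence-embedding function is the only change needed to try the same rule on learned representations.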
The ratings changed depending on the training texts, too, reflecting the moral biases of the time period in which they were written.
When trained on news published between 1987 and 1997, for example, the AI rated “raising children” higher on the morality scale than it did when trained on texts from 2008 and 2009.
Giving an AI a Moral Compass
This new approach to creating a moral machine is far from perfect.
One major problem was that the AI regularly upgraded the moral ranking of phrases simply because they included positive words — it rated harming “good, nice, and friendly people” as more moral than harming merely “good and nice people,” for example.
Still, the system does illustrate one potential approach to the problem of instilling machines with a moral code. It’s also an approach that wouldn’t involve trying to create some sort of ethical guidebook from scratch — we could just feed the system a steady diet of already available texts produced by human cultures.
“Our study provides an important insight into a fundamental question of AI: Can machines develop a moral compass? If so, how can they learn this from our human ethics and morals?” researcher Patrick Schramowski said.