I have been incredibly frustrated with the lack of quality and content in many responsible AI (RAI) conversations. Almost all the (non-academic) RAI meetings I attended these past 12 months involved speakers repeating words like fairness, accountability, and transparency for essentially the entire meeting, with everyone nodding furiously in agreement about their importance. Frankly, my sense is that not a single person came away any wiser about how to actually build AI (eco-)systems that satisfy those properties. Maybe I am not hanging out with the right crowd, or maybe the topic of responsible AI engineering is just not yet well understood beyond a small niche group of AI research scientists.
I think a couple of books will help lift the quality of RAI conversations, generally and broadly beyond what’s happening in the research community. The one I recommend everyone reads is
- The Alignment Problem: How Can Machines Learn Human Values? by Brian Christian
The key virtue of Christian’s book is that it goes beyond the now well-known ground. It covers the cases of bias built into the data used to train AI systems (the racial bias built into Kodak’s Shirley cards is particularly fascinating to me) and the interesting ways AI systems can fail by blindly pursuing a goal that is not completely specified for corner cases. But Christian goes further and does a good job of describing, in general terms, recent technical advances in how an AI system can flexibly learn a human’s intended (but not explicitly specified) goals or reward functions by imitating and making inferences about human behaviour, given proper shaping such as being encouraged to be curious and knowledge-seeking. This body of research is designed to get around the seeming impossibility of humans writing down a precise mathematical function that states a goal covering all cases in complex scenarios, the canonical example being the difficulty of writing a mathematical function that says precisely “drive this car conservatively in the presence of other cars driven by humans”. I certainly learnt a great deal about all the AI safety and alignment research I missed over the last 5-8 years, and I encourage everyone who thinks about RAI, and especially those who speak publicly about RAI, to read this book.
If you’re hungry for more after reading Christian’s book, here are a few others I have read and would recommend:
- The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns and Aaron Roth
- Human Compatible: AI and the Problem of Control by Stuart Russell
- Misbelief: What Makes Rational People Believe Irrational Things by Dan Ariely
- The Book of Why: The New Science of Cause and Effect by Judea Pearl and Dana Mackenzie
Enjoy and speak and think quality thoughts!