There has been a lot of talk over the past months about AI being our best or worst invention ever. The chance of robots taking over, and the catastrophic sci-fi scenario that would follow, makes the ethical and purposeful design of machines and algorithms not simply important but necessary.
But the problems do not end there. Incorporating ethical principles into our technology development process should not just be a way to prevent the extinction of the human race, but also a way to understand how to use the power that technology gives us responsibly.
This article does not aim to be a guide to AI ethics or to set guidelines for building ethical technologies. It is simply a stream of consciousness about questions and problems I have been thinking about and asking myself, and hopefully it will stimulate some discussion.
Now, let’s go down the rabbit-hole…
I. Data and biases
The first problem everyone raises when speaking about ethics in AI is, of course, data. Most of the data we produce (excluding those coming from the observation of natural phenomena) are artificial creations of our minds and actions (e.g., stock prices, smartphone activity, etc.). As such, data inherit the same biases we have as humans.
First of all, what is a cognitive bias? The (perhaps controversial) way I look at it is that a cognitive bias is a shortcut of our brain that translates into behaviors requiring less energy and thought to implement. So a bias is, at least in principle, a good thing to me. The reason it becomes a bad thing is that the external environment and our internal capacity to think do not proceed pari passu. Our brain gets trapped in heuristics and shortcuts that might have conferred a competitive advantage 100 years ago, but it is not plastic enough to adapt quickly to changes in the external environment (I am not talking about a single brain but about the species as a whole).
In other words, the systematic deviation from a standard of rationality or good judgment (which is how bias is defined in psychology) is, to me, nothing more than an evolutionary lag of our brain.
Why all this excursus? Well, because I think that most of the biases data embed come from our own cognitive biases (at least for data resulting from human rather than natural activity). There is, of course, another class of biases that stems from purely statistical reasons (the expected value of an estimator differs from the true underlying parameter). Kris Hammond of Narrative Science merged those two views and identified at least five different kinds of bias in AI.
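To make the statistical point concrete, here is a minimal sketch of how a selection-biased sample produces an estimator whose expected value drifts away from the true parameter. It assumes only NumPy; the log-normal "income" population and the smartphone-app selection story are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: "incomes" drawn from a log-normal distribution.
population = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)
true_mean = population.mean()

# Representative sampling: every individual is equally likely to be observed.
random_sample = rng.choice(population, size=10_000)

# Selection-biased sampling: high earners are more likely to show up in the data
# (think of data collected only from users of an expensive smartphone app).
weights = population / population.sum()
biased_sample = rng.choice(population, size=10_000, p=weights)

print(f"population mean:    {true_mean:.3f}")
print(f"random-sample mean: {random_sample.mean():.3f}")  # close to the population mean
print(f"biased-sample mean: {biased_sample.mean():.3f}")  # systematically too high
```

No fancy modeling is needed for the distortion to appear: the bias is baked in at collection time, before any algorithm ever sees the data.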
But let’s go back to the problem. How would you solve the biased data issue then?
Simple solution: you can try to remove, ex ante, any data that could bias your engine. Great solution: it will require some effort at the beginning, but it might be feasible.
However, let’s look at the problem from a different angle. I was educated as an economist, so allow me to start my argument with this statement: let’s assume we have the perfect dataset. It is not only all-encompassing but also clean, consistent, and deep, both longitudinally and temporally.
Even in this case, we have no guarantee that an AI won’t autonomously learn the same biases we have. In other words, removing biases by hand or by construction is no guarantee that those biases will not emerge again spontaneously.
We have no guarantee that an AI won’t autonomously learn the same biases we have.
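To see why removing the offending variable ex ante is not enough, here is a minimal sketch in which a model never sees the protected attribute yet reproduces the historical bias through a correlated proxy. It assumes NumPy and scikit-learn; the hiring setup, the proxy feature, and every coefficient are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 20_000

# A protected attribute we deliberately exclude from the training data...
protected = rng.integers(0, 2, size=n)
# ...a seemingly innocent feature strongly correlated with it (e.g., a postcode)...
proxy = protected + rng.normal(0.0, 0.3, size=n)
# ...and historical labels that were themselves biased in favor of one group.
skill = rng.normal(0.0, 1.0, size=n)
hired = (skill + 1.0 * protected + rng.normal(0.0, 0.5, size=n) > 0.5).astype(int)

# Train only on skill and the proxy; the protected attribute is removed ex ante.
X = np.column_stack([skill, proxy])
model = LogisticRegression(max_iter=1000).fit(X, hired)

# The model still treats the two groups very differently, because the proxy
# lets it reconstruct exactly what we tried to remove.
rate_0 = model.predict(X[protected == 0]).mean()
rate_1 = model.predict(X[protected == 1]).mean()
print(f"predicted positive rate, group 0: {rate_0:.2f}")
print(f"predicted positive rate, group 1: {rate_1:.2f}")
```

Scrubbing the dataset of the sensitive column changes nothing here; the bias lives in the labels and the correlations, and the model relearns it on its own.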
This possibility also raises another (philosophical) question. We are building this argument on the assumption that biases are (mostly) bad. So let’s say the machines come up with a result we see as biased, and we therefore reset them and run the analysis again with new data; yet the machines come up with a similarly ‘biased’ result. Would we then be open to accepting it as true and to revising what we consider to be biased?
This is basically a cultural and philosophical clash between two different species.
In other words, I believe two of the reasons why embedding ethics into machine design is extremely hard are that i) we don’t unanimously agree on what ethics is, and ii) we should be open to admitting that our values or ethics might not be completely right, and that what we consider to be biased is not the exception but rather the norm.
Developing a (general) AI is forcing us to think about these problems, and it will change (if it hasn’t already started to) our value system. And perhaps, who knows, we will end up learning something from machine ethics as well.
II. Accountability and trust
Well, now you might think the previous one is a purely philosophical issue that you probably shouldn’t care about. But the other side of the matter is how much you trust your algorithms. Let me give you a more practical perspective on the problem.
Let’s assume you are a medical doctor and you use one of the many algorithms out there to help you diagnose a specific disease or to assist you in treating a patient. 99.99% of the time the computer gets it right: it never gets tired, it has analyzed billions of records, it sees patterns a human eye can’t perceive; we all know this story, right? But what if, in the remaining 0.01% of cases, your instinct tells you the opposite of the machine’s result and you turn out to be right? What if you decide to follow the advice the machine spits out instead of your own and the patient dies? Who is liable in this case?
But even worse: let’s say in that case you follow your gut feeling (we know it is not really a gut feeling, but simply your ability to recognize at a glance the right disease or treatment) and you save the patient. The next time (and patient), you have another conflict with the machine’s result but, emboldened by that recent experience (a hot-hand fallacy or overconfidence bias), you think you are right again and decide to disregard what the artificial engine tells you. Then the patient dies. Who is liable now?
The question is quite delicate indeed, and the scenarios in my head are:
a) a scenario where the doctor is only human, with no machine assistance. The payoff here is that liability clearly stays with him; he gets it right 70% of the time, but responsibilities are clear and sometimes he gets right something extremely hard (the lucky one out of 10,000 patients);
b) a scenario where a machine decides and gets it right 99.99% of the time. The downside is that one unfortunate patient out of 10,000 will die because of a machine error, and liability is not clearly assigned to either the machine or the human;
c) a scenario where the doctor is assisted but has the final call on whether to follow the advice. The payoff here is effectively randomized and not clear to me at all.
As a former economist, I have been trained to be heartless and to reason in terms of expected values and large numbers (basically a utilitarian), so scenario b) looks like the only possible one to me, because it saves the greatest number of people. But we all know it is not that simple (and of course it doesn’t feel right for the unlucky patient in our example). Think, for instance, of an autonomous vehicle that loses control and must decide whether to kill the driver or five random pedestrians (the famous trolley problem). Based on those principles, I’d save the pedestrians, right? But what if all five are criminals and the driver is a pregnant woman? Does your judgement change in that case? And again, what if the vehicle could instantly use cameras and visual sensors to recognize the pedestrians’ faces, connect to a central database, and match them with health records, finding out that they all have some kind of terminal disease? You see, the line is blurring…
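Purely as a back-of-the-envelope utilitarian calculation, using the accuracy figures quoted in the scenarios above (70% for the unassisted doctor, 99.99% for the machine), the expected outcomes per 10,000 patients look like this:

```python
# Back-of-the-envelope expected errors per 10,000 patients,
# using the illustrative accuracy figures from the scenarios above.
PATIENTS = 10_000

doctor_accuracy = 0.70     # scenario a): human alone
machine_accuracy = 0.9999  # scenario b): machine alone

print(f"expected errors, doctor alone:  {PATIENTS * (1 - doctor_accuracy):.0f}")   # ~3,000
print(f"expected errors, machine alone: {PATIENTS * (1 - machine_accuracy):.0f}")  # ~1
# Scenario c) depends on how often the doctor overrides the machine and on how
# good those overrides are, which is exactly why its payoff is unclear.
```

The expected-value argument is overwhelming on paper, which is precisely why the discomfort it leaves behind is an ethical rather than an arithmetical problem.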
The remaining doubt, then, is not simply about liability (and the choice between pure outcomes and the ways to achieve them) but rather about trusting the algorithm (and I know that, for someone who studied for 12 years to become a doctor, that might not be easy to give up). In fact, algorithm aversion is becoming a real problem for algorithm-assisted tasks, and it seems that people want to retain some (even incredibly small) degree of control over algorithms (Dietvorst et al., 2015; 2016).
But above all: are we allowed to deviate from the advice we get from accurate algorithms? And if so, in what circumstances and to what extent?
Are we allowed to deviate from the advice we get from accurate algorithms?
If an AI were to decide on the matter, it would probably also go for scenario b), but we as humans would like to find a compromise among those scenarios because we ‘ethically’ don’t feel any of them to be right. We can then rephrase this issue through the lens of the ‘alignment problem’: the goals and behaviors of an AI need to be aligned with human values, and an AI needs to think like a human in certain cases (but of course the question then becomes how you discriminate between cases, and what the advantage of having an AI is at all; let’s therefore simply stick to traditional human activities).
In this situation, the work done by the Future of Life Institute with the Asilomar Principles becomes extremely relevant.
The alignment problem, in fact, also known as the ‘King Midas problem’, arises from the idea that no matter how we tune our algorithms to achieve a specific objective, we are not able to specify and frame that objective well enough to prevent the machines from pursuing undesirable ways of reaching it. Of course, a theoretically viable solution would be to let the machine maximize our true objective without setting it ex ante, leaving the algorithm free to observe us and infer what we really want (as a species rather than as individuals, which might also entail the possibility of letting itself be switched off if needed).
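The flavour of that solution can be caricatured in a few lines. The sketch below is not CIRL or any published algorithm, just a toy Bayesian update (the "cleaning" actions, reward numbers, and rationality parameter are all invented) in which a machine that is uncertain about its objective infers it from observed human choices instead of having it fixed ex ante:

```python
import numpy as np

actions = ["clean_fast", "clean_carefully"]

# Two candidate objectives the designer might have written down ex ante.
reward = {
    "speed_matters":   {"clean_fast": 1.0, "clean_carefully": 0.2},
    "no_side_effects": {"clean_fast": 0.1, "clean_carefully": 1.0},
}

# The machine starts out unsure which objective the human really has.
belief = {"speed_matters": 0.5, "no_side_effects": 0.5}

def update(belief, observed_action, beta=3.0):
    """Bayesian update, assuming the human picks actions with probability
    proportional to exp(beta * reward) under their true objective."""
    posterior = {}
    for hypothesis, prior in belief.items():
        utilities = np.array([reward[hypothesis][a] for a in actions])
        probs = np.exp(beta * utilities)
        probs /= probs.sum()
        posterior[hypothesis] = prior * probs[actions.index(observed_action)]
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}

# The machine repeatedly observes the human cleaning carefully, not quickly.
for _ in range(3):
    belief = update(belief, "clean_carefully")
    print(belief)
# Belief shifts toward "no_side_effects" without anyone hard-coding that objective.
```

The point is only that the objective becomes something the machine stays uncertain about and keeps learning, rather than a hard-coded target it will pursue at any cost.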
Sounds too good to be true? Well, maybe it is. I fully agree with Nicholas Davis and Thomas Philbeck of the WEF, who wrote in the Global Risks Report 2017:
“There are complications: humans are irrational, inconsistent, weak-willed, computationally limited and heterogeneous, all of which conspire to make learning about human values from human behaviour a difficult (and perhaps not totally desirable) enterprise”.
What the previous section implicitly suggested is that not all AI applications are the same, and that error rates affect different industries differently. Under this assumption, it might be hard to draw a line and design an accountability framework that neither penalizes applications with weak impact (e.g., a recommendation engine) nor underestimates the impact of others (e.g., healthcare or AVs).
We might then end up designing multiple accountability frameworks to justify algorithmic decision-making and mitigate negative biases.
Certainly, the most straightforward way to understand who owns the liability for a certain AI tool is to think in terms of the following threefold classification:
There is no easy answer, and much more is needed to tackle this issue, but I believe a good starting point has been provided by Sorelle Friedler and Nicholas Diakopoulos. They suggest considering accountability through the lens of five core principles: responsibility, explainability, accuracy, auditability, and fairness.
III. AI usage and the control problem
Everything we have discussed so far rests on two implicit assumptions we have not yet examined. The first is that everyone is going to benefit from AI and that everyone will be able, and in a position, to use it.
This might not be completely true, though. Many of us will benefit indirectly from AI applications (e.g., in medicine, manufacturing, etc.), but in the future we might live in a world where only a handful of big companies drive the AI supply and offer fully functional AI services, which might not be affordable for everyone and, above all, might not be super partes.
AI democratization versus centralized AI is a policy concern we need to sort out today: if, on the one hand, democratization increases both the benefits and the rate of development but comes with all the risks of system collapse as well as malicious use, centralization might be safer but also partial, i.e., not super partes.
Should AI be centralized or for everyone?
The second assumption is that we will be forced to use AI, with no choice whatsoever. This is not a minor problem, and we will need a higher degree of education about what AI is and what it can do for us, so as not to be misled by other humans. If you remember the healthcare example described earlier, this could also be a way to partially solve some problems in the accountability sphere: if the algorithm and the doctor give contradictory opinions, you should be able to choose whom to trust (and accept the consequences of that choice).
The two assumptions described above lead us to another problem in the AI domain, the control problem: if AI is centralized, who will control it? And if it is not, how should it be regulated?
I wouldn’t be comfortable at all empowering any government or existing public entity with such power. I might be slightly more favorable to a big tech company, but even this solution comes with more problems than advantages. We might then need a new, impartial organization to decide how and when to use an AI, but history teaches us we are not that good at forming large impartial institutional players, especially when the stakes are so high.
Regarding AI decentralization, instead, regulation should be strict enough to deal with cases such as AI-to-AI conflicts (what happens when two AIs made by different players conflict and give different outcomes?) or the ethical use of a certain tool (a few companies are starting their own AI ethics boards), but not so strict as to prevent research and development or full access for everyone.
I will conclude this section with a final question. I strongly believe there should be a sort of ‘red button’ to switch off our algorithms if we realize we can no longer control them. The question, however, is: to whom would you grant this power?
IV. AI safety and catastrophic risks
As soon as AI becomes a commodity, it will also be used maliciously. This is a virtual certainty. And the value alignment problem showed us that we might get into trouble for a variety of reasons: because of misuse (misuse risks), because of some accident (accident risks), or because of other kinds of risk.
But above all, no matter which risk we face, it seems that AI is dominated by some sort of exponential, chaotic underlying structure, and getting even minor things wrong could have catastrophic consequences. This is why it is paramount to understand every minor nuance and to address them all, without underestimating any potential risk.
Amodei et al. (2016) actually dug deeper into this and drafted a set of five core problems in AI safety: avoiding negative side effects, avoiding reward hacking, scalable oversight, safe exploration, and robustness to distributional shift.
This is a good categorization of AI risks, but I’d like to add interaction risk as a fundamental one as well, i.e., the risk arising from the way we interact with machines. This relationship could be beneficial (see the Paradigm 37–78) but comes with several risks too, for instance the so-called dependence threat: a highly visceral dependence of humans on smart machines.
One final piece of food for thought: we are all advocating for full transparency of the methods, data, and algorithms used in decision-making. I would invite you to consider, though, that full transparency comes with a serious risk of greater manipulation. I am not referring simply to cyber attacks or ill-intentioned activities, but more generally to the idea that once the rules of the game are clear and the processes reproducible, it becomes easier for anyone to hack the game itself.
Maybe companies will have specific departments in charge of influencing their own or their competitors’ algorithms, or there will be companies whose only purpose is to alter data and final results. Just think about that…
Bonus Paragraph: 20 research groups on AI ethics and safety
There are plenty of research groups and initiatives, both in academia and in industry, that have started thinking about the relevance of ethics and safety in AI. The best known are the following 20, in case you would like to have a look at what they are doing:
Finally, Google has just announced the ‘People + AI Research’ (PAIR) initiative, which aims to advance the research and design of people-centric AI systems.
Conclusion
Absurd as it might seem, I believe ethics is a technical problem. Writing this post, I realized how little I know and understand about these topics. It is incredibly hard to have a clear view of and approach to ethics in general, let alone at its intersection with AI and technology. I didn’t even touch upon other questions that should keep AI experts up at night (e.g., unemployment, security, inequality, universal basic income, robot rights, social implications, etc.), but I will in future posts (any feedback would be appreciated in the meantime).
I hope your brain is melting down like mine is right now, and that some of the arguments above stimulated some thinking or ideas for new solutions to old problems.
I am not concerned about robots taking over or Skynet terminating us all, but rather about humans improperly using technologies and tools they don’t understand. I think the sooner we clear up our minds on these subjects, the better it will be.
References
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D. (2016). “Concrete Problems in AI Safety”. arXiv:1606.06565v2.
Dietvorst, B. J., Simmons, J. P., Massey, C. (2015). “Algorithm aversion: People erroneously avoid algorithms after seeing them err”. Journal of Experimental Psychology: General 144(1): 114–126.
Dietvorst, B. J., Simmons, J. P., Massey, C. (2016). “Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them”. Available at SSRN: https://ssrn.com/abstract=2616787 or http://dx.doi.org/10.2139/ssrn.2616787