Been There, Done That: What Data Science Can Learn from Psychology
by Kim Darnell on September 20, 2018
In the wake of recent revelations regarding misuses and abuses of personal data by a variety of well-known and successful companies, as well as the growing evidence that big data are being used to actively perpetuate and increase socioeconomic inequality, it might seem like data science as a discipline has wandered so far down a dark ethical path that there is no clear map to recovery.
As a professor of psychology and a data scientist, however, I see something very different: a young field that is still trying to figure out how to maximize its potential in a fast-paced, dynamic world while still following a stable, practical moral compass. Psychology was there once, too, performing highly controversial studies, such as the Milgram experiment, which showed everyday Americans that, like their counterparts in Nazi Germany, they would engage in the potentially life-threatening torture of strangers if instructed to do so by an authority figure. Another was the Stanford prison experiment, which revealed that even the most privileged among us can become predators or prey at the flip of a coin when placed in a prison environment.
Today, data scientists and those who employ them are struggling publicly, if not painfully, to find the right balance between getting the data they want and respecting the rights of those they get the data from. Fortunately, psychology can offer a detailed and time-tested framework for making that struggle less difficult.
In the United States, any licensed psychologist or employee of a training program approved by the American Psychological Association (APA) is bound by the Ethical Principles of Psychologists and Code of Conduct, also known as the APA Code of Ethics. This code is centered on five principles that are intended to “guide and inspire … toward the very highest ethical ideals of the profession.” They include A) Beneficence and Nonmaleficence, B) Fidelity and Responsibility, C) Integrity, D) Justice, and E) Respect for People’s Rights and Dignity. Taken together, these principles and the rules they give rise to govern psychologists’ behavior in all areas of professional practice (e.g., therapy, research, education, public service) and describe how we must:
- resolve conflicts of interest among our domains of practice, and between that practice and the law;
- interact with our colleagues and clients in a way that guarantees they understand what we are doing, why we are doing it, and what we think the consequences of our collective actions might be;
- define the limits of our professional competence, as well as that of our colleagues and clients;
- and address any mistrust or harmful effects that arise from our professional conduct, and do so in a meaningful and timely fashion.
Of course, psychology’s model is not the only plausible source for a data science code of ethics. Data for Democracy, for example, has attempted to crowdsource a code of ethical conduct from the data science community itself, an effort supported by former U.S. Chief Data Scientist and data ethics evangelist D.J. Patil. Others have proposed a data science code of ethics based on the Hippocratic Oath or the code of ethics of the National Association of Social Workers. Each of these approaches has its strengths and weaknesses, but none seems to offer the comprehensive perspective that the APA Code of Ethics does.
However we ultimately choose to resolve the crafting of a data science code of ethics, there are a few things we can be sure of. First, we as data scientists need to break the bad habit of asking for forgiveness rather than permission. If we don’t, the general public will become so mistrustful of us that they refuse to provide us with the honest and representative data we need to do our jobs well. Second, we need to avoid falling prey to the entropy of procrastination. Otherwise, we will find our own code of ethics defined for us piecemeal by various government entities, the majority of which have members who know far less about the ethics of human subjects research and data science technology than they do about their current polling numbers and chances for re-election.
Psychology as a discipline runs the gamut from social science to biological science, and thus has constructed its code of ethical conduct to function effectively in diverse intellectual, cultural, and professional contexts. Given that data science is facing a comparably Herculean and closely related task, it seems both reasonable and efficient for our young discipline to take advantage of the insight that psychology can offer and base our own code of ethics on its well-validated model.