Human versus Machine
Can the struggle for better decision-making apparatuses prove to be a forum for partnership?
By Brian Neesby | March 10, 2019

“Man versus Machine” has been a refrain whose origins are lost to history—perhaps it dates back to the Industrial Revolution, perhaps to John Henry and the steam drill. Searching the reams of books in Google’s archives, the first mention of the idiom appears to hail from an 1833 article in the New Anti-Jacobin Review. Authorship is credited, posthumously, to Percy Bysshe Shelley; the editor was his cousin Thomas Medwin. Both poets are famous in their own right, but Shelley’s second wife, Mary Shelley, is probably more renowned. Personally, I choose to believe that the author of Frankenstein herself coined the phrase.

Not only must the phrase be updated for modern sensibilities—take note of the blog’s gender-agnostic title—but the debate itself must be reimagined. Our first concerns were over who was best at certain strategic, memory, or mathematical tasks. The public watched as world chess champion Garry Kasparov beat IBM’s Deep Blue in 1996, only to be conquered by the computer just one year later, when the machine could evaluate 200 million chess moves per second. I think in modern times we can safely say that machines have won. In 2011, Watson, an artificial intelligence named after IBM’s founder, Thomas J. Watson, soundly beat Jeopardy! champions Ken Jennings and Brad Rutter in the classic trivia challenge; it wasn’t close. But do computers make better decisions? They certainly make faster decisions, but are they substantively better? The modern debate over these first “thinking computers” centers on the use of automated decision-making, especially decisions that affect substantive rights.

Automated Decision Making

One does not have to look far to find automated decision-making gone awry. Some decisions are not about rights, per se, but they can still have far-reaching consequences.

  • Beauty.AI, a deep-learning system supported by Microsoft, was programmed to use objective factors, such as facial symmetry and lack of wrinkles, to identify the most attractive contestants in beauty pageants. It was used in 2016 to judge an international beauty contest with over 6,000 participants. Unfortunately, the system proved racist; its algorithms equated beauty with fair skin, despite the numerous minority applicants. Alex Zhavoronkov, Beauty.AI’s chief science officer, blamed the system’s training data, which “did not include enough minorities.”
  • Under the guise of objectivity, a computer program called the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) was created to rate a defendant on the likelihood of recidivism, particularly of the violent variety. The verdict—the algorithm earned high marks for predicting recidivism in general, but with one fundamental flaw: it was not color-blind. Black defendants who did not commit crimes over the next two years were nearly twice as likely as their White counterparts to be misclassified as higher risk. The inverse was also true: White defendants who reoffended within the two-year period had been mislabeled low risk approximately twice as often as Black offenders.
  • Washington, DC introduced an algorithm to assess teacher performance in 2009; 206 teachers were subsequently terminated on the basis of its scores. Retrospective analysis eventually showed that the program had disproportionately weighted a small number of student test results, and that other teachers had gamed the system by encouraging their students to cheat. At the time, the district could not explain why excellent teachers had been fired.
  • A Massachusetts resident had his driver’s license suspended when a facial recognition system mistook him for another driver, one who had been flagged in an antiterrorist database.
  • Algorithms at airports inadvertently classify over a thousand travelers a week as terrorists. One American Airlines pilot was detained eighty times within a single year because his name was similar to that of an Irish Republican Army (IRA) leader.
  • An Asian DJ was denied a New Zealand passport because his photograph was processed automatically; the algorithm decided that his eyes were closed. The victim was gracious: “It was a robot, no hard feelings,” he told Reuters.

Human Decision-Making Is All Too “Human”

Of course, one could argue that the problem with biased algorithms is the humans themselves: algorithms merely entrench existing stereotypes and biases. Put differently, do algorithms amplify existing prejudice, or can they be a corrective? Unfortunately, human decision-makers do not fare much better than their robotic counterparts. Note the following use cases and statistics:

  • When researchers studied parole decisions, the results were surprising. A prisoner’s chance of being granted parole was heavily influenced by the timing of the hearing – specifically, its proximity to the judge’s lunch hour. Roughly 65% of cases heard early in the morning session were granted parole. The rate fell precipitously over the next couple of hours, occasionally to 0%, and returned to 65% once the ravenous referee had been satiated. Late afternoon hours brought a resurgence of what Daniel Kahneman calls decision fatigue.
  • College-educated Black Americans are twice as likely to face unemployment as other college graduates.
  • One study reported that applicants with White-sounding names received callbacks 50% more often than applicants with Black-sounding names, even when identical resumes were submitted to prospective employers.
  • A 2004 study found that when police officers were handed a series of pictures and asked to identify faces that “looked criminal”, they chose Black faces more often than White ones.
  • Black students are suspended three times more often than White students, even when controlling for the type of infraction.
  • Black children are 18 times more likely than White children to be sentenced as adults.
  • The Michigan State Law Review presented the results of a simulated capital trial. Participants were shown one of four simulated trial videotapes. The videos were identical except for the race of the defendant and/or the victim. The research participant turned juror was more likely to sentence a Black defendant to death, particularly when the victim was White. The researchers’ conclusion speaks for itself: “We surmised that the racial disparities that we found in sentencing outcomes were likely the result of the jurors’ inability or unwillingness to empathize with a defendant of a different race—that is, White jurors who simply could not or would not cross the ’empathic divide’ to fully appreciate the life struggles of a Black capital defendant and take those struggles into account in deciding on his sentence.”

At this point, dear reader, your despair is palpable. Put succinctly, society has elements that are bigoted, racist, misogynistic – add your ‘ism’ of choice – and humans, and the algorithms created by humans, reflect that underlying reality. Nevertheless, there is reason for hope. I shared the litany of bad decisions attributable to humans acting without the aid of artificial intelligence to underscore that humans are just as prone to making unforgivable decisions as their robotic counterparts. Still, I contend that automated decision-making can be an important corrective for human frailty. As a data scientist, I might be biased in this regard – according to Kahneman, this would be an example of my brain’s self-serving bias. I think the following policies can marry the benefits of human and automated decision-making for a truly cybernetic solution – if you’ll permit me to misuse that metaphor. Here are some correctives that can be applied to automated decision-making to provide an effective remedy for prejudiced or biased arbitration.

  • Algorithms should be reviewed by government and nonprofit watchdogs. I am advocating turning over both the high-level logic and the source code to the proper agency. There should be no doubt that government-engineered algorithms require scrutiny, since they involve articulable rights. A citizen’s Sixth Amendment right to face their accuser would alone necessitate this, even if the accuser in this case is an inscrutable series of 1s and 0s. Nevertheless, I think that corporations could also benefit from such transparency, even if it is not legally coerced. If a trusted third-party watchdog or government agency has vetted a company’s algorithm, the good publicity – or, more likely, the avoidance of negative publicity – could be advantageous. The liability of possessing a company’s proprietary algorithm would need to be addressed: if a nonprofit agency’s security were compromised, damages would likely be insufficient to remedy the company’s potential loss. Escrow companies routinely take on such liability, but usually not for clients as big as Google, Facebook, or Amazon. The government might provide some assistance here by guaranteeing damages in the case of a security breach.
  • There also need to be publicly accessible descriptions of company algorithms. The level of transparency offered to the general public cannot be expected to be as exhaustive as the disclosure described above; such transparency should not expose proprietary information, nor permit the system to be gamed in any meaningful way.
  • Human review should be woven into the process. A good rule of thumb: automation may preserve rights or other endowments, but rights, contractual agreements, or privileges should only be revoked after human review. Human review, by definition, necessitates some diminution in privacy; this should be weighed appropriately.
  • Statistical review is a must. The search for discriminatory effects can be used to continually adjust and correct algorithms so that bias does not inadvertently creep in; a minimal sketch of what such a check might look like follows this list.
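
To make that last bullet concrete, here is a minimal sketch of what such a statistical review might look like, in the spirit of the COMPAS findings above. Everything in it is hypothetical: the records, the field names (group, predicted, actual), and the tolerance for an acceptable gap between groups.

    # Minimal sketch of a statistical bias review: compare false positive rates
    # across demographic groups for a binary risk classifier. All data, field
    # names, and the tolerance below are hypothetical.
    from collections import defaultdict

    # Each record: the group a person belongs to, the model's prediction
    # (1 = flagged high risk), and the observed outcome (1 = reoffended).
    records = [
        {"group": "A", "predicted": 1, "actual": 0},
        {"group": "A", "predicted": 0, "actual": 0},
        {"group": "A", "predicted": 1, "actual": 1},
        {"group": "B", "predicted": 0, "actual": 0},
        {"group": "B", "predicted": 1, "actual": 0},
        {"group": "B", "predicted": 1, "actual": 0},
    ]

    def false_positive_rates(rows):
        """Return {group: FPR}, where FPR = flagged-but-did-not-reoffend / did-not-reoffend."""
        false_positives = defaultdict(int)
        actual_negatives = defaultdict(int)
        for r in rows:
            if r["actual"] == 0:
                actual_negatives[r["group"]] += 1
                if r["predicted"] == 1:
                    false_positives[r["group"]] += 1
        return {g: false_positives[g] / n for g, n in actual_negatives.items() if n}

    rates = false_positive_rates(records)
    print("False positive rate by group:", rates)

    # Flag the model for human review if the gap between groups exceeds a
    # (hypothetical) tolerance, the kind of trigger a watchdog might codify.
    MAX_GAP = 0.10
    if max(rates.values()) - min(rates.values()) > MAX_GAP:
        print("Disparity exceeds tolerance; route the model back to human reviewers.")

In practice the same review would cover false negative rates, calibration, and subgroup sample sizes, but the shape of the check stays the same: measure outcomes by group, compare them against an agreed tolerance, and send violations to the human reviewers described above.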

One final problem presents itself. Algorithms, especially those based on deep learning techniques, can be so opaque that their decisions become difficult to explain. Alan Winfield, professor of robot ethics at the University of the West of England, is leading a project to solve this seemingly intractable problem. “My challenge to the likes of Google’s DeepMind is to invent a deep learning system that can explain itself,” Winfield said. “It could be hard, but for heaven’s sake, there are some pretty bright people working on these systems.” I couldn’t have said it better. We want the best and the brightest humans working not only to develop algorithms that get us to spend our money on merchandise, but also to develop algorithms that protect us from the algorithms themselves.
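
Until such systems can explain themselves, reviewers often fall back on model-agnostic probes. The sketch below illustrates one such probe, permutation importance: scramble one input at a time and watch how much the model’s behavior changes. The opaque_model function, its features, and the data are hypothetical stand-ins, not anything from the systems discussed above.

    # Minimal sketch of a model-agnostic probe: permutation importance.
    # The "model" and data are hypothetical stand-ins for an opaque system;
    # the point is the technique, not the model.
    import random

    random.seed(0)

    # Hypothetical loan model: (income, debt, postcode_flag) -> approved?
    def opaque_model(income, debt, postcode_flag):
        # Pretend this is a black box; internally it leans on income and,
        # problematically, on the postcode flag.
        score = 0.5 * income - 0.3 * debt + 2.0 * postcode_flag
        return 1 if score > 5.0 else 0

    data = [(random.uniform(0, 20), random.uniform(0, 10), random.randint(0, 1))
            for _ in range(500)]
    decisions = [opaque_model(*row) for row in data]  # the model's own decisions

    def agreement(rows, reference):
        """Share of rows where the model still makes the same decision."""
        return sum(opaque_model(*r) == d for r, d in zip(rows, reference)) / len(rows)

    baseline = agreement(data, decisions)  # 1.0 by construction

    # Shuffle one column at a time; a big drop means that feature drives decisions.
    for i, name in enumerate(["income", "debt", "postcode_flag"]):
        column = [row[i] for row in data]
        random.shuffle(column)
        permuted = [row[:i] + (v,) + row[i + 1:] for row, v in zip(data, column)]
        drop = baseline - agreement(permuted, decisions)
        print(f"{name}: decision change when scrambled = {drop:.3f}")

A large drop for something like the postcode flag, which often proxies for protected characteristics, is exactly the kind of signal the watchdogs and statistical reviews described above would want surfaced.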

Sources:
https://www.theguardian.com/technology/2017/jan/27/ai-artificial-intelligence-watchdog-needed-to-prevent-discriminatory-automated-decisions
https://www.marketplace.org/2015/02/03/tech/how-algorithm-taught-be-prejudiced
https://humanhow.com/en/list-of-cognitive-biases-with-examples/
https://www.forbes.com/sites/markmurphy/2017/01/24/the-dunning-kruger-effect-shows-why-some-people-think-theyre-great-even-when-their-work-is-terrible/#541115915d7c
https://www.pnas.org/content/108/17/6889
https://deathpenaltyinfo.org/studies-racial-bias-among-jurors-death-penalty-cases
