Twitch and the U.S. Military – an Ethical Issue
By Harinandan Srikanth | November 27, 2020

Image: How do we address the ethics or lack thereof in the military recruiting minors and young adults on Twitch when many of the representatives in Congress have never heard of Twitch?
(The Verge)

This was the frustration expressed by Congresswoman Alexandria Ocasio-Cortez of New York's 14th district after the House of Representatives voted down her amendment to ban the U.S. Army from recruiting on Twitch. The draft amendment to the House Appropriations Bill, proposed on July 22nd, "would ban U.S. military organizations from using funds to 'maintain a presence on Twitch.com or any video game, e-sports, or live-streaming platform'" (Polygon.com). Twitch is a subsidiary of Amazon that specializes in video live streaming. The platform primarily hosts gaming channels but also carries other kinds of content. Over the past decade, Twitch has grown into the leading platform for online gaming, surpassing YouTube Gaming's audience in recent years.

With 72% of men and 49% of women ages 18 to 29 engaging in gaming as a source of entertainment, the U.S. Military saw prominent live-streaming platforms like Twitch as an opportunity to recruit from "Gen Z". The U.S. Army launched its esports team in 2018, receiving 7,000 applicants for 16 spots, with team members streaming war-related games on "Twitch, Discord, Rivals, Mixer, and Facebook" (Military.com). The Army primarily uses fake prize giveaways on its esports channels to direct viewers to its recruitment page (Polygon.com). The number of recruiting leads has grown rapidly, from 3,500 last year to 13,000 this year. The U.S. Navy and U.S. Air Force followed suit in recruiting gamers on live-streaming platforms (Military.com).

There was, however, an exception to the U.S. Military's embrace of recruitment via online gaming: the U.S. Marine Corps. Last year, the Marine Corps Recruiting Command wrote that it would "not establish eSports teams or create branded games… due in part to the belief that the brand and issues associated with combat are too serious to be 'gamified' in a responsible manner" (Military.com). Representative Ocasio-Cortez echoed this concern in another tweet:

Image: AOC justifying her legislation to ban the military from recruiting on Twitch (Polygon.com).

This tweet highlights the danger that the U.S. Army's recruitment via fake prize giveaways on esports channels could lead minors and young adults to conclude that serving in the armed forces is as easy as playing a war-related game. Sgt. 1st Class Joshua David, deployed as a Green Beret, says that reality could not be more different from the game. According to Sgt. 1st Class Christopher Jones, "He'll tell every single person that we engage with that there's no comparison between the two. There's no way soldiers are going to carry 90 pounds' worth of equipment moving in an environment like that, essentially superhuman. You know, these environments are made up; they're fictional" (Military.com). This method of recruitment also raises an informed consent issue. If minors and young adults who are led to the U.S. Army's recruitment page via Twitch and similar platforms come away with the impression that military service is just like playing games, then they are not choosing to sign up with an accurate understanding of what being in the Army is actually like.

On the flip side, Allen Owens, deputy chief marketing officer for Navy Recruiting Command, says that esports also has the potential to enlighten young people about the realities of military service. If, for example, an aircraft mechanic is good at a shooter game and the person they're playing with asks whether shooting is their specialty in the military, the mechanic can explain what their real job is and that being good at shooting in real life is completely different from being good at it in a game (Military.com). While those possibilities are on the horizon, however, there are steps that both the U.S. Military and live-streaming companies need to take to resolve the ethical issues presented by recruitment via platforms like Twitch.

References

1. “After impassioned speech, AOC’s ban on US military recruiting via Twitch fails House vote”. The Verge. https://www.theverge.com/2020/7/30/21348451/military-recruiting-twitch-ban-block-amendment-ocasio-cortez
2. “Amendment would ban US Army from recruiting on Twitch”. Polygon. https://www.polygon.com/2020/7/30/21348126/twitch-military-ban-alexandria-ocasio-cortez-aoc-law-congress-amendment-army-war-crimes
3. “As Military Recruiters Embrace Esports, Marine Corps Says it Won’t Turn War into a Game”. Military.com. https://www.military.com/daily-news/2020/05/12/military-recruiters-embrace-esports-marine-corps-says-it-wont-turn-war-game.html

 

Should we regulate apps like we do addictive drugs?
By Blake Allen | Nov. 26, 2020


Image source: Rosyscription.com

You pull out your phone at midnight; there's the familiar buzz of a notification. Waking the screen and loading the app, you see that someone tagged a friend at a party you weren't invited to. You get a sinking feeling… maybe they forgot? You press on, reading, scrolling, and liking, desperate for something you can't quite find. Eventually you give up. That's when you notice it's 3 a.m. Shocked at how much precious sleep you just wasted, you wonder: how did it get to this?

While the above story is fictionalized, for many people it is all too real. As our phones and technology become more sophisticated, the apps we use are becoming more addictive… and this is by design [1]. Addictive technology is software that attempts to hijack normal user behavior through subtle manipulation of our innate reward systems. Trying to create a product that consumers use over and over again is nothing new; the original Coca-Cola recipe, for example, contained cocaine [6]. As our societies progressed, many highly addictive substances were outlawed and controlled for our safety. Is it time for technology to receive the same treatment?

Technology addiction is estimated to affect between 1.5% and 8.2% of individuals [2]. In a country like the US, this could correspond to roughly 3 to 20 million people. Despite this shockingly high number of affected individuals, there has been no formal governmental response to the issue of internet addiction. In fact, there is debate as to whether the Diagnostic and Statistical Manual of Mental Disorders (DSM) should even seek to define internet addiction. The American Society of Addiction Medicine (ASAM) recently released a new definition of addiction as a chronic brain disorder, officially proposing for the first time that addiction is not limited to substance use. [3]
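As a rough sanity check of those numbers (assuming a US adult population of about 250 million, which is my own assumption and not a figure from the cited study), the prevalence range maps onto the millions quoted above:

```python
# Back-of-the-envelope check of the prevalence range quoted above.
# The 250 million adult-population figure is an assumption for illustration only.
adults = 250_000_000
low_rate, high_rate = 0.015, 0.082
print(f"{low_rate * adults / 1e6:.1f} to {high_rate * adults / 1e6:.1f} million affected")
# -> roughly 3.8 to 20.5 million
```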

How to classify technological addiction?
What distinguishes someone who is merely highly engaged from someone who is addicted? Internet addiction can be summarized as the inability to control the amount of time spent interfacing with digital technology, withdrawal symptoms when not engaged, a diminishing social life, and adverse work or academic consequences. [1] This may not describe you, but how many people who define themselves as 'influencers' could it describe? In fact, phone addiction has become so prevalent that a new word was coined: "phubbing," short for phone snubbing, the act of ignoring another person to look at one's phone. [7]

How technology gets us hooked:
How did addictive technology even get created in the first place? Addictive apps such as Instagram employ what is known as the hooked model, aptly described by Nir Eyal in his book "Hooked: How to Build Habit-Forming Products." [4] In it, Eyal describes a four-step cycle: a trigger, which causes an action, followed by a variable reward and an investment by the user (a toy sketch of the loop follows the list below).
The trigger, often a push notification, interrupts our daily life and sends us down a distracting rabbit hole that may consume precious hours of our day. It could be argued that social media is actually manipulating us to take action. This manipulation is being codified, studied, and amplified through machine learning, which can scale it in an unprecedented way. The end result is that each year our technology learns exactly what we like and how to press our buttons in order to increase engagement.

1. Trigger – External or internal cues that prompt certain behavior
2. Action – Use of the product, based on ease of use and motivation
3. Variable Reward – The reason for product use, which keeps the user engaged
4. Investment – A useful input from the user that commits him to go through the cycle again
Source: “Hooked” by Nir Eyal [4]
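To make the cycle concrete, here is a minimal, purely illustrative simulation of the four steps. The variable names and numbers are my own assumptions, not anything from Eyal's book; the only point it demonstrates is that an unpredictable reward plus a small "investment" each round keeps the loop turning.

```python
import random

def hook_cycle(state, rounds=10):
    """Toy simulation of the trigger -> action -> variable reward -> investment loop."""
    for _ in range(rounds):
        # 1. Trigger: an external cue (e.g. a push notification) prompts the user to open the app.
        state["sessions"] += 1
        # 2. Action: the low-effort behavior the product is optimized for (scrolling).
        state["scrolls"] += random.randint(5, 30)
        # 3. Variable reward: sometimes something novel appears, sometimes not;
        #    the unpredictability is what makes the loop compelling.
        if random.random() < state["reward_probability"]:
            state["rewards"] += 1
        # 4. Investment: liking or posting feeds the system data, which makes
        #    the next trigger slightly more likely to land.
        state["reward_probability"] = min(0.9, state["reward_probability"] + 0.02)
    return state

print(hook_cycle({"sessions": 0, "scrolls": 0, "rewards": 0, "reward_probability": 0.3}))
```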

Who is most at risk?
While most individuals likely have a handle on their technology use, which users are most likely to fall victim to addictive technology? In a study done with rats, it was shown that rats preferred social interaction to highly addictive substances such as heroin and methamphetamine. [5] This is quite surprising and leads to an interesting conclusion: the reward of social engagement can be more motivating than extremely addictive substances. Applying this to humans, one could argue that individuals who are socially isolated are at higher risk for all forms of addiction, including internet addiction.

Ethical concerns:
If a user is psychologically dependent on a technology, are they in fact being manipulated by that technology? I argue that addictive technology is manipulation, as it hijacks internal reward systems in order to create a habitual activity that benefits a private company (e.g., Facebook, Twitter, Netflix). These companies have a perverse incentive to make the most addictive technologies, because addictiveness corresponds directly to a larger bottom line: the more addictive their technology, the more successful they are. With a complete lack of regulation there is no incentive to rein this in; in fact, companies that don't employ addictive design may be at a disadvantage in the marketplace.

Inherent in addictive technology is a lack of informed consent. On the surface, each social media platform seems to provide a simple service: connecting with friends. What often lurks beneath the surface are highly sophisticated machine learning systems learning exactly which buttons to press to get you to engage with the service. The average user is not informed about this, nor aware that it is going on behind the scenes. If users were shown a warning label similar to, say, the ones found on cigarette packaging, they might have a better understanding of the potential harms of using such a service.

In addition, the algorithms driving user interaction can produce a net negative experience for an individual. For instance, if a user feels they aren't beautiful enough, the application may feed that insecurity because it is a strong driver of engagement. Activity goes up, but over time the constant reinforcement of those insecurities takes a growing toll on the user's mental health. This becomes a runaway process that could lead to disastrous consequences if left unchecked.

Mitigation / Management:
Addictive technology is a fairly new field, but it draws on years of psychological research that classifies and codifies what drives user behavior. One could codify these addictive elements and either put safeguards in place or outlaw them outright in order to protect consumers. Additional outreach could be made to individuals whose technology use exceeds a certain threshold, engaging them and promoting social interaction, which could lessen the pull of addictive technology.

What is important is that we as a society understand the root causes of addiction and treat this as a mental health issue. If safeguards can be put in place, then it's possible to have a meaningful and safe relationship with our applications that doesn't lead us down the rabbit hole of addiction. Furthermore, we should penalize companies that employ these addictive techniques without oversight. There is far too much at stake for our mental health if these companies are left unchecked.

A question for the reader: do you feel like you're addicted to an app or apps? If so, which ones and why? Comment below!

Sources:
[1] Hilarie Cash et al. Internet Addiction: A brief summary of Research and Practice. November, 2012. Current Psychiatry Review. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3480687/

[2] Weinstein A, Lejoyeux M. Internet addiction or excessive Internet use. The American Journal of Drug and Alcohol Abuse. 2010 Aug;36(5):277–83. https://www.ncbi.nlm.nih.gov/pubmed/20545603

[3] American Society of Addiction Medicine. Public Policy Statement: Definition of Addiction. 2011. http://www.asam.org/1DEFINITION_OF_ADDICTION_LONG_4-11.pdf

[4] Nir Eyal. Hooked: How to Build Habit-Forming Products. Penguin Random House, December 26, 2013.

[5] Venniro, M., Zhang, M., Caprioli, D., et al. Volitional social interaction prevents drug addiction in rat models. Nature Neuroscience. 21(11):1520-1529, 2018.

[6] Did Coca Cola have cocaine in the original recipe? https://teens.drugabuse.gov/blog/post/coca-colas-scandalous-past

[7] Phubbing, a definition. https://www.healthline.com/health/phubbing

Automatic License Plate Readers – For your Safety?
By Anonymous | November 27, 2020

Imagine a system that can track where you have been and predict where you will be going. This system is not science fiction; it is reality. A network of interconnected video cameras located on interstate highways, local streets, outside homes, and in police cars creates a universal mesh of data from your personal breadcrumbs. Interconnected, the system can find individuals in real time by capturing their license plate numbers. The technology of Automatic License Plate Readers (ALPR) has exploded in growth, with databases containing more than 15 billion license plate reads, covering the majority of the United States and present in over 50 countries. Though the technology was sold as a way to reduce crime while keeping up with shrinking law enforcement budgets, personal privacy protection has been lagging.


Image: Electronic Frontier Foundation

Historically, law enforcement would look for license plate numbers by walking and driving down streets within the city. This process naturally limited how much information was gathered. Today, officers can mount cameras in their cars that indiscriminately capture images of vehicles, not because of suspected criminal activity, but because the information might be useful in future investigations. This method is called "gridding," and it feeds the data into the ALPR system.

Any law enforcement agency that participates in ALPR will have access to real-time data of over 150 million plate reads per month at no cost. In return, most ALPR systems require access to license plate reads from cameras established in that specific jurisdiction. The use of ALPR systems goes beyond law enforcement agencies and is available to anyone, even private citizens. Local neighborhoods can pay $2,000/mo for a neighborhood camera that will scan license plates, allowing neighbors to view travel patterns and follow up on suspicious vehicles.


Image: Rekor Recognition System

The collection of massive amounts of data is not new. Google has been collecting images of everything outdoors, viewable through its Street View product. The same argument is used for the mass collection of license plate information: plates are already publicly visible, and anyone is free to record or photograph them. Companies creating ALPR systems are taking publicly available information and using the data to help catch criminals and reduce city expenses. So what is the harm?

The harm is that law enforcement could misuse ALPR systems to stalk individuals at their work, events, or political rallies, or even at their doctors’ offices. It enables law enforcement or anyone to analyze travel patterns that could reveal sensitive information, regardless of whether they are suspected of criminal activity. For instance, police can use ALPR data to determine the places people visit, which doctors they go to, and what religious services they attend.

Because these technologies can be deployed without reasonable suspicion, personal bias can intervene: police could grid low-income and minority neighborhoods more often, leading to over-policing of those areas. In 2016, a BuzzFeed investigation found that ALPRs in Port Arthur, Texas, were primarily used to find individuals with unpaid traffic citations, leading to their incarceration.


Image: Gridding – Tempe Police Department

Eric J. Richard was driving his white Buick LaCrosse on Interstate 10, when Louisiana State Police stopped him for following a truck too closely. During the roadside interrogation, the trooper asked where Richard was traveling from. “I was coming from my job right there in Vinton,” Richard replied. The officer had already looked up the travel records for Richard’s car and already knew it had crossed into Louisiana from Texas earlier in the day. Based on this “apparent lie,” the trooper extended the traffic stop by asking more questions and calling in a drug dog.

The privacy harms of ALPR systems become apparent when millions of license plate reads from innocent individuals are analyzed together, enabling uses that were never possible before. The ability to track individuals in real time and to analyze travel patterns to predict where people might be is a genuinely new capability. Coupled with weak privacy policies, inadequate training, inconsistent data retention, and unclear access, auditing, and security rules, it has already resulted in numerous harms.

References:

  • Image: Electronic Frontier Foundation. https://www.youtube.com/watch?v=ofpxX49vdXY
  • Image: Rekor Recognition System. https://www.google.com/streetview/
  • Image: Gridding – Tempe Police Department. https://www.azmirror.com/2019/07/08/arizona-police-agencies-gather-share-license-plate-data-but-few-ensure-rules-are-being-followed
  • https://www.eff.org/deeplinks/2020/02/california-auditor-releases-damning-report-about-law-enforcements-use-automated
  • https://www.nbcbayarea.com/news/local/south-bay/neighbors-install-license-plate-reader-in-los-gatos/2233057
  • https://massprivatei.blogspot.com/2019/09/massive-30-state-real-time-alpr.html?q=Rekor+Systems
  • https://massprivatei.blogspot.com/2020/02/rekor-systems-uses-video-doorbells-to.html
  • https://www.ncjrs.gov/pdffiles1/nij/grants/247283.pdf
  • https://www.technocracy.news/police-use-license-plate-readers-to-grid-neighborhoods/

 

Clearview AI: The startup that is threatening privacy
By Stefania Halac, October 16, 2020

Imagine walking down the street: a stranger points their camera at you and can immediately pull up all your pictures from across the internet. They may see your Instagram posts, your friends' posts, any picture you appear in, some of which you may have never seen before. This stranger could now ascertain where you live, where you work, where you went to school, whether you're married, who your children are… This is one of many compromising scenarios that may become part of normal life if facial recognition software is widely available.

Clearview AI, a private technology company, offers facial recognition software that can effectively identify any individual. Facial recognition technology is intrinsically controversial, so much so that certain companies like Google don't offer facial recognition APIs due to ethical concerns. And while some large tech companies like Amazon and Microsoft do sell facial recognition APIs, there is an important distinction between Clearview's offering and that of the other tech giants: Amazon and Microsoft only allow searches against a private database of pictures supplied by the customer. Clearview instead allows for recognition of individuals in the public domain — practically anyone can be recognized. What sets Clearview apart is not its technology, but rather the database it assembled of over three billion pictures scraped from the public internet and social media. Clearview AI did not obtain consent from individuals to scrape these pictures, and has been sent cease-and-desist orders by major tech companies like Twitter, Facebook and YouTube over its policy-violating practices.

In the wake of the Black Lives Matter protests earlier this year, IBM, Microsoft and Amazon updated their policies to restrict the sale of their facial recognition software to law enforcement agencies. On the other hand, Clearview AI not only sells to law enforcement and government agencies, but until May of this year was also selling to private companies, and has even been reported to have granted access to high net-worth individuals.

So what are the risks? On one hand, the algorithms behind these technologies are known to be heavily biased and to perform more poorly on certain populations, such as women and African Americans. In a recent study, Amazon's Rekognition was found to misclassify women as men 19% of the time, and darker-skinned women as men 31% of the time. If this technology were used in the criminal justice system, one implication is that dark-skinned people would be more likely to be wrongfully identified and convicted.

Another major harm is that this technology essentially provides its users the ability to find anyone. Clearview’s technology would enable surveillance at protests, AA meetings and religious gatherings. Attending any one of these events or locations would become a matter of public record. In the wrong hands, such as those of a former abusive partner or a white supremacist organization, this surveillance technology could even be life-threatening for vulnerable populations.

In response, the ACLU filed a lawsuit against Clearview AI in May for violation of the Illinois Biometric Information Privacy Act (BIPA), alleging the company illegally collected and stored data on Illinois citizens without their knowledge or consent and then sold access to its technology to law enforcement and private companies. While some cities like San Francisco and Portland have enacted facial recognition bans, there is no overarching national law protecting civilians from such blatant privacy violations. With no such law in sight, this may be the end of privacy as we know it.

References:

We’re Taking Clearview AI to Court to End its Privacy-Destroying Face Surveillance Activities

The Gender Square: A Different Way to Encode Gender
By Emma Tebbe, October 16, 2020


Image: square with two axes, the horizontal reading Masculine and Feminine and the vertical reading Low Gender Association / Agender and Strong Gender Association

As non-gender-conforming and transgender folks become more visible and normalized, the standard male / female / other gender selections we all encounter in forms and surveys become more tired and outdated. First of all, the terms "male" and "female" generally refer to sex, or someone's biological configuration, "including chromosomes, gene expression, hormone levels and function, and reproductive/sexual anatomy." Male and female are not considered the correct terms for gender identity, which "refers to socially constructed roles, behaviours, expressions and identities of girls, women, boys, men, and gender diverse people." While sex exists on a spectrum that includes intersex people, gender encompasses a wide range of identities, including agender, bigender, and genderqueer. The gender square method of encoding gender aims to encompass more of the gender spectrum than a simple male / female / other selection.


Image: triangle defining sex, gender expression, gender attribution, and gender identity

Upon encountering this square in a form or survey, the user would drag the marker to the spot on the square that most accurately represents their gender identity. This location would then be recorded as a coordinate pair, where (0, 0) is the center of the square. The entity gathering the data would then likely use those coordinates to categorize respondents. However, using continuous variables to represent gender identity allows for many methods of categorization. The square could be divided into quadrants, as pictured above, vertical halves (or thirds, or quarters), or horizontal sections. This simultaneously allows for flexibility in how to categorize gender and reproducibility of results by other entities. Other analysts would be able to reproduce results if they are given respondents’ coordinates and the categorization methodology used. Coordinate data could even be used as it was recorded, turning gender from a categorical variable into a continuous one.
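As a concrete illustration (not part of the original proposal), here is a minimal sketch of how a recorded coordinate pair could be re-binned under different categorization schemes. The axis ranges, category labels, and thresholds are assumptions made for the example.

```python
# Minimal sketch: categorizing a gender-square response stored as an (x, y) pair.
# Assumed convention: x runs from -1 (masculine) to +1 (feminine);
# y runs from -1 (low gender association / agender) to +1 (strong gender association).

def categorize_by_quadrant(x: float, y: float) -> str:
    horizontal = "masculine-leaning" if x < 0 else "feminine-leaning"
    vertical = "strongly gendered" if y >= 0 else "weakly gendered / agender"
    return f"{vertical}, {horizontal}"

def categorize_by_thirds(x: float, y: float) -> str:
    # A different analyst might ignore the vertical axis and bin only left/center/right.
    if x < -1/3:
        return "masculine-leaning"
    if x > 1/3:
        return "feminine-leaning"
    return "centered / androgynous"

respondent = (-0.4, 0.7)  # one coordinate pair recorded per respondent
print(categorize_by_quadrant(*respondent))  # strongly gendered, masculine-leaning
print(categorize_by_thirds(*respondent))    # masculine-leaning
```

Because the raw coordinates are what gets stored, any such scheme can be reapplied later, which is what makes results reproducible across analysts.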

Although this encoding of gender encompasses more dimensions, namely representing gender as a spectrum which includes agender identities, it still comes with its own problems. First of all, the gender square still does not leave room for flexible gender identities, including those whose gender is in flux or who identify as genderfluid or bigender. There are a few potential solutions for this misrepresentation on the UI side, but these create new problems with data encoding. Genderfluid folks could perhaps draw an enclosed area in which their gender generally exists, but recording this is much more complex, turning the stored value into an array of points rather than a single coordinate pair. People who identify as bigender could potentially place two markers, one for each of the genders they experience. Both this approach and an area-selection approach make the process of categorization more complex: if an individual's gender identity spans two categories, would they be labeled twice? Or would there be another category for people who fall into multiple categories?


Image: a gender spectrum defining maximum femininity as “Barbie” and maximum masculinity as “G.I. Joe”

Another issue might arise with users who haven’t questioned their gender identity along either of these axes, and may not understand the axes (particularly the Highly Gendered / Agender axis) enough to accurately use the gender square. When implemented, the gender square would likely need an explanation, definitions, and potentially suggestions. Suggestions could include examples such as “If you identify as a man and were assigned that gender at birth, you may belong in the upper left quadrant.” Another option may be to include examples such as in the somewhat problematic illustration above.

This encoding of gender would likely first be adopted by groups occupying primarily queer spaces, where concepts of masculinity, femininity, and agender identities are more prominent and considered. If used in places where data on sex and transgender status is vital information, such as at a doctor’s office, then the gender square would need to be supplemented by questions obtaining that necessary information. Otherwise, it is intended for use in spaces where a person’s sex is irrelevant information (which is most situations where gender information is requested).

Although still imperfect, representation and identification of gender along two axes represents more of the gender spectrum than a simple binary, and still allows for categorization, which is necessary for data processing and analytics. With potential weaknesses in misunderstanding and inflexibility, it finds its strength in allowing individuals to more accurately and easily represent their own identities.

References:
https://cihr-irsc.gc.ca/e/48642.html
https://www.glsen.org/activity/gender-terminology
https://journals.sagepub.com/doi/full/10.1177/2053951720933286
Valentine, David. The Categories Themselves. GLQ: A Journal of Lesbian and Gay Studies, Volume 10, Number 2, 2004, pp. 215-220
https://www.spectator.co.uk/article/don-t-tell-the-parents for image only

 

When Algorithms Are Too Accurate
By Jill Cheney, October 16, 2020

An annual rite of passage every spring for innumerable students is college entrance exams. Regardless of their name, the end result is the same: to inform admission decisions. When the Covid-19 pandemic swept the globe in 2020, this milestone changed overnight. Examinations were cancelled, leaving students and universities with no traditional way to evaluate admission. Alternative solutions emerged with varying degrees of validity.

In England, the solution adopted to replace the cancelled A-level exams was a computer algorithm that predicted student performance. In the spirit of a parsimonious model, two parameters were used: the student's current grades and the historical test record of the attending school. The outcome elicited nationwide ire by highlighting inherent testing realities.

Overall, the predicted exam scores were higher: more students did better than in any previous year, with 28% getting top scores in England, Wales and Northern Ireland. However, incorporating the school's previous test performance into the algorithm created a self-fulfilling reality. Students at historically high-performing schools had inflated scores; conversely, students from lower-performing schools had deflated ones. Immediate cries of AI bias erupted. However, the data wasn't wrong – the algorithm simply highlighted the inherent biases and disparities in the data being modeled.
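To see how such a model can swamp individual performance, here is a minimal sketch. It is not the actual Ofqual algorithm, which was considerably more involved; the grade scale and the weight placed on school history are assumptions chosen only to illustrate the effect described above.

```python
# Illustrative only: blend a student's own grade with the school's historical average.
# Both values are on an assumed 0-100 scale; 'weight' is the share given to school history.

def predicted_grade(student_grade: float, school_history_mean: float, weight: float = 0.6) -> float:
    return weight * school_history_mean + (1 - weight) * student_grade

strong_student = 85  # the same individual performance at two different schools
print(predicted_grade(strong_student, school_history_mean=80))  # high-scoring school -> 82.0
print(predicted_grade(strong_student, school_history_mean=55))  # low-scoring school  -> 67.0
```

With most of the weight on the school's record, two identical students receive very different predictions, which is exactly the self-fulfilling pattern described here.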

Reference points did exist for the predicted exam scores. One came from teachers, who provide predictions of student performance. The other came from students' scores on previous 'mock' exams. Around 40 percent of students received a predicted score that was one step lower than their teachers' predictions. Not surprisingly, the largest downturn in predictions occurred among poorer students. Many others received predicted scores below their 'mock' exam scores. Mock exam results support initial university acceptance; however, they must be followed up with commensurate official exam scores. For many students, the disparity between their predicted and 'mock' exam scores jeopardized their university admission.

Attempting to rectify the disparities came with its own challenges. Opting to use teacher-predicted scores required accepting that not all teachers provided meticulous predictions. Based on teacher predictions alone, 38% of predicted scores would have been at the highest levels: A*s and As. Other alternatives included permitting students to retake the exam in the fall or allowing the 'mock' exam scores to stand in if they were higher than the predicted ones. No easy answers existed when attempting to craft an equitable national response.

As designed, the computer model weighted the past performance of a school over individual student performance. Individual grades could not offset the influence of a school's testing record. It also discounted more qualitative variables, such as test-taking skills. In the face of a computer-generated scoring model, a feeling of powerlessness emerged. Students no longer felt they had control over their future and schooling opportunities.

Ultimately, the predictive model simply exposed underlying societal realities and quantified how wide the gap actually is. In the absence of the pandemic, testing would have continued under the status quo. Affluent schools would have received higher scores on average than fiscally limited schools. Many students from disadvantaged schools would still have individually succeeded and gained university admission. The public outcry this predictive algorithm generated underscores how the guise of traditional test conditions assuages our concerns about the realities of standardized testing.

Sources:
https://www.theverge.com/2020/8/17/21372045/uk-a-level-results-algorithm-biased-coronavirus-covid-19-pandemic-university-applications

https://www.bbc.com/news/education-53764313

Data as Taxation
By Anonymous, October 16, 2020

Data collection is often analogized to an economic transaction. We frame our interactions with tech companies as an exchange: our data serves as payment for services, which in turn allows for the continued provision of those services.

Metaphors like these can be useful in that they allow us to port developed intuitions from a well-trodden domain (transactions) to help us navigate less familiar waters (data). In this spirit, I wanted to further develop this "data collection = economic transaction" metaphor and explore how our perceptions of data collection change with a slight tweak: "data collection = taxation".


In the context of data collection, the following quote from Supreme Court Justice Oliver Wendell Holmes might give one pause. Is it applicable, or entirely irrelevant?

Here’s what I mean: with taxation, government bodies mandate that citizens contribute a certain amount of resources to fund public services. The same goes for data – while Google, Facebook, and Amazon are not governments, they also create and maintain enormous ecosystems that facilitate otherwise impossible interactions. Governments allow for coordination around national security, education, and supply chains, and Big Tech provides the digital analogues. Taxation and ad revenue allow for the perpetual creation of this value. Both can embody some (deeply imperfect) notion of "consent of the governed" through voter and consumer choice, although neither provides an easy way to "opt out."

Is this metaphor perfect? Not at all, but there is still value in making the comparison. We can recycle centuries of bickering over fairness in taxation.

For instance, one might ask “when is taxation / data collection exploitative?” On one end, some maintain that “all taxation is theft,” a process by which private property is coercively stripped. Some may feel a similar sense of violation as their personal information is harvested – for them, perhaps the amorphous concept of “data” latches onto the familiar notion of “private property,” which might in turn suggest the need for some kind of remuneration.

At the other extreme, some argue that taxation cannot be the theft of private property, because the property was never private to begin with. Governments create the institutions and infrastructure that allow the concept of "ownership" to even exist, and thus all property is on loan. One privacy analogue could be that the generation of data is impossible and worthless without the scaffolding of Big Tech, and thus users have a similarly tenuous claim on their digital trails.

The philosophy of just taxation has provided me an off-the-shelf frame by which to parse a less familiar space. Had I stayed with the “data collection = economic transaction” metaphor, I would have never thought about data from this angle. As is often the case, a different metaphor illuminates different dimensions of the issue.

Insights can flow the other way as well. For example, in data circles there is a developing sophistication around what it means to be an “informed consumer.” It is recognized by many that merely checking the “I agree” box does not constitute a philosophically meaningful notion of consent, as the quantity and complexity of relevant information is too much to expect from any one consumer. Policies and discussions around the “right to be forgotten”, user control of data, or the right to certain types of transparency acknowledge the moral tensions inherent in the space.

These discussions are directly relevant to justifications often given for a government’s right to tax, like the “social contract” or the “consent of the governed.” Both often have some notion of informed consent, but this sits on similarly shaky ground. How many voters know how their tax dollars are being spent? While government budgets are publicly available, how many are willing to sift through reams of legalese? How many voters can tell you what military spending is within an even order of magnitude? Probably as many as who know exactly how their data is packaged and sold. The data world and its critics have much to contribute to the question of how to promote informed decision-making in a world of increasing complexity.


Linguists George Lakoff and Mark Johnson suggest that metaphors are central to our cognitive processes.

Of course, all of these comparisons are deeply imperfect, and require much more space to elaborate. My main interest in writing this was exploring how this analogical shift led to different questions and frames. The metaphors we use have a deep impact on our ability to think through novel concepts, particularly when navigating the abstract. They shape the questions we ask, the connections we make, and even the conversations we can have. To the extent that that’s true, metaphors can profoundly reroute society’s direction on issues of privacy, consent, autonomy, and property, and are thus well-worth exploring.

When an Algorithm Replaces Cash Bail
Allison Godfrey
October 9th, 2020

In an effort to make the criminal justice system more equitable, California Senate Bill 10 replaced cash bail with a predictive algorithm that produces a risk assessment score to determine whether the accused must remain in jail before trial. The risk assessment places suspects into low, medium, or high risk categories. Low risk individuals are generally released before trial, while high risk individuals remain in jail. For medium risk individuals, the judge has much more discretion over their placement before trial and conditions of release. The bill also releases all suspects charged with a misdemeanor without requiring a risk assessment. It was signed into law in 2018 and scheduled to take effect in October 2019. California Proposition 25 asked voters whether to keep or repeal the bill, with opponents arguing that the algorithm biases the system even more than cash bail. People often see data and algorithms as purely objective, since they are based on numbers and formulas. However, these are often "black box" models where we have no way of knowing exactly how the algorithm arrived at its output. If we cannot follow the model's logic, we have no way of identifying and correcting its bias.


Image from this article

By their nature, predictive algorithms learn from data in much the same way that humans learn from their life's inputs (experiences, conversations, schooling, family, etc.). Our life experiences make us inherently biased, since we each hold a unique perspective shaped purely by that set of experiences. Similarly, algorithms learn from the data we feed into them and spit out the perspective that the data creates: an inherently biased perspective. Say, for example, we feed a predictive model data about 1,000 people with pending trials. While the Senate Bill is not clear on the exact inputs to the model, suppose we feed it the following attributes of each person: age, gender, charge, past record, income, zip code, and education level. We exclude the person's race from the model in an effort to eliminate racial bias. But have we really eliminated racial bias?


Image from this article

Let’s compare two people: Fred and Marc. Fred and Marc have the exact same charge, identify as the same gender, have similar incomes, and both have bachelor's degrees, but they live in different towns. The model learns from past data that people from Fred's zip code are generally more likely to commit another crime than people from Marc's zip code. Thus, Fred receives a higher risk score than Marc and awaits his trial in jail, while Marc is allowed to go home. Due to the history and continuation of systemic racism in this country, neighborhoods are often racially and economically segregated, so people from one zip code may be much more likely to be people of color and lower income than those from a neighboring town. Thus, by including an attribute like zip code, we introduce economic and racial bias into the model even though race is never explicitly included. While the original goal of Senate Bill 10 was to eliminate wealth as a determining factor in bail decisions, it inadvertently reintroduces wealth as a predictor through the economic bias woven into the algorithm. Instead of balancing the scale in the criminal justice system, the algorithm tips it even further.
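The proxy effect is easy to demonstrate with a toy calculation. The data below is invented and the scoring rule is far simpler than anything SB 10 would use; the point is only that a score built from zip-code history reproduces neighborhood-level disparities even though race never appears as an input.

```python
from collections import defaultdict

# Invented historical records: (zip_code, reoffended). Because neighborhoods are
# segregated, zip code correlates with race even though race is never stored.
history = [("zip_A", 1), ("zip_A", 1), ("zip_A", 0), ("zip_A", 1),
           ("zip_B", 0), ("zip_B", 0), ("zip_B", 1), ("zip_B", 0)]

counts = defaultdict(lambda: [0, 0])  # zip -> [reoffenses, total]
for zip_code, reoffended in history:
    counts[zip_code][0] += reoffended
    counts[zip_code][1] += 1

def risk_score(zip_code: str) -> float:
    """Fraction of past defendants from this zip code who reoffended."""
    reoffenses, total = counts[zip_code]
    return reoffenses / total

# Fred and Marc are identical on every other attribute:
print("Fred (zip_A):", risk_score("zip_A"))  # 0.75 -> likely detained
print("Marc (zip_B):", risk_score("zip_B"))  # 0.25 -> likely released
```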


Image from this article

Additionally, the purpose of cash bail is to ensure the accused will show up to their trial. While it is true that the system of cash bail can be economically inequitable, the algorithm does not seem to be addressing the primary purpose of bail. There is no part of Senate Bill 10 that helps ensure that the accused will be present at their trial.

Lastly, Senate Bill 10 allows judicial discretion in any case, particularly medium risk cases. Human bias in the courtroom has historically played a large role in the inequality of today's justice system. The discretion judges have to overrule the risk assessment score could reintroduce the human bias the model partly seeks to avoid. It has been shown that judges exercise this power more often to place someone in jail than to release them. In the time of Covid-19, going to jail carries an increased risk of infection. Given this heightened risk, our decision system, whether algorithmic, monetary, and/or human centered, should err more on the side of release, not detainment.

The fundamental question is one that neither cash bail nor algorithms can answer:
How do we eliminate wealth as a determining factor in the justice system while also not introducing other biases and thus perpetuating systemic racism in the courtroom?

Authentication and the State
By Julie Nguyen

Introduction

For historical and cultural reasons, American society is one of the very few democracies in the world with no universal authentication system at the national level. Surprisingly, Americans do not extend to their government the trust they extend to corporations, because they consider such an identifier system a serious violation of privacy and a major opening for Big Brother government. I will argue that it would be more beneficial for the US to create a universal authentication system to replace the patchwork of de facto paper documents currently in use at the state level.

Though controversial and difficult to implement, a national-level authentication system would bring substantial benefits.

It is not reasonable to argue that a national-level authentication system is too complex to create. It is hard, but other countries have shown it is possible.

The debate over a national-level authentication system is not new. In Europe, national census schemes inspired a great deal of resistance and focused attention on privacy issues. One of the earliest examples was the protest against a census in the Netherlands in 1971. Likewise, nobody foresaw the storms of protest over the censuses in Germany in 1983 and 1987. In both countries, memories of World War II, and of how governments had terrorized the Dutch and German people during and after the war, help explain these reactions.

Similarly, proposals for national-level identity cards produced the same reactions in numerous countries. Today, however, almost all modern societies have developed systems to authenticate their citizens. Those systems have evolved with the advent of new technologies, in particular biometric cards and e-cards: the pocket-sized "ID card" has become a biometric card in almost all European countries and an e-card in Estonia. Citizens of many countries, including democracies, are required by law to carry ID cards with them at all times. Surprisingly, these cards are still viewed by Americans as a major tool of oppressive governments, and establishing a national-level ID card is generally not considered fit for discussion.

In some countries where people shared the American view, governments have learned hard lessons. As a result, contemporary national identification policies tend to be introduced more gradually, under labels other than an ID system per se. Thus, the Australian policy has been termed an Access Card since its introduction in 2006. The Canadian government now speaks of a national Identity Management policy. More recently, the Indian government has implemented Aadhaar, the biggest biometric identification scheme in the world, containing the personal details, fingerprints and iris patterns of 1.2 billion people – nine out of ten Indians.

It is time for the federal government, taking lessons from other countries, to create a national-level authentication system in the United States, given the benefits such a system would bring to Americans.

Contrary to opponents' arguments about privacy and discrimination, the advantages of a national authentication system would outweigh its disadvantages. I will use two main arguments to justify this claim. First and foremost, the most significant justification for identifying citizens is to ensure the public's safety and well-being. Even in Europe, where the right to privacy is extremely important, Europeans have made a trade-off in favour of their safety. Documents captured from Al Qaeda and ISIS show that terrorists are aware that anonymity is a valuable tool for penetrating an open society. Domestic terrorists would also be easier to catch if the country had a universal authentication system. The Unabomber, for instance, remained one of the most notorious terrorists in the United States in part because he was extremely hard to track: he had almost no identity footprint in society.

Second, opponents of a national authentication system argue that traditional ID cards or a national authentication system would be a source of discrimination. In fact, universal identifiers could serve to reduce discrimination in some areas. All job applicants would be identity-checked to prevent fake identities, not only immigrants or those who look or sound "foreign." Take the example of E-Verify, a voluntary online system operated by the U.S. Department of Homeland Security (DHS) in partnership with the Social Security Administration (SSA). It is used to verify an employee's eligibility to work legally in the United States. E-Verify checks workers' Form I-9 information for authenticity and work authorization status against SSA and Citizenship and Immigration Services (CIS) databases. Today, more than 20 states have adopted laws requiring employers to use the federal government's E-Verify program. Because E-Verify entails additional administrative costs for potential employers, it is a driver of discrimination against immigrant workers in the United States. A national "E-Verify" system covering all US residents would remove this source of discrimination.

The lack of a nationwide authentication system results in substantial social costs.

Identity theft has become a serious problem in the United States. Though the federal government passed the Identity Theft and Assumption Deterrence Act in 1998 to crack down on the problem and make it a federal felony, the cost of identity theft has continued to increase significantly[1]. Identity thieves have stolen over $107 billion in the US over the past six years. Identity theft is particularly frightening because there is no completely effective way for most people to protect themselves. Rich and powerful people can also be caught in the trap. For example, Abraham Abdallah, a busboy in New York, succeeded in stealing millions of dollars from famous people's bank accounts using the Social Security numbers, home addresses and birthdays of Warren Buffett, Oprah Winfrey and Steven Spielberg, among others.

People usually think of identity theft as someone using another person's identity to steal money, mostly via stolen credit cards or, in more elaborate schemes, like that of the above-mentioned New Yorker. But the reality is much more complex. In his book The Limits of Privacy, Amitai Etzioni lists several categories of crime related to identity theft:

    • Criminal fugitive
    • Child abuse and sex offenses
    • Income tax fraud and welfare fraud
    • Nonpayment of child support
    • Illegal immigration

Additionally, the highest hidden cost to American society of lacking a universal identity system is, in my opinion, the vulnerability of its democracy and the inefficient functioning of society as a whole. In most democracies, a universal authentication system permits citizens to interact with government, reducing transaction costs while increasing trust in government. Moreover, it is a step toward electronic voting in countries where, as in the United States, the turnout rate has become a critical concern. Without a universal and secure authentication system, any reform of elections in this country would be very difficult to put in place.

Overall, the tangible and intangible costs of not having a national authentication system are very high.

Conclusion

The United States is one of the very few democracies without a standardized universal identification system, and the social cost is significant. Today's technologies can make it possible to protect such a system from abuse. Society is not a zero-sum game. Opponents of this kind of authentication system are wrong, and their arguments no longer hold. "Information does not kill people; people kill people," as Dennis Bailey wrote in The Open Society Paradox. It is time to create a single, secure and standardized national-level ID to replace the patchwork of de facto paper documents currently in use in the United States. An incremental implementation of an Estonian-style system, with a possible opt-out provision as in the Canadian approach, could be an appropriate answer to the opponents of a national authentication system in the United States.

Bibliography

1/ The Privacy Advocates – Colin J. Bennett, The MIT Press, 2008.

2/ The Open Society Paradox – Dennis Bailey, Brassey’s Inc., 2004.

3/ The Limits of Privacy – Amitai Etzioni, Basic Books, 1999.

4/ E-Estonia: The power and potential of digital identity – Joyce Shen, 2016. https://blogs.thomsonreuters.com/answerson/e-estonia-power-potential-digital-identity/

5/ E-Authentication Best Practices for Government – Keir Breitenfeld, 2011. http://www.govtech.com/pcio/articles/E-Authentification-Best-Practices-for-Government.html

6/ My life under Estonia’s digital government – Charles Brett, 2015. https://www.theregister.co.uk/2015/06/02/estonia/

7/ Hello Aadhaar, Goodbye Privacy – Jean Drèze, 2017. https://thewire.in/government/hello-aadhaar-goodbye-privacy

Data-driven policy making in the Era of ‘Truth Decay’
By Silvia Miramontes-Lizarraga

Advances in digital technology have made it possible to collect, store and analyze large amounts of data containing information on various subjects of interest, otherwise known as Big Data. One effect has been the rise of data-driven decision making in business, technology, and sports, as these methods have been shown to boost innovation, productivity and economic growth. But if the availability of data has been increasing so significantly, why do we lack data-driven methods in policy making to target issues of social value?

Background on Policy-making:

Society expects the government to deliver solutions to social issues; its challenge is thus to improve the quality of life of its constituents. Public policy is a goal-oriented course of action which encompasses a series of steps: 1) Recognition of the Problem, 2) Agenda Setting, 3) Policy Formulation, 4) Adoption of Policy, 5) Policy Implementation, and 6) Policy Evaluation. This type of decision making involves numerous participants. Consequently, the successful implementation of these policies cannot be ideologically driven. The process requires government officials to be transparent, accountable, and effective.

So how could these methods help?

The lack of data-driven methods is conspicuous in the many problems of our educational system. For example, government officials could use data to efficiently locate the school districts most in need of resources. Similarly, when addressing healthcare, they could compare plans to determine the best procedures and the most essential expenditures in the middle of a global pandemic. By successfully adopting these technologies, our officials could begin closing 'data gaps that have long impeded effective policy making'. To achieve this, however, government officials and their constituents must develop an awareness and appreciation of concrete, unbiased data.

Why ‘Truth Decay’ complicates things

Although there is real potential in implementing data-driven methods to better inform policy makers, we have stumbled upon a hurdle: the ongoing rise of 'Truth Decay', a phenomenon described by a RAND initiative that aims to restore the role of facts and analysis in public life.

In recent years, we have heard a great deal about misinformation and fake news, but most importantly, we have reached a point where people no longer agree on basic facts. And if we do not agree on basic facts, how can we possibly address social issues such as education, healthcare, and the economy?

Whether we have heard it from a friend in the middle of a Facebook political comment war, or from a random conversation on the street, we have come to realize that people tend to disagree on basic objective facts. More often than not, we get very long texts from friends filling us in on their latest social media comment debacle with ‘someone’ who does not seem to ‘agree’ with the presented facts – the facts are drowned out by their opinions. The line between opinion and facts fades to the point where facts are no longer disputed, but instead rejected or simply ignored.

So what now? How do we actively fight this decay to keep the validity of facts afloat and demystify quantitative methods to influence our representatives, and possibly transform them into better informed policy makers?

First Steps

Whenever we encounter someone with different political views, say from an opposing political party, we can try to convince them to look at the issue from another perspective. Perhaps we can point out the disparity between facts and beliefs in an understated way.

We can also actively avoid tribalization. Rather than secluding ourselves from groups with opposing political views, we can try to build understanding and empathy.

Additionally, we can change our attitude toward ourselves and others. We must acknowledge that sometimes we need to change our beliefs in order to grow. This means that making our beliefs part of our identity is not the optimal way to fight the ongoing 'Truth Decay'. It is important to remember that our beliefs may shift over time, and thus we are not defined by them.

Lastly, we can embrace a new attitude: call yourself a truth seeker, try your best to remain impartial, and be curious. Keeping your mind open might allow you to learn more about yourself and others.

Sources:
RAND Study