The Police State Is Monitoring Your Social Media Activity and Is Encouraged To Track and Arrest You For Exercising Your First Amendment Rights
By Anonymous | June 5, 2020

In light of the nationwide protests following outrage over the deaths of George Floyd and several others at the hands of police officers this past week, the nation is as polarized as ever. Millions of citizens are supporting grassroots organizations that aim to highlight systemic injustice and advocate for police reform, while some police departments, city governments, and other political actors are pushing back against the gatherings.

The president of the United States himself has verbalized his position against the demonstrations occurring across the country, antagonizing protestors and encouraging police to become more forceful in their suppression of citizens' First Amendment rights. Just weeks after the President commended protestors for opposing the nationwide lockdown in response to Covid-19, his rhetoric has quickly shifted to condemnation of Black Lives Matter protestors. Audio released from the President's call with governors regarding how to handle the demonstrations reveals that Trump said, "You've got to arrest people, you have to track people, you have to put them in jail for 10 years and you'll never see this stuff again." Trump's overt endorsement of the surveillance and incarceration of citizens is alarming and provides a necessary juncture for a discussion about the ethics of data monitoring by law enforcement. When the President encourages police across the country to track and persecute civilians, especially those in ideological opposition to the Administration and the police state, many Americans are at risk.

Law enforcement can and has used data from mobile devices to investigate citizens in the past. From companies like Geofeedia and their social media monitoring, to Google's Sensorvault database of location histories, to companies like Securus, which used geolocation data from cell phone carriers to track citizens, Americans face ubiquitous threats to their privacy. These three instances of data collection and use by law enforcement elucidate the argument for greater corporate responsibility and the urgent need for legislative reform. In this post, the focus will be on Geofeedia and the risks that this type of data collection and monitoring brings to light.


In 2016, the ACLU uncovered that a company called Geofeedia had been providing personal data about protestors to police departments. Geofeedia aggregated and sold data accessed through Facebook, Twitter, and Instagram to multiple police departments. This data was used to track the movements of protestors, which led to the identification, interception, and arrest of several people. This type of surveillance occurred in notable hotspots of civil unrest like Ferguson in 2014 (following the killing of Michael Brown) and Baltimore in 2015 (following the killing of Freddie Gray), but the ACLU of California published public records showing that police departments across the state were rapidly acquiring social media monitoring software to monitor activists. There has been very little public debate about the ethics of this technology or oversight of the legality of its use. Only after the ACLU reviewed public records and released its findings did the social media platforms suspend Geofeedia's data access. Still, many civil liberties activists have voiced reasonable concerns about the lack of foresight and responsibility shown by these social media companies. Nicole Ozer, technology and civil liberties policy director for the ACLU of California, made the point that "the ACLU shouldn't have to tell Facebook or Twitter what their own developers are doing. The companies need to enact strong public policies and robust auditing procedures to ensure their platforms aren't being used for discriminatory surveillance."
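To make concrete how this style of location-based monitoring works, here is a minimal sketch of geofencing geotagged posts around a protest site. This is purely illustrative: the posts, coordinates, and radius are invented, and this is not Geofeedia's actual code.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Invented geotagged posts, as they might come back from a social media API.
posts = [
    {"user": "a", "lat": 39.2902, "lon": -76.6120},  # near the monitored site
    {"user": "b", "lat": 39.4143, "lon": -77.4105},  # tens of kilometers away
]

protest_site = (39.2904, -76.6122)  # location being monitored
RADIUS_KM = 1.0

nearby = [p for p in posts
          if haversine_km(p["lat"], p["lon"], *protest_site) <= RADIUS_KM]
print([p["user"] for p in nearby])  # ['a']: users whose posts place them on site
```

Aggregating this simple filter over millions of public posts is, in essence, the capability that was being sold to police departments.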

Ozer's point is especially poignant when considering that Geofeedia is just the tip of the iceberg. Despite the public criticism of Geofeedia by the social media companies involved, the utilization of social media profiling by law enforcement has not diminished. There are myriad other companies performing similar services that were not exposed in the ACLU report, and Geofeedia's own emails detailed that its contract with Facebook allowed for a gradual reactivation of data access.

With a federal administration that is visibly callous toward and ignorant of the Constitution, it is as important as ever for companies and local legislators to fight to protect the data rights of citizens and ensure that technology companies are acting in the best interests of the people. Individuals who show their solidarity with victims of police brutality and systemic racism could be subjected to unconstitutional surveillance and oppression because of the content of their speech on social media or their presence at public assemblies. If the police use technological tools to continually monitor the movement of citizens, certain individuals will essentially be made political prisoners of a country under martial law that is quickly demonstrating its totalitarian nature.


Did you know you are helping Apple track the effects of COVID-19?
By Henry Bazakas | June 5, 2020

At the onset of the COVID-19 pandemic, amid the deluge of information, misinformation, and opinion, came story after story (after story, etc.) about the necessity and inevitability of widespread behavioral change. Some predicted that people's travel and movement decisions would be affected, with people limiting travel to the essentials. Here are a few quotes from those articles:

  • “People are going to start asking, ‘Do we have to meet in person?’”
  • “Digital commerce has also seen a boost as new consumers migrate online for grocery shopping – a rise that is likely to be sustained post-outbreak.”
  • “The experience may help us change our lifestyles for the better”
  • “Coronavirus is a once in a lifetime chance to reshape how we travel”

While the long-term validity of these predictions is unknown, it has been almost three months since Donald Trump declared a national emergency, so we are at a point where we can start to evaluate them. Apple Maps search trends can help us do so.

Data

One way to make inferences about people's behavior is through the anonymized data Apple collects on requests made via Apple Maps. This data is made available on Apple's website, with a link to download it yourself. It covers over 2,000 counties or regions drawn from 47 countries, and uses mapping requests as a proxy for the level of mobility throughout society. This data can be of use in a social science capacity as a means of understanding the effects of COVID-19, but its collection raises ethical questions.
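As a rough illustration, here is how one might load and reshape that file with pandas. This is a sketch: the file name and column layout reflect the dataset as published in mid-2020, with one column per date and values indexed to a baseline of 100, and may have changed since.

```python
import pandas as pd

# Apple published a single wide CSV: one row per (region, transportation type),
# one column per date, with values indexed to a baseline of 100 (January 13, 2020).
df = pd.read_csv("applemobilitytrends.csv")  # file name as of mid-2020

# Reshape from wide to long for easier analysis.
id_cols = [c for c in df.columns if not c.startswith("20")]  # non-date columns
long = df.melt(id_vars=id_cols, var_name="date", value_name="requests_vs_baseline")
long["date"] = pd.to_datetime(long["date"])

print(long[["region", "transportation_type", "date", "requests_vs_baseline"]].head())
```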

Understanding Societal Change

This information can be a valuable tool for researching human behavior. It can be interpreted as a natural experiment of sorts: Apple can compare current to historical data to see just how big an effect COVID-19 is having and how that effect is changing over time. This can help researchers appraise how closely people are following social distancing measures, and it is a possible indicator of COVID-19 case trends at a county level.

On Apple’s website one can look at charts for any of the regions represented in the dataset, even breaking down by mode of transportation for some areas. I’ve included charts above for a variety of western countries, as well as for the San Francisco Bay Area, New York City, and Salt Lake City.

In most places Apple Maps requests dropped by over 60% during early social distancing periods, since which time they have steadily risen. Some nations, including Germany and the United States, are even above their "baseline" pre-COVID-19 values. This trend shows at the regional level as well, although the extent of the bounceback varies. The extent of recovery also varies by mode of transportation, with walking and driving recovering much more strongly than transit. This aversion to mass transportation is akin to what has happened in the airline industry, whose recovery has also been very slow thus far. It remains to be seen whether people will fully revert to 2019 transportation levels, but walking and driving habits do appear to be showing a meaningful return.
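Continuing the sketch above, one could quantify the trough and the recovery by transportation type (again assuming the mid-2020 column layout):

```python
# Percent change relative to the pre-COVID-19 baseline of 100.
long["pct_change"] = long["requests_vs_baseline"] - 100

us = long[long["region"] == "United States"].sort_values("date")

# Trough during early social distancing vs. the latest reading, per mode.
summary = (us.groupby("transportation_type")["pct_change"]
             .agg(trough="min", latest="last"))
print(summary)
# A trough near -60 with walking/driving back near (or above) zero while
# transit stays well below zero matches the pattern described above.
```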

Ethical Conflict

Some would argue that this level of data collection is excessive or an invasion of user privacy. "The information collected will not personally identify you", Apple assures on its website, but can this be guaranteed, and does it give Apple the right to collect such private data? Information about where people are going is certainly data that some would be unwilling to knowingly divulge if given the opportunity to opt out. Apple does make efforts to prevent its users from being identified from this data, by not tying it to ID variables or accounts and by aggregating it at the county level. However, individual data has been collected without the informed consent of many users. The possibility of that data being exposed is never zero, and even if it were, that would not give companies the right to collect it.

Maybe you read this and do not think anything of it. In today's world it would be foolish not to assume some level of surveillance. It is up to you to decide whether the joys of Apple's product line warrant this surrender of privacy. However, I believe that that decision should be a conscious one rather than one made via an unread privacy policy.

Works Cited:

https://www.apple.com/covid19/mobility
https://www.airlines.org/dataset/impact-of-covid19-data-updates/#
https://www.accenture.com/us-en/insights/consumer-goods-services/coronavirus-consumer-behavior-research
https://www.discovermagazine.com/health/how-the-covid-19-pandemic-will-change-the-way-we-live
https://www.sciencemag.org/news/2020/04/crushing-coronavirus-means-breaking-habits-lifetime-behavior-scientists-have-some-tips
https://singularityhub.com/2020/04/16/coronavirus-what-are-the-chances-well-change-our-behavior-in-the-aftermath/
https://theconversation.com/coronavirus-is-a-once-in-a-lifetime-chance-to-reshape-how-we-travel-134764
https://www.thinkglobalhealth.org/article/updated-timeline-coronavirus

How census data leads to the underrepresentation of minority voters
By Anonymous | June 5, 2020

As of the writing of this piece (June 2020), the United States (U.S.) is in turmoil over systemic police violence against black Americans.

Every incident of police violence against black Americans follows a cyclical pattern: the initial killing occurs, images of the killing are spread across the media, the nation becomes outraged, and protests occur. Companies and politicians offer some semblance of public support to the protestors and to the black community. Eventually the protests stop, new stories crop up, and America moves on. Despite this pattern, the relevant policies that lead to the initial killing rarely change meaningfully.

This blog post examines through a data science lens some of the underlying reasons for low voter turnout in populations that are interested in changing these policies. I look at the collection and usage of census data and the resulting impact on voter turnout, congressional representation, and policy formation.

The U.S. census is a decennial survey of every resident of the U.S., with the most recent occurring in 2020. The census is the closest there is to a comprehensive view of who lives in the U.S. Residents receive a letter that generically describes how the data collected from the census will be used. A Pew Research Center study found that this description does not do enough: most Americans do not know what questions are on the census, let alone how their responses are used.


Figure 1: The census letter received by U.S. citizens

This is problematic due to the far-reaching consequences that the census has on citizens' representation in government. One use of census data is to redraw voting districts. This process has largely become partisan gerrymandering – redrawing districts such that political candidates from your party are more likely to win an election. The use of census data by political parties to draw voting districts is one reason why policies do not change after each cycle of police shootings – redistricting puts those who are more likely to vote for change in districts where their votes are less impactful.


Figure 2: Gerrymandering visualized: the same map with 4 different electoral outcomes
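A toy example makes the figure's point concrete. The sketch below (invented precinct counts, not real census data) districts the same 60/40 electorate two ways: one plan lets the majority party sweep every seat, while the other "packs" its voters into a single district so the minority party wins most seats.

```python
from collections import Counter

# 15 precincts, 100 voters each; party A wins 9 precincts, party B wins 6 (a 60/40 split).
precinct_winners = ["A"] * 9 + ["B"] * 6

def seat_count(winners, district_of):
    """Seats won per party, given an assignment of each precinct to a district."""
    seats = Counter()
    for d in set(district_of):
        votes = Counter(w for w, dd in zip(winners, district_of) if dd == d)
        seats[votes.most_common(1)[0][0]] += 1
    return dict(seats)

# Plan 1: A's support is spread evenly across all three districts.
plan1 = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 1, 2, 0, 1, 2]
# Plan 2: A is "packed" into district 0, leaving B narrow wins elsewhere.
plan2 = [0, 0, 0, 0, 0, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2]

print(seat_count(precinct_winners, plan1))  # {'A': 3}: A sweeps every seat
print(seat_count(precinct_winners, plan2))  # A: 1, B: 2: B wins a majority of seats
```

Same voters, same votes, opposite outcomes: the map, not the electorate, decides.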

In addition to gerrymandering, politicians use census data collection to lower the representation of certain populations in government. The Trump administration recently attempted to collect data via the 2020 census on citizenship status. Many worried that by collecting this data, U.S. residents who are not U.S. citizens (particularly those who entered the U.S. illegally) would not respond to the census. Consequently, as many as 3 million U.S. residents (around 1% of the overall U.S. population) would not be counted. This lowers both congressional representation and the allocation of Electoral College votes for the areas where those residents live, effectively ensuring that every resident of those areas has less of a say in national government.

The collection and use of census data is structurally comparable to the experiments detailed in the Belmont Report. As noted on the Census website itself, the data is used in research settings across the world by people from "all walks of life". There is a lack of respect for persons here, as the Census Bureau tells citizens that responses to the census are "required by law" but cannot tell them how the information will be used. Secondly, politicians' use of census data violates the Belmont Report tenets of both beneficence and justice, as the resulting voting districts and congressional apportionment generally harm populations that want change.

Let's turn to Nissenbaum's contextual approach to privacy. Although this data may be considered public-use, the census does not clearly explain to its subjects how lawmakers and researchers will use the data. Through the Nissenbaum lens, the reader can more easily see the disconnect between traditional privacy frameworks and the way politicians use census data for voter disenfranchisement.

Voter suppression and disenfranchisement are widespread among black voters and other populations that wish to change the policies behind the disproportionately high rate of black men killed by police. I hope that this post can shine some light on one possible reason for the lack of policy change in this area. As an initial step, I recommend that individuals vote for officials who have the power to, and will take responsibility for, ending gerrymandering and other malicious uses of census data. I also hope that this blog post can start a longer and more thorough conversation on how Americans can collectively improve our data collection and usage practices to better protect voter rights, and in turn protect our voting populations.

References:

  • Cohn, D., Brown, A., & Keeter, S. (2020, February 20). Most Adults Aware of 2020 Census and Ready to Respond, but Don't Know Key Details. Retrieved from Pew Research Center: https://www.pewsocialtrends.org/2020/02/20/most-adults-aware-of-2020-census-and-ready-to-respond-but-dont-know-key-details/
  • Nissenbaum, H. F. (Fall 2011). A Contextual Approach to Privacy Online. Daedalus, 32 – 48.
  • NPR Code Switch. (2020, May 31). Code Switch: A Decade Of Watching Black People Die. Retrieved from NPR: National Public Radio: https://www.npr.org/2020/05/29/865261916/a-decade-of-watching-black-people-die
  • Ray, R., & Whitlock, M. (2019, September 12). Setting the record straight on Black voter turnout. Retrieved from Brookings: https://www.brookings.edu/blog/up-front/2019/09/12/setting-the-record-straight-on-black-voter-turnout/
  • Ryan, K. J., Brady, J. V., Cooke, R. E., Height, D. I., Jonsen, A. R., King, P., . . . Turtle, R. H. (1979). The Belmont Report. U.S. Department of Health & Human Services.
  • The Census Bureau. (n.d.). Importance of the Data. Retrieved from United States Census 2020: https://2020census.gov/en/census-data.html
  • The Census Bureau. (n.d.). What To Look For in the Mail. Retrieved from United States Census 2020: https://2020census.gov/en/mailings.html
  • Topaz, J. (2018, October 29). How the Census Citizenship Question Could Affect Future Elections. Retrieved from American Civil Liberties Union: https://www.aclu.org/blog/voting-rights/fighting-voter-suppression/how-census-citizenship-question-could-affect-future
  • Wang, S. (2019, December 8). What North Carolina's redistricting cases suggest for 2021 strategy. Retrieved from Princeton Election Consortium: http://election.princeton.edu/2019/12/08/what-north-carolinas-redistricting-cases-suggest-for-2021-strategy/#comments
  • Wines, M. (2019, June 27). What Is Gerrymandering? And Why Did the Supreme Court Rule on It? Retrieved from The New York Times: https://www.nytimes.com/2019/06/27/us/what-is-gerrymandering.html

Signing Away your Personalized Data: Service for Data Models
By JJ Sahabu | May 29, 2020

In today's society, 1 in 3 people have Facebook; it is so widespread that it has become part of our digital identity. For instance, many websites provide a "Sign in with Facebook" option, almost as if Facebook has become a medium for online identification. Besides Facebook, many other tech companies like Uber, Google, and Amazon have become integrated into our daily lives, leaving consumers at the mercy of these companies' terms and conditions, which often include rights over their personalized data. Some may say that if you don't agree with a company's terms and conditions, you can abstain and not use the site. However, the cost of abstinence may be too great, putting those who abstain at a disadvantage relative to users. Take the instance of electricity: if consumers refuse to purchase it from their local provider, not only do they regress to a time prior to the Industrial Revolution, they also have no alternative supplier. Refusing service from these technologies is similar. This idea stems from a larger conversation about data ownership and who has the right to the data. Here we look deeper at the ethical considerations of user data collection.

The Belmont Report discusses the importance of informed consent: the subject must be educated to the point where they can meaningfully consent. In the case of terms and conditions, the terms must be presented in a way that makes the user understand what they are signing up for. But when tech companies provide long documents of terms in small font, does the user really read through and understand what is going on? In addition, because accepting the terms is mandatory for access to the company's services, the user is left with very limited choices: abide by the terms or abstain from the service. Some services like Facebook, Instagram, or Twitter may be easier to abstain from, but consider services that are more essential, such as driving for or riding with Uber. Some individuals may be financially reliant on working for Uber; thus, they have very little choice but to abide by the terms. And in the case of many social media platforms, users can be coerced by a "crowd effect", tempted to join because everyone they know is on the platform as well. In either case, the odds are stacked against the user.

The reason this issue exists today is that there is very little regulation of these technology firms, due to a lack of knowledge about their capabilities to harness the data. When Facebook first came out in 2004, no one expected it to be able to collect and store so much personal information. Thus, Facebook grew to the point of being "Too Big to Fail", a term usually applied to banks whose bankruptcy would collapse the financial system. In Facebook's case, however, the company has already collected enough users that even if some decide to abstain from the service, Facebook is not concerned about the lost usership, which reduces the leverage users have over it. Though some features benefit from personalizing the user experience, the ramifications of the data collected raise serious privacy concerns.

The article referenced below offers the solution of changing from a service-for-data model to a pay-for-service model, enabling users to take back control of their private data. Although this would address the issue of data ownership, it does not solve the data problem for apps like Uber that don't fall under the same business model. In addition, it can be seen as tech companies selling your data back to you, implying they have first rights to your digital identity.

Digital ownership is a huge issue that surrounds the way tech companies run their businesses. On one hand, the data is used to advance technology by creating personalized content and making us more efficient. On the other, we are sacrificing our privacy. There must be a balance between the potential benefits and costs, but without some sort of regulation to strike that balance, tech companies will continue to reap the maximum benefits at the cost of consumer privacy.

References:

Should Big Tech Own Our Personal Data

Zoom Is Ethically Required to Improve Privacy And Regulation Is Needed
By Matthew McElhaney | May 29, 2020

Zoom is a ubiquitous video conferencing software whose popularity increased dramatically during the stay-at-home orders of the 2020 pandemic. Zoom built its platform on ease of use and "frictionless communication" (Barolo, 2018), and it appears the architecture and design decisions that allow individuals to easily video conference have led to extreme privacy concerns. The internet has been filled with accounts of "Zoom bombing", where uninvited individuals join video conferences and screen-share pornography or shout racist epithets (O'Flaherty, 2020). The impact of this deplorable behavior is amplified given the increased use of Zoom by grade schools across the United States during the stay-at-home orders related to COVID-19.


*Zoom Is Being Used for Social Distancing Court Hearings (Holmes, 2020)*

Company Value Has Skyrocketed Despite Privacy Concerns

To say the stay-at-home orders have been a boon to Zoom is an understatement. Zoom is publicly traded, which allows us to apply near-real-time mark-to-market valuations to the company.


*Zoom Market Capitalization May 2019 to May 2020 (source: macrotrends.net)*

To give some perspective on the enterprise value of Zoom, the value of its equity at the time of writing is greater than that of Delta, American, and Southwest Airlines combined. It is also twice that of Ford Motor Company, an iconic American institution that produces 5.5 million vehicles per year.


*Ford Market Capitalization May 2019 to May 2020. COVID 19 has moved value from many companies into a select few winners. (source: macrotrends.net)*

It's unclear whether the temporary drops in Zoom's market cap in March and April can be attributed to the privacy concerns, but it's apparent that the market doesn't link the value of the company to those concerns, given the current all-time-high valuation. This is a signal that Zoom will not address these concerns because the market demands it, and that a different incentive is needed.

Zoom Has The Resources Needed To Change

Based on the above data, Zoom certainly has the resources required to change. It would be feasible for Zoom to sell $100M of treasury stock (a mere 0.2% of its market capitalization) and apply the proceeds to improving privacy for its users. Assuming an all-in cost per employee of $300k per year, this issuance would give Zoom the resources to hire over 300 security architects, software developers, and other positions to address this issue.
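The back-of-the-envelope math is simple (a sketch; the $300k all-in cost and the 0.2% figure are the post's own assumptions):

```python
# Implied by the figures above: $100M of treasury stock is 0.2% of market cap.
market_cap = 100e6 / 0.002        # ~$50B
issuance = 100e6                  # proceeds from the stock sale
cost_per_employee = 300_000       # assumed all-in annual cost per hire

print(f"implied market cap: ${market_cap / 1e9:.0f}B")
print(f"hires funded for one year: {issuance / cost_per_employee:.0f}")  # ~333
```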

Regulation Is Needed To Incentivize Change

Given that Zoom has been slow to make the required changes even though it has the resources, and that its valuation has increased throughout these privacy concerns, it's clear that regulation should be considered in situations like this to incentivize companies like Zoom to provide products with the necessary privacy controls. The United States government has been hesitant to do so in the past, likely out of fear of stifling innovation. That fear is not valid given the harm done to consumers of the product by poor privacy controls accepted for the sake of innovation. If laws were passed to apply extremely punitive penalties for privacy and security violations (e.g., the exact opposite of the EU's modest fine of Facebook for being misleading during the WhatsApp acquisition), there would be far fewer children exposed to pornography during English class. Companies respond to incentives, and unless there is regulation to correct this behavior, similar incidents will certainly happen again in the technology space.

References

Barolo, P. (2018, January 31). Zoom Launches Enhanced Product Suite to Deliver Frictionless Communications. Retrieved May 24, 2020, from https://blog.zoom.us/wordpress/2018/01/30/zoom-launches-enhanced-product-suite-to-deliver-frictionless-communications/

Holmes, A. (2020, April 18). Courts and government meetings have fallen into chaos after moving hearings to Zoom and getting swarmed with nudity and offensive remarks. Retrieved May 24, 2020, from https://www.businessinsider.com/zoom-courts-governments-struggle-to-adapt-video-tools-hearings-public-2020-4

O’Flaherty, K. (2020, March 27). Beware Zoom Users: Here’s How People Can ‘Zoom-Bomb’ Your Chat. Retrieved May 24, 2020, from https://www.forbes.com/sites/kateoflahertyuk/2020/03/27/beware-zoom-users-heres-how-people-can-zoom-bomb-your-chat/

Zoom Video Communications Market Cap 2019-2020: ZM. (n.d.). Retrieved May 24, 2020, from https://www.macrotrends.net/stocks/charts/ZM/zoom-video-communications/market-cap

Is the health QR Code really healthy?
By Joanna Wang | May 29, 2020

The magical tool for peace of mind
Imagine this scenario: You've read about the Covid-19 outbreak in the news and learned that it is a very contagious virus. You think about the people coughing on the subway, a coworker who was under the weather but still showed up at work, and some random person who sneezed on the street but didn't cover their mouth. You are worried about your own health, constantly asking "Am I exposed to the virus?" "Did people near me get the virus?" The fear is real, and I have experienced it first-hand. Wouldn't it be great if there were a tool to answer these questions with real-time updates? Wouldn't you want an alert if someone you had been in close contact with was diagnosed with Covid? This has already been done in China. A digital QR code: a magical tool for peace of mind.

How does the QR code work?
Citizens have to fill in their personal information to obtain a QR health code on their smartphone: their name, national identity number or passport number, phone number, travel history, any Covid symptoms, and any contact with confirmed or suspected Covid-19 patients in the past 14 days. After the information is verified by authorities, each user is assigned a QR code in red (quarantine for 14 days), amber (quarantine for 7 days), or green (free to move). Citizens then need to scan their QR code every time they enter a public facility: restaurants, subway stations, workplaces, shopping malls, and so on. Once a case is confirmed, authorities are able to quickly backtrack where the patient has been and identify people who have been in contact with that individual. People can now feel safer from Covid, so problem solved! But is it really?
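As a rough sketch of the logic just described (the rules and data here are simplified from the post's description, not the actual system's code), the color assignment and contact backtracking might look like:

```python
from dataclasses import dataclass, field

@dataclass
class Citizen:
    name: str
    has_symptoms: bool = False
    contact_with_case_14d: bool = False
    scan_log: list = field(default_factory=list)  # facilities scanned into

def assign_color(c: Citizen) -> str:
    """Simplified version of the red/amber/green rules described above."""
    if c.contact_with_case_14d:
        return "red"    # 14-day quarantine
    if c.has_symptoms:
        return "amber"  # 7-day quarantine
    return "green"      # free to move

def backtrack_contacts(patient: Citizen, everyone: list) -> set:
    """Find people who scanned into any facility the patient visited."""
    visited = set(patient.scan_log)
    return {c.name for c in everyone
            if c is not patient and visited & set(c.scan_log)}

alice = Citizen("Alice", scan_log=["subway", "mall"])
bob = Citizen("Bob", scan_log=["mall"])
print(assign_color(alice))                      # green
print(backtrack_contacts(alice, [alice, bob]))  # {'Bob'}
```

Even this toy version makes the privacy trade-off visible: the backtracking only works because every citizen's movements are logged centrally.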


Figure 1: A hand holding a cellphone

The lingering helper
With the help of the health QR code, China was able to quickly contain the outbreak and reopen the economy. However, the health QR code will not fade away. Instead, it will turn into something more advanced, or even invasive: it will give the user a 0-to-100 score based on how healthy their lifestyle is, for example how much they sleep, how many steps they take, how much they smoke and drink, and other unspecified metrics. Thoughtful or creepy?

The concerns
Desperate times call for desperate measures, we get it. Like most technology, the intention behind the health-tracking QR code is good (I hope so, although people can disagree on the intention part). The government is trying everything it can to protect people from Covid-19, but at the same time, privacy is sacrificed. The QR code collects people's locations, who they have contacted, and other rather private information. Maybe under extreme circumstances like the Covid-19 outbreak, people are willing to sacrifice some privacy in exchange for safety. That still doesn't make this wholesale collection of citizen information ethical. Furthermore, the "dividing people into colors" method leads to people being discriminated against, not to mention the false positives the app generates (users flagged red for no obvious reason). The health QR code feels like a step closer to a citizen scoring system. People's movements and lifestyles should not be the government's concern and certainly should not be used to put people into categories. If the health QR code further evolves and starts collecting people's medical records, how do we make sure the app's developer has the necessary measures to protect user data? And how will these data be used for, or against, the users? If the QR code is required to enter public facilities, what about people who don't want to use it? These are all very sensitive questions that we need to address before we turn the health QR code into a civilian spy.


Figure 2: A group of people standing around a luggage carousel at an airport

Reference

https://www.cnn.com/2020/04/15/asia/china-coronavirus-qr-code-intl-hnk/index.html

A Problem to A Dress: Algorithmic Transparency and Appeal
By Adam Johns | April 13, 2020

Once upon a time, a million years ago (Christmas 2019), people cared about things like buying fashionable gifts for their friends and family, rather than access to bleach. At that time, I was attempting to purchase a dress for my partner from an online store I'd shopped with in the past. Several days before the cutoff for Christmas delivery, I received an unceremonious computer-generated email that my order had been cancelled. No sweat, I thought, and repeated the purchase. Cancelled again. As the holiday deadline approached, I called the merchant, who informed me that my order had been flagged by an algorithm as a security risk, my purchase had been cancelled, there was in fact nobody I could appeal to, and there was no way of determining what factors had contributed to this verdict. I hung up the phone, licked my wounds, and moved on to other merchants for my last-minute shopping.

Upon later reflection, chastened by a nearly missed holiday gift deadline, I mused at what could possibly have resulted in the rejection. Looking back over my past purchases, it became apparent that in a year or two of shopping with this particular retailer, I hadn't actually bought any women's clothes. Perhaps it was the sudden change from menswear to dresses that led the algorithm to flag me (a not very progressive criterion for an otherwise progressive-seeming retailer). Whatever the reason, this frivolous example got me thinking about some very serious aspects of algorithmic decision making. What made this particular example so grating? Firstly, the decision was not transparent—I wasn't informed that an algorithm had flagged my purchase until a number of calls to customer service. Secondly, I had no recourse to appeal—even after calling up, credit card info and personal identification in hand, nobody at the company was willing or able to overturn the decision. While such an algorithmic "hard no" was easy to shake off for a gift purchase, imagining such an approach applied to a credit decision, an insurance purchase, or a college application was disconcerting.

In 2020, algorithmic adjudication is becoming an increasingly frequent part of life. Machine learning may be broadly accurate in the aggregate, but individual decisions can always suffer from false positives and false negatives. When such a decision is applied to customer service or security, bad decisions can alienate customers and lead previously loyal customers to take their business elsewhere. When algorithms impact more consequential social matters like a person's access to health care, housing, or education, the consequences of a poor prediction take on higher stakes. Instead of just resulting in disappointed customers writing snarky blog posts, such decision making can amplify inequity, reinforce detrimental trends in society, and lead to self-reinforcing feedback loops of diminished individual and societal potential.
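A quick worked example with invented numbers shows why aggregate accuracy is cold comfort to the flagged individual: a 99%-accurate fraud filter applied to orders that are 99.9% legitimate rejects far more good orders than bad ones.

```python
# Hypothetical numbers: 100,000 orders, 0.1% fraudulent, a 99%-accurate filter.
orders = 100_000
fraud_rate = 0.001
true_positive_rate = 0.99   # fraction of fraud correctly flagged
false_positive_rate = 0.01  # fraction of legitimate orders wrongly flagged

fraud = orders * fraud_rate
legit = orders - fraud

flagged_fraud = fraud * true_positive_rate   # ~99 real frauds caught
flagged_legit = legit * false_positive_rate  # ~999 good customers rejected

precision = flagged_fraud / (flagged_fraud + flagged_legit)
print(f"{flagged_legit:.0f} false positives vs {flagged_fraud:.0f} true positives")
print(f"precision: {precision:.0%}")  # only ~9% of flagged orders are actually fraud
```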

The growing importance of machine learning in commercial and government decision making isn’t likely to decline any time in the future. But to apply algorithms for maximum benefit, organizations should ensure that algorithmic decision making embeds transparency and a right to appeal. Let somebody know when they’ve been flagged, and what factored into the decision. Give them the right to speak to a person and correct the record if the decision is wrong (Crawford and Schultz’s concept of algorithmic due process offers a solid base for any organization trying to apply algorithms fairly). As a bonus, letting subjects of algorithmic decision making appeal offers a tantalizing opportunity to the data scientist: More training data to improve the algorithm. While it requires more investment, and a person on the other end of a phone, transparency and right to appeal can result in a rare win-win for algorithmic designers and the people to whom those algorithms are being applied, and ultimately lead us toward a more perfect future of algorithmic coexistence.
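As a sketch of that win-win (a hypothetical record layout, not any retailer's actual system), every adjudicated appeal becomes a corrected label for the next training run:

```python
# Hypothetical appeal records: the model's flag plus the human reviewer's ruling.
appeals = [
    {"order_id": 1, "model_flag": 1, "appeal_upheld": False},  # overturned: a false positive
    {"order_id": 2, "model_flag": 1, "appeal_upheld": True},   # confirmed: a true positive
]

# Overturned appeals are exactly the false positives the model most needs to see.
corrected_labels = [
    {"order_id": a["order_id"],
     "label": a["model_flag"] if a["appeal_upheld"] else 1 - a["model_flag"]}
    for a in appeals
]
print(corrected_labels)  # [{'order_id': 1, 'label': 0}, {'order_id': 2, 'label': 1}]
```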

Reference:
Kate Crawford & Jason Schultz, Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms, 55 B.C.L. Rev. 93 (2014), https://lawdigitalcommons.bc.edu/bclr/vol55/iss1/4

Transgender Lives and COVID-19
By Ollie Downs | April 10, 2020

Transgender Day of Visibility (TDOV) is March 31st every year; it is a day to celebrate the trans experience and “to bring attention to the accomplishments of trans people around the globe while fighting cissexism and transphobia by spreading knowledge of the trans community”. I spent this year’s TDOV voluntarily sheltering in place in my home in Berkeley, California, with two other non-binary housemates of mine. During this shelter-in-place, I am reminded of the struggles faced uniquely by trans and non-binary folks in light of COVID.

Being Counted
Being counted is essential to dealing with issues like COVID-19, but there are challenges associated with counting trans people. Knowing who is contracting the virus, where and when, and how they are dealing with it are all crucial questions to answer. Unique groups like the non-binary community may very well be at higher risk for contracting COVID-19 – and we need to know that. The ethical implications of collecting this data are tricky. Being visible as trans/non-binary is crucial for some people, and dangerous for others. On one hand, being able to quantify how many people, and what kinds of people, identify that way, and where, allows us to understand the demographics – and thus the potential challenges and experiences – of those people. Especially in public health and government settings, knowing where things are happening, and to whom, is crucial in designing solutions like enforcing quarantines and distributing resources. On the other hand, forcing people to identify themselves as one thing or the other is challenging for many, and divides the world into discrete parts when actual identities are fluid and on spectrums. The truth may be lost when a person is forced to choose between imprecise options.

Social isolation and Abuse
Shelter-in-place orders are effective tools for containing the spread of a disease. But they're also very effective at containing people who may not get along. As many of us have experienced firsthand, being isolated with others can create tension and conflict – which can be deadly for people with identities or characteristics outside 'the norm.' Transgender people, especially youth, may be trapped with abusive parents, partners, or other people who may seek to harm them, especially in situations where other identities intersect with their gender. Many transgender individuals find community in social spaces like community centers or bars, and without access to them, these communities (like many other marginalized communities) will suffer.

Other intersecting identities
The intersection of gender with other identities is complex and precarious, and examples of discrimination against people with marginalized identities are everywhere. One example can be found here: in this post, Nadya Stevens reveals the danger faced by "poor people, Black people and Brown people" who are "essential workers" and must commute on crowded, reduced-service public transportation. Transgender and non-binary people, who face poverty and racism at alarmingly high levels, are directly impacted by policy changes like that of the MTA. There is some light at the end of this particular tunnel. Actor Indya Moore began a campaign to take direct action to support transgender people of color (donate on Cashapp to $IndyaAMoore), and Moore's campaign raised so much money in its first week that their account was frozen. This cannot be an isolated campaign: policy efforts must be made to continue this action.

Education at Home
Policy shifts toward online education during this time have been extremely difficult, especially for anyone in an unsafe home environment, without access to the Internet, or otherwise unable to consume material online or who learns better in classroom settings. Transgender and non-binary people, again, experience poverty and violence at high rates, which may be worsened by these policy measures. They also often face medical discrimination, and may be impacted by failures to make online learning accessible to deaf, blind, or otherwise 'non-normative' students.

Medical issues
It makes sense that hospitals and medical care providers are halting 'non-essential' services like surgeries to focus on the care of COVID-19 patients. But the classification of some surgeries as 'non-essential' can be devastating, especially for trans and non-binary patients. Gender-affirming procedures are often categorized this way, but for many patients, they are crucial for their health and safety in a transphobic world. Additionally, patients with AIDS – many of whom are transgender – are at a higher risk of death from COVID-19.

The Unknowns
What we don’t know could be the worst part of this epidemic. We don’t know if, or how, COVID-19 interacts with hormone treatments or HIV medication. We don’t know how it will impact the future of education or policy, or how social isolation and intersecting identities might change these outcomes.

What’s next?
Taking action is very difficult in a pandemic. This situation impacts everyone differently, but it impacts transgender people as a community especially. What can be done? Until we can return to normal life, there are several actionable ideas: donate to funds you know will go towards transgender lives (Cashapp: $IndyaAMoore and many others); check in with your friends, family, colleagues, coworkers, and acquaintances who you know are transgender and offer your support; educate yourself and others about the struggles of the trans community; volunteer for organizations committed to transgender health. Finally, have hope. The transgender community has been more than resilient before. We will continue to be resilient now.

If you or anyone you know is trans/non-binary/gender non-conforming and facing suicidality, please call Trans Lifeline at 877-565-8860.

Photo credits:
https://www.state.gov/coronavirus/
https://commons.wikimedia.org/wiki/File:Nonbinary_Gender_Symbol.svg
https://en.m.wikipedia.org/wiki/File:A_TransGender-Symbol_black-and-white.svg

The Ethics of Not Sharing
By George Tao | April 10, 2020

In this course, we've thoroughly covered the potential dangers of data in many different forms. Most of our conclusions have led us to believe that sharing our data is dangerous, and while this is true, we must remember that data is and will be an instrumental part of societal development. To switch things up, I'd like to present the data ethics behind not sharing your data and the steps that can be taken to improve trust between the consumer and the corporation.

The Facebook-Cambridge Analytica data scandal and the Ashley Madison data leaks are among many news stories regarding data misuse that have been etched into our minds. However, we often remember the bad more vividly than the good, so as consumers, we seek to hide our data whenever possible to protect ourselves from the bad. However, we also must remember the tremendous benefits that data can provide for us.

One company has created a sensor that pregnant women can wear to predict when they are going into labor. This device can provide great benefits in reducing maternal and infant mortality, but it can also be very invasive in the type of data it collects. However, childbirth is an area where this invasive type of data collection can improve upon current research. Existing research regarding female labor is severely outdated: the study that modern medicine bases its practices on was done in the 1950s on a population of 500 women who were exclusively white. By allowing this company to collect data regarding women's pregnancy and labor patterns, we are able to replace these outdated practices.

Figure: A group of young pregnant women taking a selfie together after a yoga session.

This may seem like an extremely naive perspective on sharing data, and it is. As a society, we have not progressed to the point where consumers can trust corporations with their data. One suggestion that this article provides is that data collectors should provide their consumers with a list of worst case scenarios that could happen with their data, similar to how a doctor lists side effects that can come with a medicine. This information not only provides consumers with necessary knowledge, but also helps corporations make decisions that will avoid these outcomes.

I believe that one issue that hinders trust between consumer and corporation is the privacy policy. Privacy policies and terms of agreement are filled with technical jargon that makes them too lengthy and too confusing for consumers to read. This is a problem because privacy policies should be the bridge that builds trust between the consumer and the corporation. My proposed solution is to create two parallel versions of each privacy policy: one designed for legal purposes and one designed for understandability. By doing this, we give consumers knowledge of what the policy is saying while not losing any of the legal protections the policy provides.

There are many different ways to approach the problem of trust, but ultimately, the goal is to create trust between the consumer and the corporation. Once we have achieved this trust, we can use the data shared through it to improve upon current practices that may be outdated.

Works Cited
https://www.wired.com/story/ethics-hiding-your-data-from-machines/

Ethical CRISP-DM: The Short Version
By Collin Cunningham | April 11, 2020

If you could impart one lesson to a fledgling data scientist, what would it be? I asked myself this question last year when data science author Bill Franks called for contributors to his upcoming book, 97 Things About Ethics Every Data Scientist Should Know.

The data scientists I have managed and mentored most often struggle with transitioning from academic datasets to real world business problems. In machine learning classes, we are given clearly defined problems with manicured datasets. This could not be further from the reality of a data science job: requirements are vague, data is messy and often doesn’t exist, and causality hides behind spurious correlations.

This is why I teach junior data scientists the Cross Industry Standard Process for Data Mining (CRISP-DM). Even though it was developed for data mining long ago, it is perfectly applicable to modern data science. The steps of CRISP-DM are:

  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment

These steps are not necessarily sequential as shown in the diagram; you often find yourself back at Business Understanding after an unsuccessful deployment. However, this framework gives much-needed structure, which smooths the awkward transition from academia to industry.

And yet, this would not be the singular lesson I would impart. That lesson would be ethics. Without instilling ethics in data science education, we are arming millions of young professionals with tools of immense power but no notion of responsibility. Thus, I sought to combine the simplicity and applicability of CRISP-DM with ethical guardrails in developing Ethical CRISP-DM. Each step in CRISP-DM is augmented with a question on which to reflect during that stage.

Business understanding – What are potential externalities of this solution? We ask data scientists to lean on those with domain experience when refining requirements into problem statements. Similarly, these subject matter experts are the people who have the most insight into those who may be affected by a model.

Data understanding – Does my data reflect unethical bias? As imperfect creatures, it is naive to view anyone as devoid of bias. It follows that data generated by humans inevitably holds the shadow of these biases. We must reflect on what biases could exist in our data and perform specific analysis to identify them.
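One concrete check at this stage (a sketch with invented data; the 0.8 threshold echoes the "four-fifths rule" from US employment law) is to compare positive-outcome rates across groups:

```python
import pandas as pd

# Invented example: loan decisions alongside an applicant attribute we worry about.
df = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "approved": [1, 1, 1, 0, 1, 0, 0, 0],
})

rates = df.groupby("group")["approved"].mean()
disparate_impact = rates.min() / rates.max()  # ratio of worst-off to best-off group
print(rates.to_dict())            # {'a': 0.75, 'b': 0.25}
print(f"{disparate_impact:.2f}")  # 0.33, well below the 0.8 rule-of-thumb threshold
```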

Data preparation – How do I cleanse data of bias? The data cleansing we are all familiar with has a parallel cleansing phase in which we seek to mitigate the biases identified in the previous step. Some of these biases are easier to address than others; filtering explicitly racist words from a language model's training data is easier than removing relationships between sex and career choice. Furthermore, we must acknowledge that it is impossible to completely scrub bias from data, but attempting to do so is a worthwhile endeavor.
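Here is a minimal sketch of the easier case just mentioned, using placeholder tokens rather than a real slur list; mitigating the subtler statistical relationships is far harder:

```python
import pandas as pd

# Hypothetical training corpus and a blocklist of explicitly biased terms.
corpus = pd.DataFrame({"text": ["a perfectly normal sentence",
                                "a sentence containing SLUR_1"]})
BLOCKLIST = {"SLUR_1", "SLUR_2"}  # placeholder tokens, not a real list

def contains_blocked(text: str) -> bool:
    return any(tok in BLOCKLIST for tok in text.split())

cleaned = corpus[~corpus["text"].apply(contains_blocked)]
print(len(corpus), "->", len(cleaned))  # 2 -> 1
```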

Modeling – Is my model prone to outside influence? With the growing ubiquity of online learning, models often adapt to their environment without human oversight. To maintain the ethical standard we have cultivated so far, guardrails must be put in place to prevent nefarious evolutions of a model. When Microsoft released Tay onto Twitter, users were able to pervert her language model, resulting in a racist, anti-Semitic, sexist, Trump-supporting cyborg.

Evaluation and Deployment – How can I quantify an unethical consequence? The foundation of artificial intelligence is feedback. It is critical that we create metrics to monitor high-risk ethical consequences. For example, predictive policing applications should monitor the distribution of crimes across neighborhoods to avoid over-policing.
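As a sketch of what such a metric could look like (the neighborhood names and the 40% threshold are invented for illustration):

```python
from collections import Counter

# Hypothetical log of neighborhoods where the model dispatched patrols this week.
dispatches = ["north", "north", "north", "north", "south", "east", "west"]

counts = Counter(dispatches)
total = sum(counts.values())
shares = {n: c / total for n, c in counts.items()}

# Alert if any single neighborhood absorbs more than 40% of dispatches.
THRESHOLD = 0.40
for neighborhood, share in shares.items():
    if share > THRESHOLD:
        print(f"ALERT: {neighborhood} receives {share:.0%} of dispatches")
```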

Ultimately, we are responsible for the entire product we deliver, including its consequences. Ethical CRISP-DM holds us to a strict regime of reflection throughout the development lifecycle, thereby helping assure that the models we deliver are built ethically.