Your Health, Your Rights: medical information not covered by HIPAA
By Adam Sohn | June 26, 2020

HIPAA

HIPAA (the Health Insurance Portability and Accountability Act of 1996) protects your personal medical information as held by a medical provider. Under HIPAA, you may obtain your record, add information to it, seek to have it corrected, learn who sees your information, and, perhaps most importantly, exercise limited control over who sees it.

HIPAA protection provides security enshrined in law. However, the internet and artificial intelligence have created additional vectors by which personal medical information can be inferred and distributed outside a person’s control. The implications of a data release from any of these vectors are comparable to a disclosure from a medical setting.

Technology Generates and Discloses Medical Information
The retail sector is an example of an industry that deals in medical information yet is not bound by HIPAA for most transactions. When customers purchase a market basket of products associated with a certain medical status, an astute predictive analytics system operated by the retailer can infer that status. The inferred status is free from HIPAA protections because it has no origins in a medical setting; indeed, it is not provided information at all, but derived information.
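To make the mechanism concrete, here is a minimal sketch of how such an inference could work; the product categories, labels, and model choice are all hypothetical stand-ins for a real retailer's far larger system.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical basket features: 1 if the customer recently bought the item.
# Columns: [unscented_lotion, prenatal_vitamins, cotton_balls, beer]
baskets = [
    [1, 1, 1, 0],  # toy examples of customers later confirmed pregnant
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],  # toy examples of customers who were not
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
is_pregnant = [1, 1, 1, 0, 0, 0]

model = LogisticRegression().fit(baskets, is_pregnant)

# A new shopper's basket yields a pregnancy probability: inferred medical
# information that never touched a medical setting and sits outside HIPAA.
new_basket = [[1, 1, 0, 0]]
print(model.predict_proba(new_basket)[0][1])
```

Purchases go in, a medical status comes out – and at no point did the customer provide that status.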

Famously, the astuteness of Target’s predictive analytics was on display in 2012, when coupons for baby supplies were sent to the home of a teenage girl. It is alarming enough that Target maintains a database of inferred medical information (in this case, pregnancy), but Target went a step further by disclosing this information for anyone handling the teenage girl’s mail to happen upon. The incident triggered a public understanding of the privacy risks of data aggregation, in which mundane data becomes the building blocks of sensitive information.

Exploring Privacy Protections
Exploring the state of protections that do exist to prevent unwanted disclosures such as the Target case reveals a picture of a system that has room to mature.

– One way to prevent unwanted disclosure is to personally opt out of mailed advertisements from Target, per instructions in Target’s Privacy Policy. Expecting a customer to foresee the need to do so, however, is unrealistic.
– Another method is to submit a complaint to the FTC regarding a violation of a Privacy Policy. However, Target’s Privacy Policy is vague on these matters.

Expanding the view to regulatory changes that do not yet exist but are in the approval process, there is a relevant bill in Congress. CONSENT (the Customer Online Notification for Stopping Edge-provider Network Transgressions Act) was brought to the Senate in 2018 and is currently under review in the Committee on Commerce, Science, and Transportation. CONSENT would turn the tide in the public’s favor with regard to the security of personally identifiable information (PII) by requiring a distinct opt-in for sharing or using PII. However, the bill applies only to data transacted online, which is only a portion of the relationship a consumer has with a retailer.

Clearly, consumer behavior is trending toward online purchases. However, brick-and-mortar purchasing cannot be overlooked, as it is also increasing.

Advice to Consumers
In light of the general laxness of protections, the methods for keeping your information secure fall under the adage caveat emptor – buyer beware. For individual consumers, options to keep your information safe are:
– Only share the combination of PII and medical information in a setting where you are explicitly protected by a Privacy Policy.
– Forgo certain conveniences in order to remain obscure. This entails using cash in a brick-and-mortar store and refraining from participating in loyalty programs.

Sources
[HIPAA]
[New York Times – Shopping Habits]
[Consumer Privacy Bill of Rights]
[CONSENT]

Discriminatory practices in interest-based advertising
By Anonymous | June 26, 2020

Economics and ethics

The multi-billion-dollar online advertising industry is incentivized to ensure that ad dollars convert into sales, or at least high click-through rates. Happy clients equate to healthy revenues. The way to realize this goal is to match the right pair of eyeballs to each ad – quality, not quantity, matters.

Interest-based ads (sometimes referred to as personalized or targeted ads) are strategically placed for specific viewers. The criteria for viewer selection can be immutable traits like race, gender, and age, or online behavioral patterns. Unfortunately, both approaches are responsible for amplifying racial stereotypes and deepening social inequality.

Baby and the bathwater

Dark ads exclude a person or group from seeing an ad by targeting viewers based on an immutable characteristic, such as sex or race. This is not to be confused with the notion of big data exclusion, where ‘datafication unintentionally ignores or even smothers the unquantifiable, immeasurable, ineffable parts of human experience.’ Instead, dark ads refer to a deliberate act by advertisers to shut certain communities out of their product or service offerings.

Furthermore, a behaviorally targeted ad can act as a social label even when it contains no explicit labeling information. When consumers recognize that the marketer has made an inference about their identity in order to serve them the ad, the ad itself functions as an implied social label.


Source: The Guardian

That said, it’s not all bad news with these personalized ads. While there are calls to simply ban targeted advertising, one could argue for the benefits of having public health campaigns, say, delivered in the right language to the right populace. Targeted social programs could also have better efficacy if they reach the eyes and ears that need them. To take away this potentially powerful tool for social good is, at best, a lazy approach to solving the conundrum.

Regulatory oversight

In 2018, the U.S. Department of Housing and Urban Development filed a complaint against Facebook, alleging that the social media platform had violated the Fair Housing Act. Facebook’s ad targeting tools enabled advertisers to express unlawful preferences by suggesting discriminatory options, and Facebook effectuated the delivery of housing-related ads to certain users and not others based on those users’ actual or imputed protected traits.


Source: The Truman Library

A 2016 investigation by ProPublica found that Facebook advertisers could create housing ads that allowed posters to exclude black people. Facebook’s privacy and public policy manager defended the practice, underlining the importance for advertisers of being able to both include and exclude groups as they test how their marketing performs – never mind that A/B testing itself often straddles a grey area in the ethics of human-subjects research.


Source: ProPublica

Opinion

Insofar as the revenues of online businesses are driven by advertising, which is dictated by user traffic, interest-based ads are here to stay. Stakeholders with commercial interests will continuously defend their marketing tools with benevolent use cases. Lawmakers need to consistently address the harm itself – that deliberate exclusions (and not just the ones arising from algorithmic bias and opacity) serve to exacerbate inequalities from discriminatory practices in the physical world.

In the example above, the HUD authorities did well to call out Facebook’s transgressions, which are no less serious than those of the Jim Crow era. As a society, we have moved forward with Brown v. Board of Education. Let us not slip back into complacency by justifying segregatory acts, and into complicity with Plessy v. Ferguson.

Data-driven policy making in the Era of ‘Truth Decay’
By Silvia Miramontes-Lizarraga

Advances in digital technology have made it possible to collect, store, and analyze large amounts of data containing information on various subjects of interest, otherwise known as Big Data. One effect of this field is the rise of data-driven decision making in business, technology, and sports, as these methods have been shown to boost innovation, productivity, and economic growth. But if the availability of data has been increasing so significantly, why do we lack data-driven methods in policy making to target issues of social value?

Background on Policy-making:

Society expects the government to deliver solutions to social issues; its challenge, thus, is to improve the quality of life of its constituents. Public policy is a goal-oriented course of action encompassing a series of steps: 1) Recognition of the Problem, 2) Agenda Setting, 3) Policy Formulation, 4) Policy Adoption, 5) Policy Implementation, and 6) Policy Evaluation. This type of decision making involves numerous participants. Consequently, the successful implementation of these policies cannot be ideologically driven. The process requires government officials to be transparent, accountable, and effective.

So how could these methods help?

The lack of data-driven methods is conspicuous when addressing the many problems of our educational system. For example, government officials could use data to efficiently locate the school districts most in need of resources. Similarly, when addressing healthcare, they could compare plans to determine the best procedures and most essential expenditures in the middle of a global pandemic. By successfully adopting these technologies, our officials can begin closing ‘data gaps that have long impeded effective policy making’. To achieve this, however, government officials and their constituents must develop an awareness and appreciation of concrete, unbiased data. A toy illustration of the school-district example appears below.
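The sketch ranks hypothetical school districts by a simple composite need index; the columns, values, and weighting are invented for demonstration, not drawn from any real dataset.

```python
import pandas as pd

# Hypothetical district indicators (all values invented for illustration).
districts = pd.DataFrame({
    "district": ["North", "South", "East", "West"],
    "students_per_teacher": [28, 18, 24, 31],
    "pct_free_lunch": [0.62, 0.21, 0.45, 0.71],
    "funding_per_student": [8200, 12400, 9800, 7600],
})

# Normalize each indicator to [0, 1], then average into a need score.
numeric = districts.drop(columns="district")
norm = (numeric - numeric.min()) / (numeric.max() - numeric.min())
norm["funding_per_student"] = 1 - norm["funding_per_student"]  # less money = more need

districts["need_score"] = norm.mean(axis=1)
print(districts.sort_values("need_score", ascending=False))
```

Even a ranking this crude gives officials a transparent, defensible starting point for allocating resources – provided the underlying data is trusted, which is exactly where ‘Truth Decay’ interferes.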

Why ‘Truth Decay’ complicates things

Although there is potential in implementing data-driven methods to better inform policy makers, we have stumbled upon a hurdle: the ongoing rise of ‘Truth Decay’, a phenomenon described by a RAND initiative that aims to restore the role of facts and analysis in public life.

In recent years, we have heard about the problem of misinformation and fake news, but most importantly, we have reached a point where people no longer agree on basic facts. And if we do not agree on basic facts, how can we possibly address social issues such as education, healthcare, and the economy?

Whether we have heard it from a friend in the middle of a Facebook political comment war, or from a random conversation on the street, we have come to realize that people tend to disagree on basic objective facts. More often than not, we get very long texts from friends filling us in on their latest social media comment debacle with ‘someone’ who does not seem to ‘agree’ with the presented facts – the facts are drowned out by their opinions. The line between opinion and facts fades to the point where facts are no longer disputed, but instead rejected or simply ignored.

So what now? How do we actively fight this decay to keep the validity of facts afloat and demystify quantitative methods to influence our representatives, and possibly transform them into better informed policy makers?

First Steps

Whenever we encounter someone with different political views, say from an opposing political party, we could try to convince them to look at the issue from another perspective. Perhaps we can point out the disparity between facts and beliefs in an understated way.

We can also actively avoid tribalization. Rather than secluding ourselves from groups with opposing political views, we can try to build understanding and empathy.

Additionally, we can change our attitude toward ourselves and others. We must acknowledge that sometimes we need to change our beliefs in order to grow. This means that making our beliefs part of our identity is not the optimal way to fight the ongoing ‘Truth Decay’. It is important to remember that our beliefs may be inconsistent over time, and thus, we are not defined by them.

Lastly, we can embrace a new attitude: call yourself a truth seeker, try your best to remain impartial, and be curious. Keeping your mind open might allow you to learn more about yourself and others.

Sources:
RAND Study

Police Shootings: A Closer Look at Unarmed Fatalities
By Anonymous

Last year, fifty-five people were killed by police shootings while “unarmed.” This number comes from the Washington Post dataset of fatal police shootings, which is aggregated from “local news reports, law enforcement websites and social media” as well as other independent databases. In this dataset, the recorded weapons that victims were armed with during the fatal encounter range from wasp spray to barstools. Here is a breakdown of the types of arms that were involved in the fatal police shootings of 2019.

We see a large number of fatalities among people who were armed with guns and knives, but also vehicles and toy weapons. In my opinion, cars and toys are not weapons and would more appropriately fit the category of “unarmed.” But what exactly does “unarmed” mean? The basic Google search definition is “not equipped with or carrying weapons.” Okay, well what is a weapon? Another Google search defines a weapon as “a thing designed or used for inflicting bodily harm or physical damage.” Toys and cars were not designed for inflicting bodily harm, but may have been used to do so. By the same logic, would we call our arms and legs “weapons,” since many people have used their appendages to inflict bodily harm? No. So why do we distinguish cars and toys from the “unarmed” status?

This breakdown of the categories leads to bias in the data. When categorizing the armed status of victims of police shootings, the challenge of specificity arises. Some may find value in more specific descriptions for each case in the dataset, but this comes at the cost of splitting apart cases that really belong in the same bucket; here, “vehicles” and “toy weapons” should be contained in the “unarmed” bucket rather than standing as separate categories. Excluding those cases undercounts the number of unarmed people killed by police. Including the cases that involved vehicles and toy weapons raises the count of unarmed fatalities from 55 to 142. In other words, the bias introduced by overly granular categorization underestimated the number of unarmed victims of police shootings in 2019.
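The recount is straightforward to reproduce with pandas. The sketch below assumes the Post's public GitHub copy of the dataset and its 2020-era column names (date, armed); both are assumptions that may change as the repository evolves.

```python
import pandas as pd

# Washington Post fatal police shootings data; URL and schema reflect the
# public GitHub repository as of 2020 (an assumption, not a guarantee).
url = ("https://raw.githubusercontent.com/washingtonpost/"
       "data-police-shootings/master/fatal-police-shootings-data.csv")
df = pd.read_csv(url, parse_dates=["date"])
shot_2019 = df[df["date"].dt.year == 2019]

# Count of victims labeled "unarmed" under the Post's own categories.
strict = (shot_2019["armed"] == "unarmed").sum()

# Re-bucket "vehicle" and "toy weapon" into "unarmed", per the argument above.
broad = shot_2019["armed"].isin(["unarmed", "vehicle", "toy weapon"]).sum()

print(f"Post's definition: {strict} unarmed fatalities in 2019")
print(f"Broader definition: {broad} unarmed fatalities in 2019")
```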

Now let’s look at the breakdown by race, specifically White versus Black (non-Hispanic).

For Washington Post’s definition of unarmed, 45% of the victims were White, while 25% were Black. For toy weapons, 50% were White, and 15% were Black. For vehicles, 41% were White, and 30% were Black. For all of those cases combined, 44% were White, and 25% were Black.

Now some may interpret this as “more White people are being killed by police,” and that is true, but let’s consider the populations of White and Black folks in the United States. According to 2019 U.S. Census Bureau estimates, 60% of the population is White while only 13% is Black or African American. So when we compare, by race, the percentage of unarmed people killed by police with the percentage of people in the U.S., we see a disproportionate effect on Black folks. If the experiences of Black and White folks were the same, we would expect 13% of police-shooting victims to be Black and 60% to be White. Instead, we see a proportionally much lower number for White folks (44% of unarmed victims) and a much higher number for Black folks (25% of unarmed victims).
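One way to make this comparison precise is to divide each group's share of victims by its share of the population; a ratio above 1 indicates disproportionate impact. Plugging in the combined figures quoted above:

```python
# Shares of unarmed + toy weapon + vehicle fatalities (from the text) versus
# the 2019 Census population share estimates.
victim_share = {"White": 0.44, "Black": 0.25}
population_share = {"White": 0.60, "Black": 0.13}

for group in victim_share:
    ratio = victim_share[group] / population_share[group]
    print(f"{group}: {ratio:.2f}x their population share")

# White: 0.73x their population share
# Black: 1.92x their population share
```

Black Americans are killed while unarmed at roughly twice the rate their population share would predict; White Americans at roughly three-quarters of theirs.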

This highlights the disproportionate effect of police brutality on Black folks, yet the point estimates provided by this data may not be fully comprehensive. When police reports are fabricated, and when horrific police killings of Black and Brown folks go under the radar, the data provided by the Washington Post risks further bias. This bias, however, would suggest an even greater disparity in the victimization of unarmed Black and Brown folks by police shootings. As we consider data in our reflections on current events, we have to be mindful of the potential biases in the creation and collection of the data, as well as in our interpretation of it.

Digital Equity
By Anusha Praturu | June 19, 2020

It’s no secret that in 2020, it is becoming increasingly difficult to participate in modern society without some level of access to, and literacy with, basic technologies, the Internet being chief among them. And with the widespread growth of WiFi hotspots, smartphones, and other Internet-capable devices, it’s becoming easier for most of us to remain connected all the time. Despite this, the technology gap appears to be widening for lower-income American households.

According to a 2019 Pew study, detailed in the figure below, 44% of individuals with annual household incomes under $30,000 do not have home broadband service, and 46% do not have access to a traditional computer. These individuals are becoming progressively more dependent on smartphones for Internet access, even for tasks traditionally undertaken on a larger screen, such as applying for jobs or pursuing education.


Pew Research graphic depicting lower levels of technology adoption among lower income American households

Obviously, these issues have many downstream implications, which contribute to perpetuating a cycle of inequality. These circumstances start to describe an issue known as digital inequity.

Defining Digital Equity and Digital Inclusion
Simply put, digital equity refers to people’s equal ability to access and use the necessary technology to participate in the social, democratic, and economic activities of modern society. The National Digital Inclusion Alliance (NDIA) defines Digital Equity as follows:

Digital Equity is a condition in which all individuals and communities have the information technology capacity needed for full participation in our society, democracy and economy. Digital Equity is necessary for civic and cultural participation, employment, lifelong learning, and access to essential services.

A related concept, Digital Inclusion, refers to the acts of remediation that governments, activists, and other stakeholders are proposing and attempting in order to achieve digital equity. NDIA describes it as:

Digital Inclusion refers to the activities necessary to ensure that all individuals and communities, including the most disadvantaged, have access to and use of Information and Communication Technologies (ICTs). This includes 5 elements: 1) affordable, robust broadband internet service; 2) internet-enabled devices that meet the needs of the user; 3) access to digital literacy training; 4) quality technical support; and 5) applications and online content designed to enable and encourage self-sufficiency, participation and collaboration. Digital Inclusion must evolve as technology advances. Digital Inclusion requires intentional strategies and investments to reduce and eliminate historical, institutional and structural barriers to access and use technology.

Causes of Inequity
As you can imagine, the causes of digital inequity are deep-rooted and manifold. Some of the primary causes, as detailed by the Benton Institute, include a lack of robust infrastructure and discrimination in delivering technology and digital services to specific areas or populations. Other barriers stem from broader issues such as disparities in socio-economic status, digital literacy, accommodations for special needs and disabilities, or resources for non-English speakers. These multifaceted sources of inequity cannot be addressed with a single piece of legislation or funding grant. Rather, they require radical systemic change, including substantial and ongoing investment in lower-income and rural communities, as well as broader awareness of the growing issue.

Ongoing Attempts at Remediation
In April 2019, Senator Patty Murray of Washington introduced the Digital Equity Act of 2019. This act would establish grants for the purposes of (1) promoting digital equity, (2) supporting digital inclusion activities, and (3) building capacity for state-led efforts to increase adoption of broadband by their residents. These efforts would contribute to achieving the goals outlined in the graphic below.

Infographic detailing the three primary goals of the Digital Equity Act of 2019

Since its introduction in the Senate in April 2019 and the introduction of a companion bill in the House in September 2019 by Rep. Jerry McNerney of California, the act has yet to face a vote in either chamber.

In addition to legislation, several non-profit organizations are committed to addressing digital inequity in the US. Leading among them is the aforementioned National Digital Inclusion Alliance, which “combines grassroots community engagement with technical knowledge, research, and coalition building to advocate on behalf of people working in their communities for digital equity.” Organizations such as NDIA take a many-sided approach to digital inclusion, including spreading awareness, fundraising, research, advocacy, and lobbying policymakers.

Outstanding Barriers to Digital Equity
On top of those outlined above, there are still several barriers to achieving digital equity in the US. Simply the fact that legislation on the issue was introduced over a year ago and has not made any progress through the legislative branch of government is a clear indication that more public advocacy and awareness is needed for such action to have momentum. Another issue that must be addressed by digital inclusion efforts is the rapid pace of new developments in technology. Inclusion efforts must be up to date and compatible with the evolving technological landscape. Digital equity cannot be achieved if lower-income and otherwise disadvantaged populations are restricted to outdated technology or insufficient access to the latest developments.

The bottom line is, the road to digital equity is long and not without obstacles, but efforts to close the gap are long overdue. Further, the gap will only continue to widen with each passing year of technological advance if no action is taken to promote digital inclusion. The first of many steps will be to lobby policymakers to push the Digital Equity Act through to a vote, so that the issue of inequity might finally start to be addressed and penetrate the American consciousness in a meaningful way.

You Already Live in a Smart Home, but Jarvis Doesn’t Work For You
By Isaac Chau | June 5, 2020

Ask people to describe a home in “the future,” and more than a few will describe a house that’s something like Tony Stark’s mansion: lights that turn on when you walk in, appliances that brew your coffee when your alarm goes off, and voice-controlled everything, from ambient music to air conditioning, all done through conversation with your butler that’s actually a computer.

On second thought, that sounds a lot like a modern smart home, consisting of Google Assistant or Amazon Alexa controlling and coordinating the actions of products like smart refrigerators, smart light bulbs and switches, smart doorbells, smart TVs, and of course, smart toasters. Sure, Google Assistant and Alexa might not be as interesting to talk to as Jarvis or Rosie Jetson, but given that the vast majority of US adults own smartphones and that voice-controlled home speakers have been incredibly affordable for years now, it’s hard to argue that we aren’t already living in the future. However, I’m willing to bet that Tony Stark never worried about his A.I. butler sharing his personal information. Indeed, the sci-fi future many of us already live in takes on a dystopian tint when you look closer at the devices that enable it.

Google and Amazon, the two main players in the smart-home market, provide the masses inexpensive voice-controlled speakers for basically no profit because what users actually trade for the convenience of a smart-home device is not money but rather access to their homes. Through these devices, users can buy products and services, and the speakers direct them to choices that result in money going back to the company that sold them their speaker. Users’ queries to the speakers are collected and integrated with web browsing, location tracking, their social network, and other information to further personalize ads that can be delivered to any of their internet-connected devices.

You might say, “So what?” to the smart-speaker business model. Google and Amazon’s privacy policies state they do not share your information with third parties (besides those working for them). Hundreds of millions of people already entrust these companies with their information in the form of online shopping and services, and while they can sometimes be a little creepy, targeted ads are widely accepted and can even improve the online experience. If Google and Amazon can hold up their end of the privacy agreements, what’s wrong with building a smart home around their voice assistants?

The problem lies with the other internet-connected products necessary for a smart home, many of which have laxer data privacy and security infrastructure. For example, Ring, which sells home security systems based around video doorbells, has been criticized heavily for practices that left users vulnerable to hackers, not to mention for encouraging users to share video with police departments. The ethics of helping police surveil neighbors aside, the ease with which bad actors can access improperly set up Ring video cameras is alarming and directly opposes the intentions of any consumer security-minded enough to install video cameras in their home.

Televisions are an even more popular smart device than doorbells, and they too can compromise the privacy of a home. Smart televisions are connected to the internet and can recognize the content you watch in order to deliver targeted content. Compared to Google and Amazon, however, television manufacturers are more free with user information and more prone to security breaches. Samsung controlled more than 40% of the North American television market in 2019, and its privacy policy allows it to share with third parties any information it collects from users, including voice recordings. In 2017, the Federal Trade Commission fined Vizio for collecting user information without consent, and that same year, WikiLeaks released documents alleging that the CIA can hack smart TVs and use their microphones as listening devices.

There are few new non-smart TVs for sale. Just like with smart speakers, manufacturers make little money selling TV hardware, at least at the lower and middle ends of the market. The real money is in data, and the purchase of a television means years of access to a household’s behavior. And unlike with smart light bulbs, refrigerators, and doorbells, there are no reasonable dumb alternatives to smart TVs for consumers who care about their privacy. As consumer demand grows for smarter, connected home goods, other products may end up just like the television: always watching us back. There’s a chance we’ll all be living in smart homes soon, whether we want to or not.

The Police State Is Monitoring Your Social Media Activity and Is Encouraged To Track and Arrest You For Exercising Your First Amendment Rights
By Anonymous | June 5, 2020

In light of the nationwide protests following outrage over the deaths of George Floyd and several others at the hands of police officers this past week, the nation is as polarized as ever. Millions of citizens are supporting grassroots organizations that aim to highlight systemic injustice and advocate for police reform, while some police departments, city governments, and other political actors are pushing back against the gatherings.

The president of the United States himself has verbalized his position against the demonstrations occurring across the country, antagonizing protestors and encouraging police to become more forceful in their suppression of citizens’ First Amendment rights. Just weeks after the President commended protestors for opposing the nationwide lockdown in response to Covid-19, his rhetoric has quickly shifted to condemnation of Black Lives Matter protestors. Audio released from the President’s call with the Governors regarding how to handle the demonstrations reveals that Trump said, “You’ve got to arrest people, you have to track people, you have to put them in jail for 10 years and you’ll never see this stuff again.” Trump’s overt endorsement of the surveillance and incarceration of citizens is alarming and provides a necessary junction for discussion about the ethics of data monitoring by law enforcement. When the President encourages police across the country to track and persecute civilians, especially those in ideological opposition to the Administration and the police state, many Americans are at risk.

Law enforcement can use, and has used, data from mobile devices to investigate citizens. From companies like Geofeedia and their social media monitoring, to Google’s Sensorvault database of location histories, to companies like Securus that used geolocation data from cell phone carriers to track citizens, Americans face ubiquitous threats to their privacy. These three instances of data collection and use by law enforcement elucidate the argument for greater corporate responsibility and the urgent need for legislative reform. In this post, the focus will be on Geofeedia and the risks that this type of data collection and monitoring brings to light.

Figure 1

In 2016, the ACLU uncovered that a company called Geofeedia had been providing personal data about protestors to police departments. Geofeedia aggregated and sold data accessed through Facebook, Twitter, and Instagram to multiple police departments. This data was used to track the movements of protestors, which led to the identification, interception, and arrest of several people. This type of surveillance occurred in notable hotspots of civil unrest like Ferguson in 2014 (following the murder of Michael Brown) and Baltimore in 2015 (following the murder of Freddie Gray), but the ACLU of California published public records showing that police departments across the state were rapidly acquiring social media monitoring software to monitor activists. There has been extremely little debate about the ethics of this technology or oversight of the legality of its use. Only after the ACLU reviewed public records and released the information did the social media platforms suspend Geofeedia’s access to their data. Still, many civil liberties activists have voiced reasonable concerns about the lack of foresight and responsibility by these social media companies. Nicole Ozer, technology and civil liberties policy director for the ACLU of California, made the point that “the ACLU shouldn’t have to tell Facebook or Twitter what their own developers are doing. The companies need to enact strong public policies and robust auditing procedures to ensure their platforms aren’t being used for discriminatory surveillance.”

Ozer’s point is especially poignant considering that Geofeedia is just the tip of the iceberg. Despite the public criticism of Geofeedia by the social media companies involved, the use of social media profiling by law enforcement has not decreased. A myriad of other companies perform similar services that were not exposed in the ACLU report, and Geofeedia emails even detailed that the company’s contract with Facebook allowed for a gradual reactivation of its data access.

With a federal administration that is visibly callous toward and ignorant of the Constitution, it is as important as ever for companies and local legislators to fight to protect the data rights of citizens and ensure that technology companies are acting in the best interests of the people. Individuals who show their solidarity with victims of police brutality and systemic racism could be subjected to unconstitutional surveillance and oppression because of the content of their speech on social media or their presence at public assemblies. If the police use technological tools to continually monitor the movement of citizens, certain individuals will essentially be made political prisoners of a country under martial law that is quickly demonstrating its totalitarian nature.

Figure 2

Did you know you are helping Apple track the effects of COVID-19?
By Henry Bazakas | June 5, 2020

At the onset of the COVID-19 pandemic, amongst the deluge of information, misinformation, and opinion, was story after story (after story, etc.) about the necessity and inevitability of widespread behavioral changes. Some predicted that people’s travel and movement decisions would be impacted, with people limiting them to the essentials. Here are a few quotes from those articles:

  • “People are going to start asking, ‘Do we have to meet in person?’”
  • “Digital commerce has also seen a boost as new consumers migrate online for grocery shopping – a rise that is likely to be sustained post-outbreak.”
  • “The experience may help us change our lifestyles for the better”
  • “Coronavirus is a once in a lifetime chance to reshape how we travel”

While the long-term validity of these predictions is unknown, it has been almost three months since Donald Trump declared a national emergency, so we are at a point where we can start to evaluate them. The data Apple collects on Apple Maps search trends can help us do so.

Data

One way to make inferences about people’s behavior is through the anonymized data Apple collects on requests made via Apple Maps. This data is made available on Apple’s website, with a link to download it for yourself. It covers over 2,000 counties or regions drawn from 47 countries, using mapping requests as a proxy for the level of mobility throughout society. This data can be of use in a social science capacity as a means of understanding the effects of COVID-19, but its collection raises ethical questions.
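For readers who want to explore the download themselves, here is a minimal sketch; the filename, region label, and column layout reflect the 2020 release of the dataset and should be treated as assumptions that may have changed.

```python
import pandas as pd

# Local copy of Apple's CSV from https://www.apple.com/covid19/mobility;
# the exact filename varies by release date (assumption).
df = pd.read_csv("applemobilitytrends.csv")

# Each row is a (region, transportation_type) pair; the columns after the
# metadata are daily request volumes indexed to 100 on the January baseline.
sf = df[(df["region"] == "San Francisco - Bay Area")
        & (df["transportation_type"] == "driving")]

date_cols = [c for c in df.columns if c.startswith("2020-")]
series = sf[date_cols].iloc[0]

print(f"Largest drop from baseline: {100 - series.min():.0f}% on {series.idxmin()}")
print(f"Most recent value vs. baseline of 100: {series.iloc[-1]:.0f}")
```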

Understanding Societal Change

This information can be a valuable tool for researching human behavior. It can be interpreted as a natural experiment of sorts, as Apple can compare current to historical data to see just how big of an effect COVID-19 is having and how that is changing over time. This can help researchers appraise how effectively people are obeying social distancing measures over time and is a possible indicator of COVID-19 case trends at a county level.

On Apple’s website one can look at charts for any of the regions represented in the dataset, even breaking down by mode of transportation for some areas. I’ve included charts above for a variety of western countries, as well as for the San Francisco Bay Area, New York City, and Salt Lake City.

In most places, Apple Maps requests dropped by over 60% during early social distancing periods, and they have steadily risen since. Some nations, including Germany and the United States, are even above their “baseline” pre-COVID-19 values. This trend shows at the regional level as well, although the extent of the bounce-back varies. The extent of recovery also varies across modes of transportation, with walking and driving recovering much more strongly than transit. This aversion to mass transportation is akin to what has happened in the airline industry, whose recovery has been very slow thus far as well. It remains to be seen whether people will fully revert to 2019 transportation levels, but walking and driving habits do appear to be showing a meaningful return.

Ethical Conflict

Some would argue that this level of data collection is excessive or an invasion of user privacy. “The information collected will not personally identify you”, Apple assures on their website, but can this be guaranteed, and does it give Apple the right to collect such private data? Information about where people are going is certainly data that some would be unwilling to knowingly divulge if they were given the opportunity to opt out. Apple does make efforts to prevent their users from being identified from this data by not tying it to ID variables or accounts and aggregating it at the county level. However, individual data has been collected without the informed consent of many users. The possibility of it being released is never zero, and even if it were, that doesn’t give companies the right to collect it.

Maybe you read this and do not think anything of it. In today’s world it would be foolish not to assume some level of surveillance. It is up to you to decide whether the joys of Apple’s product line warrant this surrender of privacy. However, I believe that that decision should be a conscious one rather than one made via an unread privacy policy.

Works Cited:

https://www.apple.com/covid19/mobility
https://www.airlines.org/dataset/impact-of-covid19-data-updates/#
https://www.accenture.com/us-en/insights/consumer-goods-services/coronavirus-consumer-behavior-research
https://www.discovermagazine.com/health/how-the-covid-19-pandemic-will-change-the-way-we-live
https://www.sciencemag.org/news/2020/04/crushing-coronavirus-means-breaking-habits-lifetime-behavior-scientists-have-some-tips
https://singularityhub.com/2020/04/16/coronavirus-what-are-the-chances-well-change-our-behavior-in-the-aftermath/
https://theconversation.com/coronavirus-is-a-once-in-a-lifetime-chance-to-reshape-how-we-travel-134764
https://www.thinkglobalhealth.org/article/updated-timeline-coronavirus

How census data leads to the underrepresentation of minority voters
By Anonymous | June 5, 2020

As of the writing of this piece (June 2020), the United States (U.S.) is in turmoil over systemic police violence against black Americans.

Every incident of police violence against black Americans follows a cyclical pattern: the initial killing occurs, images of the killing spread across the media, the nation becomes outraged, and protests occur. Companies and politicians offer some semblance of public support to the protestors and to the black community. Eventually the protests stop, new stories crop up, and America moves on. Despite this pattern, the policies that lead to such killings rarely change meaningfully.

This blog post examines through a data science lens some of the underlying reasons for low voter turnout in populations that are interested in changing these policies. I look at the collection and usage of census data and the resulting impact on voter turnout, congressional representation, and policy formation.

The U.S. census is a decennial survey of every resident of the U.S., with the most recent occurring in 2020. The census is the closest thing there is to a comprehensive view of who lives in the U.S. Residents receive a letter that generically describes how the data collected from the census will be used. A Pew Research Center study found that this description does not do enough: most Americans do not know what questions are on the census, let alone how their responses are used.


Figure 1: The census letter received by U.S. citizens

This is problematic due to the far-reaching consequences the census has on citizens’ representation in government. One use of census data is to redraw voting districts. This process has largely become partisan gerrymandering – redrawing districts so that political candidates from your party are more likely to win an election. The use of census data by political parties to draw voting districts is one reason why policies do not change after each cycle of police shootings: redistricting puts those who are more likely to vote for change in districts where their votes are less impactful.


Figure 2: Gerrymandering visualized: the same map with 4 different electoral outcomes
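The figure’s point – that the same voters can produce different outcomes – can be quantified. One common measure is the efficiency gap, which compares the two parties’ “wasted” votes (losing votes, plus winning votes beyond the majority threshold). The district tallies below are hypothetical, chosen only to illustrate the calculation.

```python
def efficiency_gap(districts):
    """districts: list of (votes_a, votes_b) tuples, one per district.

    Returns (wasted_a - wasted_b) / total votes; positive values mean the
    map wastes more of party A's votes, i.e., it favors party B.
    """
    wasted_a = wasted_b = total = 0
    for a, b in districts:
        total += a + b
        threshold = (a + b) // 2 + 1      # votes needed to win the district
        if a > b:
            wasted_a += a - threshold     # A's surplus votes
            wasted_b += b                 # all of B's losing votes
        else:
            wasted_b += b - threshold
            wasted_a += a
    return (wasted_a - wasted_b) / total

# Same 400 voters (200 per party), two hypothetical maps of four districts:
balanced = [(55, 45), (45, 55), (60, 40), (40, 60)]  # 2 seats each
packed = [(85, 15), (45, 55), (35, 65), (35, 65)]    # A packed into 1 seat
print(efficiency_gap(balanced), efficiency_gap(packed))  # 0.0  0.255
```

A 25.5% gap on identical vote totals is the arithmetic signature of the gerrymander pictured above.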

In addition to gerrymandering, politicians use census data collection to lower the representation of certain populations in government. The Trump administration recently attempted to collect data on citizenship status via the 2020 census. Many worried that by collecting this data, U.S. residents who are not U.S. citizens (particularly those who entered the U.S. illegally) would not respond to the census. Consequently, as many as 3 million U.S. residents (around 1% of the overall U.S. population) would not be counted. This lowers both the congressional representation and the Electoral College votes allocated to the areas where those people live, effectively ensuring that any resident of such an area has less of a say in national government.
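The mechanism by which an undercount costs representation can also be sketched directly. House seats are apportioned from census counts using the Huntington-Hill method; the toy version below uses three hypothetical states and a ten-seat “House”, and shows a roughly 5% undercount of the smallest state handing one of its seats to a neighbor.

```python
import heapq
import math

def apportion(pops, seats):
    """Huntington-Hill apportionment: each state gets one seat, then each
    remaining seat goes to the highest priority value pop / sqrt(n * (n + 1)),
    where n is the state's current seat count."""
    counts = {state: 1 for state in pops}
    heap = [(-p / math.sqrt(2), state) for state, p in pops.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(pops)):
        _, state = heapq.heappop(heap)
        counts[state] += 1
        n = counts[state]
        heapq.heappush(heap, (-pops[state] / math.sqrt(n * (n + 1)), state))
    return counts

# Hypothetical populations, for illustration only.
pops = {"A": 5_000_000, "B": 3_500_000, "C": 1_500_000}
print(apportion(pops, 10))  # {'A': 5, 'B': 3, 'C': 2}

pops["C"] = 1_425_000       # state C undercounted by 5%
print(apportion(pops, 10))  # {'A': 5, 'B': 4, 'C': 1}
```

A few percent of missed responses, concentrated in one place, is enough to move a seat – which is precisely what is at stake in the citizenship-question fight.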

The collection and use of census data is structurally comparable to the experiments detailed in the Belmont Report. As detailed on the Census website itself, census data is used in research settings across the world by people from “all walks of life”. There is a lack of respect for persons here, as the Census Bureau tells citizens that responses to the census are “required by law” but cannot tell them how the information will be used. Second, politicians’ use of census data violates the Belmont Report tenets of both beneficence and justice, as the resulting voting districts and congressional representation generally harm populations that want change.

Let’s turn to Nissenbaum’s contextual approach to privacy. Although this data may be considered public use, the census does not clearly explain to its subjects how lawmakers and researchers will use the data. Through the Nissenbaum lens, the reader can more easily see the disconnect between traditional privacy frameworks and the way politicians use census data for voter disenfranchisement.

Voter suppression and disenfranchisement are widespread among black voters and other populations that wish to change the policies behind the disproportionately high rate of black men killed by police. I hope that this post can shine some light on one possible reason for the lack of policy change in this area. As an initial step, I recommend that individuals vote in officials who have the power to, and will take responsibility for, ending gerrymandering and other malicious uses of census data. I also hope that this blog post can start a longer and more thorough conversation on how Americans can collectively improve our data collection and usage practices to better protect voter rights, and in turn protect our voting populations.

References:

  • Cohn, D., Brown, A., & Keeter, S. (2020, February 20). Most Adults Aware of 2020 Census and Ready to Respond, but Don't Know Key Details. Retrieved from Pew Research Center: https://www.pewsocialtrends.org/2020/02/20/most-adults-aware-of-2020-census-and-ready-to-respond-but-dont-know-key-details/
  • Nissenbaum, H. F. (Fall 2011). A Contextual Approach to Privacy Online. Daedalus, 32 – 48.
  • NPR Code Switch. (2020, May 31). Code Switch: A Decade Of Watching Black People Die. Retrieved from NPR: National Public Radio: https://www.npr.org/2020/05/29/865261916/a-decade-of-watching-black-people-die
  • Ray, R., & Whitlock, M. (2019, September 12). Setting the record straight on Black voter turnout. Retrieved from Brookings: https://www.brookings.edu/blog/up-front/2019/09/12/setting-the-record-straight-on-black-voter-turnout/
  • Ryan, K. J., Brady, J. V., Cooke, R. E., Height, D. I., Jonsen, A. R., King, P., . . . Turtle, R. H. (1979). The Belmont Report. U.S. Department of Health & Human Services.
  • The Census Bureau. (n.d.). Importance of the Data. Retrieved from United States Census 2020: https://2020census.gov/en/census-data.html
  • The Census Bureau. (n.d.). What To Look For in the Mail. Retrieved from United States Census 2020: https://2020census.gov/en/mailings.html
  • Topaz, J. (2018, October 29). How the Census Citizenship Question Could Affect Future Elections. Retrieved from American Civil Liberties Union: https://www.aclu.org/blog/voting-rights/fighting-voter-suppression/how-census-citizenship-question-could-affect-future
  • Wang, S. (2019, December 8). What North Carolina's redistricting cases suggest for 2021 strategy. Retrieved from Princeton Election Consortium: http://election.princeton.edu/2019/12/08/what-north-carolinas-redistricting-cases-suggest-for-2021-strategy/#comments
  • Wines, M. (2019, June 27). What Is Gerrymandering? And Why Did the Supreme Court Rule on It? Retrieved from The New York Times: https://www.nytimes.com/2019/06/27/us/what-is-gerrymandering.html

Signing Away your Personalized Data: Service for Data Models
By JJ Sahabu | May 29, 2020

In today’s society, 1 in 3 people have Facebook; it is so widespread that it has become part of our digital identity. For instance, many websites provide a “Sign in with Facebook” option, almost as if Facebook has become a medium for online identification. Besides Facebook, many other tech companies like Uber, Google, and Amazon have become integrated into our daily lives, leaving consumers at the will of these companies’ terms and conditions, which often include rights over their personalized data. Some may say that if you don’t agree with a company’s terms and conditions, you can abstain from using the site. However, the cost of abstinence may be too great, putting individuals at a disadvantage relative to users. Take the instance of electricity: if people abstain from purchasing electricity from their local provider, not only do they regress to a time prior to the industrial revolution, they also have no alternative – much like refusing service from these technologies. This idea stems from a larger conversation about data ownership and who has the right to the data. Here we look deeper at the ethical considerations of user data collection.

The Belmont Report discusses the importance of informed consent, where the user is educated to the level at which they can consent. In the case of terms and conditions, they must be presented to the user in a way that makes the user understand what they are signing up for. But when tech companies provide long documents of terms set in small font, does the user really read through and understand what is going on? In addition, because accepting the terms is mandatory to gain access to the company’s services, the user is left with very limited choices: abide by the terms or abstain from the service. Some services like Facebook, Instagram, or Twitter may be easier to abstain from, but consider apps that are more essential, such as Uber. Some individuals may be financially reliant on driving for Uber; thus, they have very little choice but to abide by the terms. And in the case of many social media platforms, users can be coerced by a “crowd effect”, tempted to join because everyone they know is on the platform as well. In either case, the odds are leveraged against the user.

This issue exists today because there is very little regulation of these technology firms, owing to the lack of knowledge surrounding their capabilities to harness data. When Facebook first came out in 2004, no one expected it to be able to collect and store so much personal information. Thus, Facebook grew to a point where it is “Too Big to Fail,” a term usually applied to banks whose bankruptcy would collapse the financial system. In Facebook’s case, however, the company has already collected enough users that even if some decide to abstain from the service, Facebook is not concerned about the lost usership, which reduces the leverage users have over it. Though some features benefit from personalizing the user experience, the ramifications of the data collected raise serious privacy concerns.

The article referenced below offers the solution of changing from a service-for-data model to a pay-for-service model, enabling users to take back control of their private data. Although this would address the issue of data ownership, it does not solve the data problem for apps like Uber that don’t fall under the same business model. In addition, it can be seen as tech companies selling back your data, implying they have first rights to your digital identity.

Digital ownership is a huge issue that surrounds the way tech companies run their businesses. On one hand, the data is used to advance technology by creating personalized content and making us more efficient. On the other, we are sacrificing our privacy. There must be a balance between the potential benefits and costs, but without some sort of regulation to strike that balance, tech companies will continue to reap the maximum benefits at the cost of consumer privacy.

References:

Should Big Tech Own Our Personal Data
Digital Fingerprint Image
Social Privacy Image