Competition: A Solution to Poor Data Privacy Practices in Big Tech?
By Anonymous | July 9, 2021

Competition

President Biden recently gave a press conference during which he spoke of a newly signed executive order on anticompetitive practices. In his introductory remarks, he highlighted the effects of a lack of competition on hearing aids. He explained, “Right now, if you need a hearing aid, you can’t just walk into a pharmacy and pick one up over the counter. You have to get it from a doctor or a specialist. Not only does that make getting hearing aids inconvenient, it makes them considerably more expensive, and it makes it harder for new companies to compete, innovate, and sell hearing aids at lower prices. As a result… a pair of hearing aids can cost thousands of dollars.” (Biden, 2021) This example, however, is not unique. It illustrates the fundamental relationship between consumer interests and companies’ products and practices.

It is commonly understood that a lack of competition allows companies to charge higher prices than consumers would reasonably pay for an item under ideal competition. If there is one gas station in a town charging $4.50 per gallon of gas, people will pay $4.50 per gallon. If nine more gas stations open up, and each station’s costs are equivalent to $2.50 per gallon, each gas station will lower its price per gallon to gain customers while still earning a healthy margin, resulting in a gas price that might hover around $2.70 per gallon. This results in fair pricing for residents of the town. Most people have thought about and understand this simple economic reality, but often do not think about a less tangible but equally real application of the same effect: over-consolidation and anti-competitive practices among tech companies have led to the prevalence of poor privacy practices.
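To make the arithmetic concrete, here is a toy sketch of the gas-station example, with an assumed 5-cent undercutting step and an assumed 20-cent minimum margin; it is an illustration, not an economic model.

```python
# A toy sketch of the gas-station example above (assumed undercutting step and margin floor).
cost_per_gallon = 2.50
min_margin = 0.20
floor_price = cost_per_gallon + min_margin   # no station will sell below $2.70

monopoly_price = 4.50                        # one station: no pressure to cut prices

price = monopoly_price
stations = 10
while stations > 1 and price - 0.05 >= floor_price:
    price -= 0.05                            # one station undercuts; the others match

print(f"1 station:   ${monopoly_price:.2f} per gallon")
print(f"{stations} stations: ${price:.2f} per gallon")
```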

“Rather than competing for consumers, they are consuming their competitors.” – President Joseph Biden (Biden, 2021)

All other variables held equal, higher competition between companies yields a larger variety of products, services, and practices. As President Biden proclaimed, “The heart of American capitalism is a simple idea: open and fair competition — that means that if your companies want to win your business, they have to go out and they have to up their game; better prices and services; new ideas and products.” (Biden, 2021) If this is the case, as the President implies in his speech, the logical inverse is also true: lower competition between companies leads to a smaller variety of products, services, and practices. Privacy and data protection practices are one of the many casualties of low competition. Note that while not every lack of competition is due to anti-competitive practices, a lack of meaningful competition exists among tech companies nonetheless, and even the examples that are not attributable to anti-competitive practices are useful for seeing the effect of a lack of competition on privacy. If we consider the case where there is only one company providing an important or essential service, privacy practices are nearly irrelevant: if a user wants to use that service, the user must accept the privacy policy no matter its contents. Due to a lack of user-privacy-focused legislation, however, current privacy policy writing and presentation practices lead a large majority of the population to almost never read a privacy policy (Auxier et al., 2019).

Despite this lack of readership, which could be addressed through education about privacy policies and reforms to their complexity and presentation, an increasing number of people do care significantly about data privacy practices, as can be seen in the growing number of articles focused on privacy.

One example of the aforementioned lack of competition leading to poor privacy policies is Snapchat, owned by Snap, Inc. Almost everyone in the younger generations uses Snapchat, and it is not interoperable with other platforms; it can even be considered a social necessity in many groups. A user is therefore pressured into using the platform despite the severe privacy violations allowed by its terms of service and privacy policy, including Snap, Inc.’s right to save and use any user-produced content, including self-destructing messages (Snap, Inc., 2019). Imagine a hypothetical society in which the network effect is a nonissue and privacy policies are easily accessible to everyone. There are three companies that offer services similar to Snapchat. Company A takes users’ privately sent pictures and uses them, stating as much in its privacy policy. Company B generally does not take or use users’ privately sent pictures, but states in its privacy policy that it has the right to if it so chooses. Company C does not take or sell users’ privately sent pictures, and specifically states in its privacy policy that it does not have the right to do so. Company B here represents how Snapchat actually operates. Which company would you choose? Through competition, which company do you think would come out on top?

Snapchat logo

While the lack of competition in the Snapchat example is due primarily to the network effect rather than to any documented anticompetitive practices by Snap, Inc., promoting competition in tech more generally can change prevailing privacy and data security practices, leading to a systemic shift toward fairer and more protective data practices.

Regarding the University of California’s Newest Security Event
By Ash Tan | July 9, 2021

Figure 1. The University of California describes its security event in an email to a valued member of the UC community.

If you’re reading this, there’s a good chance that your personal data has been leaked. Important data too – your address, financial information, even your social security number could very well be floating around the Internet at this very moment. Of course, this prediction is predicated on the assumption that you, the reader, have some connection to the University of California system. Maybe you’re a student, or a staff or faculty member, or even a retired employee; it doesn’t really matter. This spring, the UC system announced that its data, along with the data of a hundred other institutions and schools, had been compromised in a euphemistically described “security event.” In short, the UC system hired Accellion, an external firm, to handle its file transfers, and Accellion was the victim of a massive cybersecurity attack in December of 2020 (Fn. 1). This resulted in the information of essentially every person involved in the UC system being leaked to the internet, and while the UC system has provided access to credit monitoring and identity theft protection for one year (Fn. 2), it should be noted that Experian, its chosen credit monitoring company, was itself responsible for a massive data breach of roughly 15 million people’s financial information in 2015 (Fn. 3).

Figure 2. Affected individual receiving recompense for damages sustained from the 2015 Experian data breach.

Perhaps the framework that applies most intuitively to this system is Solove’s Taxonomy of Privacy (Fn. 4), which compels us to seek a comparable physical analog in order to better understand this situation. One might consider the relation between paperwork and a filing cabinet: we give our paperwork to the UC, which then stores it in a filing cabinet which is maintained by Accellion. We entrust our data to the UC system with the expectation that they safeguard our information, while the UC system entrusts our data to Accellion with the same expectation. When something goes wrong, this results in a chain of broken expectations that can make parsing accountability a difficult issue. Who, then, is to blame when the file cabinet is broken into: the owner of the cabinet, or the one who manages the paperwork within?

Figure 3. Cyber crime.

One take is that the laws that enabled a data breach of this scale are to blame. In an op-ed piece in The Hill, two contributors with backgrounds in education research, public policy, and political science point at a certain section of California Proposition 24 that exempts schools from privacy protection requirements (Fn. 5). These exemptions, including denying students the right to be forgotten, open the door to data mismanagement and misuse. The authors, Evers and Hofer, claim that stronger regulatory protections could have prevented this data breach along with numerous other ransomware attacks on educational institutions across the country, and that, in line with the Nissenbaum framework of contextual privacy (Fn. 6), “opt in/out” options could have limited the amount of information leaked in this event while respecting individual agency. The “right to be forgotten” could have limited the amount of leaked information regarding graduated students, retired employees, and all individuals who have exited the UC system. In addition, California currently defines no right that lets an individual privately pursue recompense for damages resulting from negligent data management; the authors hold that defining such a right would incentivize entities such as the UC system to secure data more responsibly.

Notably, Evers and Hofer make no mention of Nissenbaum or Solove or even the esteemed Belmont Report (Fn. 7) when prescribing their recommendations. These proposed policy changes are not necessarily grounded in a theoretical, ethical framework of abstract rights and conceptual wrongs; they are intended to minimize real-life harms from situations that have already happened and could happen again. In the context of this very close-to-home example, we can see how these frameworks are more than academic in nature. But then again, the difference between framework and policy is the same as that between a skeleton and a living, breathing creature. The question that remains to be answered is whether the UC system will take this as an opportunity to strengthen student data protections – between GDPR and California’s more recent privacy laws, there is plenty of groundwork to draw upon – or whether they will consider it nothing more than an unfortunate security event, the product of chance rather than the result of an increasingly dangerous digital world (Fn. 5).

References:
1. https://www.businesswire.com/news/home/20210510005214/en/UC-Notice-of-Data-Breach
2. https://ucnet.universityofcalifornia.edu/data-security/updates-faq/index.html
3. https://www.theguardian.com/business/2015/oct/01/experian-hack-t-mobile-credit-checks-personal-information
4. https://www.law.upenn.edu/journals/lawreview/articles/volume154/issue3/Solove154U.Pa.L.Rev.477(2006).pdf
5. https://thehill.com/opinion/technology/550959-massive-school-data-breach-shows-we-need-better-privacy-policies?rl=1
6. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2567042
7. https://www.hhs.gov/ohrp/sites/default/files/the-belmont-report-508c_FINAL.pdf

Image sources:
1. Edited from a personal email.
2. https://topclassactions.com/lawsuit-settlements/lawsuit-news/hp-printer-experian-data-breach-settlement-checks-mailed/
3. https://www.linkedin.com/pulse/experian-data-breach-andrew-seldon?trk=public_profile_article_view

Brain-Machine Interfaces and Neuralink: privacy and ethical concerns
By Anonymous | July 9, 2021

Brain-Machine Interfaces

As microchip development and advances in neuroscience continue, the possibility of seamless brain-machine interfaces, where a device decodes inputs from the user’s brain to perform functions, becomes more of a reality. Various forms of these technologies already exist; however, technological advances have now made implantable and portable devices possible. Imagine a future where humans don’t need to talk to each other, but rather can transmit their thoughts directly to another person. This idea is the eventual goal of Elon Musk, the founder of Neuralink, currently one of the main companies involved in advancing this type of technology. Analysis of Neuralink’s technology and its overall mission statement provides an interesting insight into the future of this type of human-computer interface and the potential privacy and ethical concerns that come with it.

Diagram of brain-computer interface

Brain-machine interfaces have actually existed for over 50 years; research on these interfaces began in the 1970s at UCLA. However, with recent developments in wireless technologies, implantable devices, computational power, and electrode design, a world where an implanted device can read the motor signals of a brain is now possible. In fact, Neuralink has already achieved this in a chimpanzee: the company successfully enabled the chimpanzee to control a game of Pong with its mind. Neuralink’s current goal is to advance the prosthetics space by allowing prosthetic devices to directly read input from the user’s motor cortex. However, the applications of this technology are vast, and Musk has mentioned other ideas, such as downloading languages into the brain, essentially allowing the device to write onto the brain. For now, this remains out of the realm of possibility, as our current understanding of the brain is insufficiently advanced. Yet we are making advances in this direction every year. A paper was just published in Nature demonstrating high-performance decoding of motor cortex signals into handwriting using a recurrent neural network.
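As a rough illustration of what “decoding motor cortex signals with a recurrent neural network” means in code, here is a minimal sketch in PyTorch. The channel count, layer sizes, and the random input standing in for electrode recordings are all assumptions made for illustration; this is not the architecture from the Nature paper.

```python
# A minimal sketch (synthetic signals, not real neural data) of the idea behind the
# Nature result: a recurrent network maps a time series of motor-cortex activity to
# the character the subject intended to write.
import torch
import torch.nn as nn

N_CHANNELS, N_TIMESTEPS, N_CHARACTERS = 64, 50, 26   # assumed sizes for illustration

class HandwritingDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(input_size=N_CHANNELS, hidden_size=128, batch_first=True)
        self.readout = nn.Linear(128, N_CHARACTERS)

    def forward(self, x):                 # x: (batch, time, channels)
        _, hidden = self.rnn(x)           # final hidden state summarizes the sequence
        return self.readout(hidden[-1])   # logits over the 26 possible characters

decoder = HandwritingDecoder()
fake_recording = torch.randn(1, N_TIMESTEPS, N_CHANNELS)   # stand-in for electrode data
predicted_char = decoder(fake_recording).argmax(dim=-1)
print("decoded character index:", predicted_char.item())
```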

Picture of chimpanzee controlling a game, from Neuralink

Privacy

As this technology develops further, several privacy and ethical concerns come into question. To begin, using Solove’s Taxonomy as a privacy framework reveals many areas of potential harm. In the realm of information collection, there is much risk. Brain-computer interfaces, depending on where they are implanted, could have access to people’s most private thoughts and emotions, and this information would need to be transmitted to another device for processing. The collection of this information by companies such as advertisers would represent a major breach of privacy. Additionally, there is risk to the user from information processing. These devices must work concurrently with other devices, often wirelessly. Given the widespread importance of cloud computing in much of today’s technology, offloading information from these devices to the cloud would be likely. Having the data stored in a database puts the user at risk of secondary use if proper privacy policies are not implemented. The trove of information contained in data collected from the brain is vast, and these datasets could be combined with existing databases, such as browsing history on Google, to provide third parties with unimaginable context on individuals. There is also risk in information dissemination, more specifically exposure. The information collected and processed by these devices would need to be stored digitally, and keeping such private information, even if anonymized, carries a huge potential for harm, as the contents of the information may in themselves be re-identifiable to a specific individual. Lastly, there is risk of invasions such as decisional interference. Brain-machine interfaces would not only be able to read information from the brain but also write information to it. This would allow the device to make potential emotional changes in its users, which would be a major example of decisional interference. Similar capabilities are already present in devices that treat major depression by implanting electrodes for deep brain stimulation.

Ethics

One of the most common sets of ethical principles for guiding behavior in research and science is the Belmont Principles, which include respect for persons, beneficence, and justice. Future brain-machine interfaces present challenges to all three guiding principles. To uphold respect for persons, people’s autonomy must be respected. However, with these devices, the emotions of users could be physically altered by the device, undermining their autonomy. Beneficence involves doing no harm to participants, yet, as discussed with the potential privacy harms, there is likely to be harm to the first adopters of the technology. With regard to justice, these devices may also fall short. The first iterations of the devices are extremely expensive and unattainable for most people, while the potential cognitive benefits of such devices would be vast. This could further widen the already large wealth inequality gap. The benefits of such a device would not be spread fairly across all participants and would mostly accrue to those who could afford the devices.

According to other respected neuroscientists invested in brain-machine interfaces, devices with the abilities Elon Musk describes are still quite far away. We currently lack fundamental knowledge about the details of the brain and its inner workings. Yet our existing guidelines for privacy and ethics fail to encompass the potential of such advances in brain-machine interfaces, which is why further thought is needed to provide policies and frameworks to properly guide the development of this technology.

What Will Our Data Say About Us In 200 years?
By Jackson Argo | June 18, 2021

Just two weeks ago, Russian scientists published a paper explaining how they extracted a 24,000-year-old living bdelloid rotifer, a microorganism with exceptional survival skills, from Siberian permafrost. This creature is not only a biological wonder, but comes with a trove of genetic curiosities soon to be studied by biotechnologists. Scientists have found many other creatures preserved in ice, including Otzi the Iceman, a man naturally preserved in ice for over 5,300 years. Unlike the rotifer, Otzi is a human, and even though neither he nor any of his family can give consent for the research conducted on his remains, he has been the subject of numerous studies. This research does not pose a strong moral dilemma, for the same reason it is impossible to get consent: he has been dead for more than five millennia, and it’s hard to imagine what undue harm could come to Otzi or his family. Frameworks such as the Belmont Report emphasize the importance of consent from the living, but make no mention of the deceased. However, the dead are not the only ones whose data is at the mercy of researchers. Even with legal and ethical frameworks in place, there are many cases where the personal data of living people is used in studies they might not have consented to.

*A living bdelloid rotifer from 24,000-year-old Arctic permafrost.*

It’s not hard to imagine that several hundred years from now, historians will be analyzing the wealth of data collected by today’s technology, regardless of the privacy policies we may or may not have read. Otzi’s remains provide only a snapshot of his last moments, and this limited information has left scientists with many unanswered questions about his life. Similarly, today’s data does not capture a complete picture of our world, and some of it may even be misleading. Historians are no strangers to limited or misleading data, and are constantly filling in the gaps and revising their understanding as new information surfaces. But what kind of biases will historians face when looking at these massive datasets of personal and private information?

Missing in Action

To answer this question, we first look for the parts of our world that are not captured, or are underrepresented, in these datasets. Kate Crawford gives us two examples of this in the article Hidden Biases in Big Data. A study of Twitter and Foursquare data revealed interesting features about New Yorkers’ activity during Hurricane Sandy. However, this data also revealed its inherent bias: the majority of the data was produced in Manhattan, while little data came from the harder-hit areas. In a similar way, a smartphone app designed to detect potholes will be less effective in lower-income areas where smartphones are not as prevalent.

For some, absence from these datasets is directly built into legal frameworks. GDPR, as one example, gives citizens in the EU the right to be forgotten. There are some constraints, but this typically allows an individual to request that a data controller, a company like Google that collects and stores data, erase that individual’s personal data from the company’s databases. Provided the data controller complies, this individual will no longer be represented in that dataset. We should not expect that the people who exercise this right are evenly distributed across demographics; tech-savvy and security-conscious individuals may be more likely to fall into this category than others.
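Mechanically, honoring an erasure request boils down to removing every record keyed to the requester. Below is a minimal sketch, assuming a hypothetical schema with profiles and activity tables in SQLite; a real data controller would also have to handle backups, logs, and copies already shared downstream.

```python
# A minimal sketch (hypothetical schema) of handling a right-to-erasure request by
# deleting one person's rows from every table that stores their personal data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (user_id INTEGER, email TEXT)")
conn.execute("CREATE TABLE activity (user_id INTEGER, page TEXT, visited_at TEXT)")
conn.execute("INSERT INTO profiles VALUES (42, 'person@example.com')")
conn.execute("INSERT INTO activity VALUES (42, '/search', '2021-06-18')")

def erase_user(connection, user_id):
    """Remove the user's personal data from every table that references them."""
    for table in ("profiles", "activity"):
        connection.execute(f"DELETE FROM {table} WHERE user_id = ?", (user_id,))
    connection.commit()

erase_user(conn, 42)
print(conn.execute("SELECT COUNT(*) FROM activity").fetchone())  # (0,)
```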

The US has COPPA, the children’s privacy act, which puts heavy restrictions on the data that companies can collect from children. Many companies, such as the discussion website Reddit, choose to exclude children under 13 entirely in their user agreements or terms of service. Scrolling through the posts in r/Spongebob, a subreddit community for the TV show Spongebob Squarepants, might suggest that no one under 13 is talking about Spongebob online.

Context Clues

For those of us who are swept up into the nebulous big-data sphere, how accurately does that data actually represent us? Data collection is getting more and more sophisticated as the years go on. To name just a few sources of this data: virtual reality devices capture your motion data, voice-controlled devices capture your speech patterns and intonation, and cameras capture biometric data like faceprints and fingerprints. There are now even devices that interface directly with the neurons in primate brains to detect intended actions and movements.

Unfortunately, this kind of data collection is not free from contextual biases. When companies like Google and Facebook collect data, they are only collecting data particular to their needs, which is often to inform advertising or product improvements. Data systems are not able to capture all the information they detect; that would be far too ambitious, even for our biggest data centers. A considerable amount of development time is spent deciding what data is important and worth capturing, and the result is never a true picture of history. Systems that capture data are designed to emphasize the important features, and everything else is either greatly simplified or dropped. Certain advertisers may only be interested in whether an individual is heterosexual or not, so nuances of gender and sexuality are heavily simplified in their data.

Building an indistinguishable robot replica of a person is still science fiction, for now, but several AI-based companies are already aiming to replicate people and their emotions through chatbots. These kinds of systems learn from our text and chat history from apps like Facebook and Twitter to create a personalized chatbot version of ourselves. Perhaps there will even be a world where historians ask chatbots questions about our history. But therein lies another problem historians are all too familiar with: the meaning of words and phrases we use today can change dramatically in a short amount of time. This is, of course, assuming that we can even agree on the definitions of words today.

In the article Excavating AI, Kate Crawford and Trevor Paglen discuss the political context surrounding data used in machine learning. Many machine learning models are trained using a set of data and corresponding labels that indicate what the data represents. For example, a training dataset might contain thousands of pictures of different birds along with the species of the bird in each picture. This dataset could train a machine learning model to identify the species of a bird in a new image. The process begins to break down when the labels are more subjectively defined. A model trained to differentiate planets from other celestial bodies may incorrectly determine that Pluto is a planet if the training data was compiled before 2006. The rapidly evolving nature of culture and politics makes this kind of model training heavily reliant on the context of the dataset’s creation.
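The Pluto example can be made concrete with a tiny sketch. The features and hand-made labels below are assumptions for illustration only: the same body gets a different label depending on when the training set was compiled, so models trained on each set disagree.

```python
# A minimal sketch (hypothetical features, hand-made labels) of how a label's meaning
# depends on when the training set was compiled.
from sklearn.tree import DecisionTreeClassifier

# Features: [orbits_the_sun, roughly_spherical, has_cleared_its_orbit]
bodies = {
    "Earth":   [1, 1, 1],
    "Jupiter": [1, 1, 1],
    "Pluto":   [1, 1, 0],
    "Halley":  [1, 0, 0],   # a comet
}
X = list(bodies.values())

# Labels reflect the compilers' context: before 2006, Pluto was labeled a planet.
labels_pre_2006  = [1, 1, 1, 0]
labels_post_2006 = [1, 1, 0, 0]

old_model = DecisionTreeClassifier().fit(X, labels_pre_2006)
new_model = DecisionTreeClassifier().fit(X, labels_post_2006)

pluto = [bodies["Pluto"]]
print("pre-2006 model says planet? ", bool(old_model.predict(pluto)[0]))
print("post-2006 model says planet?", bool(new_model.predict(pluto)[0]))
```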

*A Venezuelan Troupial in Aruba*

Wrapping Up

200 years from now, historians will undoubtedly have access to massive amounts of data to study, but they will face the same historical biases and misinformation that plague historians today. In the meantime, we can focus on protecting our own online privacy and addressing biases and misinformation in our data to make future historians’ job just a little easier.

Thank you for reading!

References

  • https://www.cell.com/current-biology/fulltext/S0960-9822(21)00624-2
  • https://www.nationalgeographic.com/history/article/131016-otzi-ice-man-mummy-five-facts
  • https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/read-the-belmont-report/index.html
  • https://hbr.org/2013/04/the-hidden-biases-in-big-data
  • https://gdpr.eu/right-to-be-forgotten/
  • https://www.ftc.gov/tips-advice/business-center/privacy-and-security/children%27s-privacy
  • https://www.wired.com/story/replika-open-source/
  • https://excavating.ai/

AI Bias: Where Does It Come From and What Can We Do About It?
By Scott Gatzemeier | June 18, 2021

Artificial Intelligence (AI) bias is not a new topic, but it is certainly a hot and heavily debated one right now. AI can be an incredibly powerful tool that provides tremendous business value, from automating or accelerating routine tasks to discovering insights not otherwise possible. We are in the big data era, and most companies are working to take advantage of these new technologies. However, there are several examples of poor AI implementations that allow biases to infiltrate the system and undermine the purpose of using AI in the first place. A simple search on DuckDuckGo for ‘professional haircut’ vs ‘unprofessional haircut’ reveals a very clear gender and racial bias.

Professional Haircut
Unprofessional Haircut

In this case, a picture is truly worth 1000 words. This gender and racial bias is not maliciously hard-coded into the algorithm by its developers. Rather, it is a reflection of the word-to-picture associations that the algorithm picked up from the authors of web content. So the AI is simply reflecting historical societal biases back at us in the images returned. If these biases are left unchecked by AI developers, they are perpetuated. These perpetuated AI biases have proven especially harmful in several cases, such as Amazon’s Sexist Hiring Algorithm, which inadvertently favored male candidates, and the Racist Criminological Software COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), where black defendants were 45% more likely to be assigned higher risk scores than white defendants.

Where does AI Bias Come From?

There are several potential sources of AI bias. First, AI will inherit the biases present in the training data. (Training data is a collection of labeled information that is used to build a machine learning (ML) model. Through training data, an AI model learns to perform its task at a high level of accuracy.) Garbage in, garbage out. AI reflects the views of the data it is built on and can only be as objective as that data. Any historical data used is subject to the societal biases of the time when the data was generated. When used to build predictive AI, for example, this can perpetuate stereotypes that influence decisions with real consequences and harms.

Next, most ML algorithms are built on statistical math and make decisions based on the distribution of the data and on key features that can separate data points into categories or group similar items together. Outliers that don’t fit the primary model tend to be weighted lower, especially when the focus is solely on overall model accuracy. When working with people-focused data, the outlier data points often belong to an already marginalized group. This is how biased AI can emerge even from good, clean, unbiased data. AI is only able to learn about different groups (race, gender, etc.) if each group appears frequently enough in the data set. The training data set must contain an adequate sample size for each group; otherwise this statistical bias can further perpetuate marginalization.
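A small synthetic sketch shows how this plays out. The groups, the single feature, and the opposite relationship in the minority group are all made up for illustration: the model reports high overall accuracy while being consistently wrong for the group it rarely saw during training.

```python
# A minimal sketch (synthetic data, made-up groups) of how a model can look accurate
# overall while failing the underrepresented group.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_major, n_minor = 5000, 100    # the minority group is ~2% of the training data

x_major = rng.normal(0, 1, n_major)
y_major = (x_major > 0).astype(int)          # majority pattern
x_minor = rng.normal(0, 1, n_minor)
y_minor = (x_minor < 0).astype(int)          # opposite pattern in the minority group

X = np.concatenate([x_major, x_minor]).reshape(-1, 1)
y = np.concatenate([y_major, y_minor])

model = LogisticRegression().fit(X, y)

print("overall accuracy: ", model.score(X, y))                            # high
print("minority accuracy:", model.score(x_minor.reshape(-1, 1), y_minor)) # near zero
```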

Finally, most AI algorithms are built on correlation with the training data. As we know, correlation doesn’t always equal causation, and the AI algorithm doesn’t understand what any of the inputs mean in context. For example, you get a few candidates from a particular school but you don’t hire them because you have a position freeze due to business conditions. The fact that they weren’t hired gets added to the training data. AI would start to correlate that school with bad candidates and potentially stop recommending candidates from that school, even if they are great, because it doesn’t know the causation of why they weren’t selected.
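The hiring-freeze example can be seen in miniature with a few hypothetical rows: the raw hire rate that a correlation-driven model would learn from makes School X look bad even though its candidates scored highest.

```python
# A minimal sketch (hypothetical data) of the hiring-freeze confound: School X
# candidates were rejected for reasons unrelated to quality, and the resulting
# outcome data makes the school itself look like the problem.
import pandas as pd

past = pd.DataFrame({
    "school":      ["X", "X", "X", "X", "Y", "Y", "Z", "Z"],
    "skill_score": [9, 8, 9, 7, 6, 8, 5, 7],
    # School X candidates applied during a position freeze, so none were hired.
    "hired":       [0, 0, 0, 0, 1, 1, 0, 1],
})

print(past.groupby("school")["hired"].mean())        # X: 0.0, Y: 1.0, Z: 0.5
print(past.groupby("school")["skill_score"].mean())  # X has the highest average skill
```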

What can we do about AI Bias?

Before applying AI to a problem, we need to ask what level of AI is appropriate. What should the role of AI be, depending on the sensitivity of the decision and its impact on people’s lives? Should it be an independent decision maker, a recommender system, or not used at all? Some companies apply AI even when it is not at all suited to the task in question and other means would be more appropriate. So there is a moral decision that needs to be made prior to implementing AI. Obed Louissaint, the Senior Vice President of Transformation and Culture, talks about “Augmented Intelligence”: leveraging AI algorithms as “colleagues” that help company leaders make better decisions and reason more soundly, rather than replacing human decision making. We also need to focus on the technical aspects of AI development and work to build models that are more robust against bias and against bias propagation. Developers need to focus on explainable, auditable, and transparent algorithms. When humans make major decisions, an explanation of the reasoning is expected and there is accountability; algorithms should be subject to the same expectations, regardless of IP protection. Visualization tools that help explain how an AI works and the ‘why’ behind its conclusions continue to be a major area of focus and opportunity.

In addition to AI transparency, there are emerging AI technologies such as Generative Adversarial Networks (GANs) that can be used to create synthetic, unbiased training data based on parameters defined by the developer. Causal AI is another promising area that is building momentum and could provide the algorithm with an understanding of cause and effect. This could give AI some ‘common sense’ and prevent several of these issues.

AI is being adopted rapidly and the world is just beginning to capitalize on its potential. As Data Scientists, it is increasingly important to understand the sources of AI bias and continue to develop fair AI that prevents the social and discriminatory issues that arise from that bias.

References

  • https://www.inc.com/guadalupe-gonzalez/amazon-artificial-intelligence-ai-hiring-tool-hr.html
  • https://hbr.org/2020/10/ai-fairness-isnt-just-an-ethical-issue
  • https://www.logically.ai/articles/5-examples-of-biased-ai
  • https://towardsdatascience.com/why-your-ai-might-be-racist-and-what-to-do-about-it-c081288f600a
  • https://medium.com/ai-for-people/the-ethics-of-algorithmic-fairness-aa394e12dc43
  • https://towardsdatascience.com/survey-d4f168791e57
  • https://techcrunch.com/2020/06/24/biased-ai-perpetuates-racial-injustice/
  • https://towardsdatascience.com/reducing-ai-bias-with-synthetic-data-7bddc39f290d
  • https://towardsdatascience.com/ai-is-flawed-heres-why-3a7e90c48878

Are we truly anonymous in public health databases?
By Anonymous | June 18, 2021

Source: Iron Mountain

For many users, privacy in healthcare data seems like a baseline expectation when speaking of personal data and privacy issues, even for those who hold a more relaxed attitude toward privacy policy issues. We may suspect that social media, tech, and financial companies are selling user data for profit, but people tend to trust hospitals, healthcare institutions, and pharmaceutical companies to at least try to keep their users’ data safe. Is that really true? How safe is our healthcare data? Can we really be anonymous in public databases?

As users and as patients, we have to share a lot of personal and sensitive information when we see a doctor or healthcare practitioner so that they can provide precise and useful healthcare services to us. Doctors might know about our blood type, potential genetic risks or diseases in our family, pregnancy experiences, and so on. Beyond that, the health institutions behind those doctors also keep records of our insurance information, home address, ZIP code, and payment information. Healthcare institutions may hold more comprehensive sensitive and private information about you than any of the other organizations that also try to retain as much information about you as possible.

What kind of information do healthcare providers collect and share with third parties? In fact, most healthcare providers should follow HIPAA’s privacy guidance. For example, I noticed that Sutter Health says it follows, or refers to, HIPAA privacy rules in its user agreement.

Sutter’s privacy policy, for example, describes how it uses your healthcare data. It states that “We can use your health information and share it with other professionals who are treating you. We may use your health information to provide you with medical care in our facilities or in your home. We may also share your health information with others who provide care to you such as hospitals, nursing homes, doctors, nurses or others involved in your care.” Those uses seem reasonable to me.

In addition to the expected uses above, Sutter also notes that it is allowed to share your information in “ways to contribute to public good, such as public health and research”. Those ways include “preventing disease, helping with product recalls, reporting adverse reactions on medications, reporting suspected abuse, …”. One of these uses, public health and research, raises a concern: can we really be anonymous in the public database?

In fact, the answer is no. Most healthcare records can be de-anonymized through information matching. “Only a very small amount of data is needed to uniquely identify an individual. Sixty-three percent of the population can be uniquely identified by the combination of their gender, date of birth, and ZIP code alone,” according to a post in the Georgetown Law Technology Review published in April 2017. Thus, it is entirely possible for people with good intentions, such as research teams and data scientists who genuinely aim to serve the public good, or people with bad intentions, such as hackers, to legally or illegally obtain healthcare information from multiple sources and aggregate it. And in fact they can de-anonymize the data, especially with the help of current computing resources, algorithms, and machine learning.
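To see how little it takes, here is a minimal sketch with synthetic records and hypothetical column names. It measures what fraction of rows are unique on just gender, date of birth, and ZIP code, the same combination cited above; the exact percentage depends entirely on the made-up data.

```python
# A minimal sketch (synthetic records) of measuring how identifying the combination
# of gender, date of birth, and ZIP code is within an "anonymized" dataset.
import pandas as pd

records = pd.DataFrame({
    "gender":     ["F", "M", "F", "M", "F", "M"],
    "birth_date": ["1990-03-12", "1985-07-01", "1990-03-12",
                   "1972-11-23", "2001-05-30", "1985-07-01"],
    "zip":        ["94705", "94110", "94110", "94705", "94705", "94110"],
    "diagnosis":  ["asthma", "flu", "diabetes", "asthma", "flu", "hypertension"],
})

quasi_identifiers = ["gender", "birth_date", "zip"]
group_sizes = records.groupby(quasi_identifiers)["diagnosis"].transform("size")
unique_fraction = (group_sizes == 1).mean()

print(f"{unique_fraction:.0%} of rows are the only record with their gender/DOB/ZIP combination")
```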

So do companies that hold your healthcare information have to follow some kind of privacy framework? Are there laws out there to regulate companies that have your sensitive healthcare information and to protect the vulnerable public like you and me? One set of rules that most healthcare providers must follow is the Health Insurance Portability and Accountability Act (HIPAA), enacted in 1996. This act states who has the right to access what kind of health information, what information is protected, and how information must be protected. It also covers who must follow these laws, including health plans, most healthcare providers, healthcare clearinghouses, and health insurance companies. Entities that may have your health information but do not have to follow these laws include life insurers, most schools and school districts, state agencies like child protective services, law enforcement agencies, and municipal offices.

Normal people like you and me are vulnerable individuals. We don’t have the knowledge and patience to understand every term stated in long, jargon-filled user agreements and privacy policies. But what we can and should do is advocate for strong protection of our personal information, especially sensitive healthcare data. And governments and policymakers should also establish and enforce more comprehensive privacy policies to protect everyone, limiting the scope of healthcare data sharing and thus preventing de-anonymization from happening.

References:

1. Stanford Medicine. Terms and Conditions of Use. Stanford Medicine. https://med.stanford.edu/covid19/covid-counter/terms-of-use.html.
2. Stanford Medicine. Datasets. Stanford Medicine. https://med.stanford.edu/sdsr/research.html.
3. Stanford Medicine. Medical Record. Stanford Medicine. https://stanfordhealthcare.org/for-patients-visitors/medical-records.html.
4. Sutter Health. Terms and Conditions. Sutter Health. https://mho.sutterhealth.org/myhealthonline/terms-and-conditions.html.
5. Sutter Health. HIPAA and Privacy Practices. Sutter Health. https://www.sutterhealth.org/privacy/hipaa-privacy.
6. Wikipedia (14 May 2021). Health Insurance Portability and Accountability Act. Wikipedia. https://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act.
7. Your Rights Under HIPAA. HHS. https://www.hhs.gov/hipaa/for-individuals/guidance-materials-for-consumers/index.html.
8. Adam Tanner (1 February, 2016). How Data Brokers Make Money Off Your Medical Records. Scientific American. https://www.scientificamerican.com/article/how-data-brokers-make-money-off-your-medical-records/.

What You Should Know Before Joining an Employee Wellness Program
By Ashley Moss | June 18, 2021

Weigh-in time at NBC’s The Office

In 2019, more than 80% of large employers offered a workplace wellness program, as healthcare trends in America turn toward disease prevention. Some wellness programs focus on behavioral changes like smoking cessation, weight loss, or stress management. Participants might complete a health survey or undergo biometric tests in a lab. Many employers offer big financial incentives for participating and/or reaching target biometric values. While on the surface this may seem like a win-win for employers and employees, this article takes a closer look at the potential downsides: privacy compromise, unfairness, and questionable effectiveness.

Laws and regulations that normally protect your health data may not apply to your workplace wellness program. The federal government’s HIPAA rules cover doctors’ offices and insurance companies that use healthcare data. These laws limit information sharing to protect your privacy and require security measures to ensure the safety of electronic health information.

If a workplace wellness program is offered through an insurance plan, HIPAA applies. However, if it is offered directly through the employer, HIPAA does not apply. This means the program is not legally required to follow HIPAA’s anti-hacking standards. It also means employee health data can be sold or shared without legal repercussions. Experts warn that an employer with direct access to health data could use it to discriminate against, or even lay off, those with a high cost of care.

Although these programs claim to be voluntary, employers provide a financial incentive averaging hundreds of dollars. It’s unclear how much pressure this places on each employee, especially because the dollar penalty a person can afford really depends on their financial situation. There is some concern that employers have a hidden agenda: Wellness programs shift the burden of medical costs away from the employer and toward unhealthy or non-participating employees.

Wellness programs may be unfair to certain groups of people. Research shows that programs penalize lower wage workers more often, contributing to a “poor get poorer” cycle of poverty. Wellness programs may also overlook entire categories of people who have a good reason not to join, such as people with disabilities. In one case, a woman with a double mastectomy faced possible fines for missing mammograms until she provided multiple explanations to her employer’s wellness program.

Experts question the effectiveness of workplace wellness programs, since evidence shows little impact on key outcomes. Two randomized studies followed participants for 12+ months and found no improvement in medical spending or health outcomes. Wellness programs do not require physician oversight, so the interventions may not be supported by scientific evidence. For example, Body Mass Index (BMI) has fallen out of favor in the medical community but persists in wellness programs.

Wellness programs may focus on reducing sedentary time or tracking steps.

 Before joining an employee wellness program, do your homework to understand more about the potential downsides. Remember that certain activities are safer than others: are you disclosing lab results or simply attending a lecture on healthy eating? If you are asked to share information, get answers first: What data will be collected? Which companies can see it? Can it be used to make employment decisions? Lastly, understand that these programs may not be effective and cannot replace the advice of a trained physician.

References

  • https://www.consumerreports.org/health-privacy/are-workplace-wellness-programs-a-privacy-problem/
  • https://www.researchgate.net/publication/323538785_Health_and_Big_Data_An_Ethical_Framework_for_Health_Information_Collection_by_Corporate_Wellness_Programs
  • https://www.kff.org/private-insurance/issue-brief/trends-in-workplace-wellness-programs-and-evolving-federal-standards/
  • https://hbr.org/2017/01/workplace-wellness-programs-could-be-putting-your-health-data-at-risk
  • https://www.hhs.gov/hipaa/for-professionals/privacy/workplace-wellness/index.html
  • https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html
  • https://www.hhs.gov/hipaa/for-professionals/security/index.html
  • https://www.hhs.gov/hipaa/for-professionals/breach-notification/breach-reporting/index.html
  • https://i.pinimg.com/originals/24/43/ba/2443ba5aaac0fface135795ca08d3c76.jpg

The Price of a Free Salad
By Anonymous | June 18, 2021

How do you celebrate your birthday? If you’re anything like me, you prioritize friends, family, and cake, but you let some of your favorite corporations in on the mix, too. Each year, I check my email on my birthday to find myself fêted by brands like American Eagle, Nintendo, and even KLM. Their emails come in like clockwork, bearing coupon codes and product samples for birthday gifts. 

Birthday email promotions

Most of these emails linger unopened in my promotions tab, but my best friend makes an annual odyssey of redeeming her offers. One year I spent the day with her as we did a birthday coupon crawl, stopping by Target for a shopping spree and a free Starbucks coffee, Sephora for some fun makeup samples, and ending with complimentary nachos at a trendy Mexican restaurant downtown.

I used to want to be savvy like her and make the most of these deals, but lately, I’ve been feeling better about missing out. The work of Dr. Latanya Sweeney, a pioneering researcher in the field of data privacy, has taught me what powerful information my birthday is.

In her paper “Simple Demographics Often Identify People Uniquely”, Sweeney summarizes experiments that showed that combinations of seemingly benign personal details, such as birthday, gender, and zip code, often provide enough information to identify a single person. She and her team even developed this tool to demonstrate this fact. 

Sample results from the uniqueness tool

I keep trying this website out with friends and family, and I have yet to find anyone who isn’t singled out by the combination of their birthday, gender, and zip code. 

But what does this mean for our beloved birthday deals? Let’s think a bit about how this would work if you were shopping at, say, Target. You don’t have to shop at Target very often to see promotions of its 5% birthday shopping deal. 

Target circle birthday discount

Attentive readers may be wondering: “If Target knows who I am already, why do I care if they can identify me with my birthday?” This is a fair question. The truth is that, in our online, tech-enabled world, even brick and mortar shopping is no longer a simple cash-for-products exchange.

Target and other retailers are hungry to know more about you so that they can sell you more products. There are a lot of different ways that companies get information about you, but most fall into three main categories. 

1. They ask you questions that you choose to answer. 

2. They track your behavior online via your IP address, cookies, and other methods. 

3. They get data about you from other sources and match it up with the records that they have been keeping on you.

This last method makes your birthday tantalizing to retailers. Target already has access to data about you that is incomplete or anonymized, whether that’s from your online shopping habits or by tracking your credit card purchases in stores. Your birthday may just be the missing link that helps it get even closer to a full picture of who you are, which it will use to motivate you to spend more money.

Data exchange works both ways, so Target may decide to monetize the detailed profile it has of you and your behavior by sharing it with other companies. In fact, Target’s privacy policy says that it might do just that: “We may share non-identifiable or aggregate information with third parties for lawful purposes.”

The more Target knows about you, the easier it would be for another company to identify you within a “non-identifiable” dataset. Even if companies like Target are diligent about removing birthdates from databases before sharing them, they are vulnerable to security breaches. Birthdays are often used by websites to verify identities. If your birthday is part of the information included in a security breach, that increases the odds that you will be targeted for identity theft and fraud.
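Here is a minimal sketch (toy data, hypothetical column names) of the kind of linkage that makes a “non-identifiable” dataset identifiable: joining it to an outside list on birthday, gender, and ZIP code is essentially a one-line operation.

```python
# A minimal sketch of a linkage attack: an "anonymized" purchase log is joined to an
# outside list (e.g. a marketing or public-records file) on birthday, gender, and ZIP.
import pandas as pd

anonymized_purchases = pd.DataFrame({
    "birthday": ["1992-06-14", "1988-01-03"],
    "gender":   ["F", "M"],
    "zip":      ["94110", "94612"],
    "basket":   ["prenatal vitamins", "energy drinks"],
})

outside_list = pd.DataFrame({
    "name":     ["Alice Rivera", "Bob Chen"],     # hypothetical individuals
    "birthday": ["1992-06-14", "1988-01-03"],
    "gender":   ["F", "M"],
    "zip":      ["94110", "94612"],
})

reidentified = anonymized_purchases.merge(outside_list, on=["birthday", "gender", "zip"])
print(reidentified[["name", "basket"]])
```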

After I started writing this post, I found an email from Sweetgreen that reminded me that somehow, despite years of using their app regularly, I still haven’t given up my birthday.

A Sweetgreen marketing email promising a discounted salad in exchange for my birthday.

I’ve always loved a good deal, and I have a soft spot for $16 salads. I wonder, if I’m already being tracked, if my activity is being monitored, if my history is fragmented into databases and data leaks, why not get some joy out of it? Why not share free nachos with friends and pile a Sweetgreen salad with subsidized avocado and goat cheese? 

Ultimately, I still can’t justify it. My privacy and security are not for sale. At the very least, they’re worth much more than $10.

Ring: Security vs. Privacy
By Anonymous | June 18, 2021

A little over 50 years ago, Marie Van Brittan Brown invented the first home security system (timeline): a closed-circuit set of cameras and televisions with a panic button to contact the police. In the years since, as with most household technology, advances have culminated in modern, sleek, and “cost effective” smart cameras such as the Amazon-owned Ring products. Ring’s mission, displayed on its website, is to “make neighborhoods safer”, proposing that connected communities lead to safer neighborhoods. Ring has also come under some scrutiny for partnering with police forces (WaPo) and, like most ‘smart’ devices and mobile apps, collects quite a bit of information from its users. While the first modern security systems of the 1960s also relied on collaboration with the police, each individual household had its own closed-circuit system, with an explicit choice of when, and how much, to share with law enforcement. When Ring sets out to make neighborhoods safer, for whom is it making them safe? What is Ring’s idea of a safe neighborhood?

Purple Camera

Ring cameras have surged in popularity over the past few years, likely in some part due to the increase in home package deliveries bringing about an increase in package theft. With convenient alerts and audio/video accessible from a mobile device, the benefits of an affordable, accessible, and stylish security system loom large. 

Ring, as a company, collects each user’s name, address, the geolocation of each device, and any audio/video content. Its closed-circuit predecessors created a data environment in which each household had exclusive access to its own security footage. Thus, the sum of all surveillance was scattered among the many users and the separate local law enforcement groups with whom users chose to share footage. Under Nissenbaum’s Contextual Integrity framework, trespassers on private property expect to be surveilled, and the owners of the security systems have full agency over the transmission principles, or constraints on the flow of information. Owners can choose, at any time, to share any portion of their security data with the police.

Ring, by contrast, owns all of the audio and video content of its millions of users, allowing all of the data to be centralized and accessible. About 10% of police departments in the US have been granted access, by request, to users’ security footage. Users often purchase Ring products expecting the service to check on packages and show who is at the door, as advertised. Along with this comes the agreement that users no longer have the authority or autonomy to prevent their data from being shared with law enforcement.

Police City Eyeball

Under Ring’s privacy policy, the company can also keep deleted user data for any amount of time. As is the case with many data-focused companies, Ring also reserves the right to change its terms and conditions at any time without notice. One tenet of responsible privacy practice is to limit the secondary use of any data without express informed consent. Given that Ring has been aggressively partnering with police and providing LAPD officers with free equipment to market its products, it is not unreasonable to expect that all current and former audio/video footage and other data will be accessible to law enforcement without a warrant.

References

https://www.theguardian.com/commentisfree/2021/may/18/amazon-ring-largest-civilian-surveillance-network-us

https://www.washingtonpost.com/technology/2019/08/28/doorbell-camera-firm-ring-has-partnered-with-police-forces-extending-surveillance-reach/

https://timeline.com/marie-van-brittan-brown-b63b72c415f0

https://ring.com/

Drones, Deliveries, and Data
—Estimated Arrival – Now. Are We Ready?—

By Anonymous | June 18, 2021

It’s no secret that automated robotics has the ability to propel our nation into a future of prosperity. In industries such as agriculture, infrastructure, defense, medicine, and transportation (the list goes on), the opportunities for Unmanned Aerial Vehicles in particular to transform our world are plentiful, and by the day we fly closer and closer to the sun. However, the inherent privacy-related and ethical concerns surrounding this technology need to be addressed before these vehicles take to the skies. Before we tackle this, let’s set some context.

An Unmanned Aerial Vehicle (commonly known as a UAV or Drone) can come in many shapes and sizes, all without an onboard human pilot and with a variety of configurations tailored to its specific use. You may have seen a drone like the one below (left) taking photos at your local park or have heard of its larger cousin the Predator B (right) which is a part of the United States Air Force UAV fleet.

DJI and Predator Drone

The use of drone technology in United States foreign affairs is a heavily debated topic that won’t be discussed today. Instead, let’s focus on a UAV application closer to home: Drone delivery.

Several companies have ascended the ranks of the autonomous drone delivery race, namely Amazon, UPS, Zipline, and the SF-based startup Xwing, which hopes to deliver not just packages but even people to their desired destinations. Aerial delivery isn’t constricted by land traffic and therefore can take the shortest route between two points. If implemented correctly, the resulting increase in transportation efficiency could be revolutionary. As recently as August 2020, legislation has been passed allowing for special use of UAVs beyond the line-of-sight flight required by previous FAA regulation. This exposes the first issue that drone delivery brings. If not controlled by a human in a line-of-sight context, the drone must use GPS, visual, thermal, and ultrasonic sensors to navigate the airspace safely. According to Amazon Prime Air, their drone contains a multitude of sensors that allow “…stereo vision in parallel with sophisticated AI algorithms trained to detect people and animals from above.” Although it’s an impressive technological feat, any application where people are identified using camera vision needs to be handled with the utmost care. Consider a person who turns off location services on their device as a personal privacy choice. A delivery drone has the capability to identify that person without their knowledge and, combined with onboard GPS data, to localize them without the use of their personal device or their consent. This possible future could be waiting for us if we do not create strong legislation with clear language regarding the collection of data beyond what is necessary for a delivery.
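To illustrate how little is needed for that kind of localization, here is a back-of-the-envelope sketch with made-up numbers. It assumes a camera pointed straight down, flat ground, and a hypothetical detector output; real systems are more sophisticated, which only strengthens the concern.

```python
# A minimal sketch (made-up numbers, straight-down camera, flat ground) of turning a
# person detection in a drone's camera frame plus the drone's own GPS fix into an
# approximate location for that person.
import math

drone_lat, drone_lon, altitude_m = 37.8716, -122.2727, 60.0
image_w, image_h = 1920, 1080
horizontal_fov_deg, vertical_fov_deg = 78.0, 47.0

# Pixel where the on-board detector reports a person (hypothetical output).
px, py = 1500, 300

# Ground footprint of the image, from altitude and field of view.
ground_w = 2 * altitude_m * math.tan(math.radians(horizontal_fov_deg / 2))
ground_h = 2 * altitude_m * math.tan(math.radians(vertical_fov_deg / 2))

# Offset of the detection from the image center, in meters on the ground.
east_offset  = (px / image_w - 0.5) * ground_w
north_offset = (0.5 - py / image_h) * ground_h

# Convert the metric offset to a rough latitude/longitude estimate.
person_lat = drone_lat + north_offset / 111_320
person_lon = drone_lon + east_offset / (111_320 * math.cos(math.radians(drone_lat)))

print(f"estimated person location: {person_lat:.6f}, {person_lon:.6f}")
```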

There’s another shortcoming of UAV delivery that doesn’t have to do with privacy: our existing infrastructure. 5G cellular networks are increasing in size and robustness around the nation, which is promising for the future of autonomous delivery, as more data can be transferred to and from the UAV. However, this reveals a potential for exclusion: the lack of 5G coverage may leave areas of the nation unreachable by drone, whether because the UAV would be flying blind or because it would run out of power. According to Amazon, the current iteration of the Prime Air drone has a 15-mile range, which leaves the question, “Is this technology really helping those who need it?”

Current 5G Coverage

It’s not all bad, however: drone deliveries have the potential to create real, positive change in our world, especially in light of the ongoing COVID-19 pandemic. Privacy-forward drone tech would help reduce emissions, both by using electric motors and by allowing people to order items from the comfort of their own homes in a timely manner, negating a drive to the store. It’ll be exciting to see what the future holds for UAV technology, and we must stay vigilant to ensure our privacy rights aren’t thrown to the wind.

References
AI, 5G, MEC and more: New technology is fueling the future of drone delivery. https://www.verizon.com/about/news/new-technology-fueling-drone-delivery

Drones Of All Shapes And Sizes Will Be Common In Our Sky By 2030, Here’s Why, With Ben Marcus Of AirMap. https://www.forbes.com/sites/michaelgale/2021/06/16/drones-of-all-shapes-and-sizes-will-be-common-in-our-sky-by-2030-heres-why-with-ben-marcus-of-airmap/

A drone program taking flight. https://www.aboutamazon.com/news/transportation/a-drone-program-taking-flight

Federal Aviation Administration. https://www.faa.gov/