What Will Our Data Say About Us In 200 years?

By Jackson Argo | June 18, 2021

Just two weeks ago, Russian scientists published a paper explaining how they extracted a living 24,000-year-old bdelloid rotifer, a microorganism with exceptional survival skills, from Siberian permafrost. This creature is not only a biological wonder, but comes with a trove of genetic curiosities soon to be studied by biotechnologists. Scientists have found many other creatures preserved in ice, including Otzi the Iceman, a man naturally preserved in ice for over 5,300 years. Unlike the rotifer, Otzi is a human, and even though neither he nor any of his family can consent to the research conducted on his remains, he has been the subject of numerous studies. This research does not pose a strong moral dilemma for the same reason it is impossible to get consent: he has been dead for more than five millennia, and it is hard to imagine what undue harm could come to Otzi or his family. Frameworks such as the Belmont Report emphasize the importance of consent from the living, but make no mention of the deceased. However, the dead are not the only ones whose data is at the mercy of researchers. Even with legal and ethical frameworks in place, there are many cases where the personal data of living people is used in studies they might not have consented to.

*A living bdelloid rotifer from 24,000-year-old Arctic permafrost.*

It’s not hard to imagine that several hundred years from now, historians will be analyzing the wealth of data collected by today’s technology, regardless of the privacy policies we may or may not have read. Otzi’s remains only provide a snapshot of his last moments, and this limited information has left scientists with many unanswered questions about his life. Similarly, today’s data does not capture a complete picture of our world, and some of it may even be misleading. Historians are no strangers to limited or misleading data, and are constantly filling in the gaps and revising their understanding as new information surfaces. But what kind of biases will historians face when looking at these massive datasets of personal and private information?

Missing in Action

To answer this question, we first look for the parts of our world that are not captured, or are underrepresented, in these datasets. Kate Crawford gives us two examples of this in the article Hidden Biases in Big Data. A study of Twitter and Foursquare data revealed interesting features of New Yorkers’ activity during Hurricane Sandy. However, this data also revealed its inherent bias: the majority of the data was produced in Manhattan, and little data was produced in the harder-hit areas. In a similar way, a smartphone app designed to detect potholes will be less effective in lower-income areas where smartphones are less prevalent.

For some, absence from these datasets is built directly into legal frameworks. The GDPR, as one example, gives citizens in the EU the right to be forgotten. There are some constraints, but this typically allows an individual to request that a data controller, a company like Google that collects and stores data, erase that individual’s personal data from the company’s databases. Provided the data controller complies, this individual will no longer be represented in that dataset. We should not expect that the people who exercise this right are evenly distributed across demographics; tech-savvy and security-conscious individuals may be more likely to fall into this category than others.

The US has COPPA, the children’s privacy act, which puts heavy restrictions on the data that companies can collect from children. Many companies, such as the discussion website Reddit, choose to exclude children under 13 entirely in their user agreements or terms of service. Scrolling through the posts in r/Spongebob, a subreddit community for the TV show SpongeBob SquarePants, might suggest that no one under 13 is talking about SpongeBob online.

Context Clues

For those of us who are swept into the nebulous big data-sphere, how accurately does your data actually represent you? Data collection grows more and more sophisticated as the years go on. To name just a few sources of your data: virtual reality devices capture your motion data, voice-controlled devices capture your speech patterns and intonation, and cameras capture biometric data like faceprints and fingerprints. There are now even devices that interface directly with the neurons in primate brains to detect intended actions and movements.

Unfortunately, this kind of data collection is not free from contextual biases. When companies like Google and Facebook collect data, they collect only the data particular to their needs, which is often to inform advertising or product improvements. Data systems are not able to capture all the information they detect; that is far too ambitious, even for our biggest data centers. A considerable amount of development time is spent deciding what data is important and worth capturing, and the goal is never to paint a true picture of history. Systems that capture data are designed to emphasize the important features, and everything else is either greatly simplified or dropped. Certain advertisers may only be interested in whether an individual is heterosexual or not, so nuances of gender and sexuality are heavily simplified in their data.

Building an indistinguishable robot replica of a person is still science fiction, for now, but several AI-based companies are already aiming to replicate people and their emotions through chatbots. These systems learn from our text and chat history from apps like Facebook and Twitter to create a personalized chatbot version of ourselves. Perhaps there will even be a world where historians ask chatbots questions about our history. But therein lies another problem historians are all too familiar with: the meaning of words and phrases we use today can change dramatically in a short amount of time. This is, of course, assuming that we can even agree on the definitions of words today.

In the article Excavating AI, Kate Crawford and Trevor Paglen discuss the political context surrounding data used in machine learning. Many machine learning models are trained using a set of data and corresponding labels that indicate what the data represents. For example, a training dataset might contain thousands of pictures of different birds along with the species of the bird in each picture. This dataset could train a machine learning model to identify species of birds in new images. The process begins to break down when the labels are more subjectively defined. A model trained to differentiate planets from other celestial bodies may incorrectly determine that Pluto is a planet if the training data was compiled before 2006. The rapidly evolving nature of culture and politics makes this kind of model training heavily reliant on the context of the dataset’s creation.
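The Pluto example can be made concrete with a toy sketch. Everything below is hypothetical: the "model" is a trivial lookup that memorizes its training labels, trained once on labels written before the 2006 IAU redefinition and once on labels written after.

```python
# Toy illustration: the same training pipeline gives different answers
# depending on when the labels were compiled. All data is hypothetical.

def train_lookup(labeled_examples):
    """A trivial 'model' that simply memorizes its training labels."""
    return dict(labeled_examples)

# Labels compiled before the 2006 IAU redefinition of "planet"...
labels_2005 = [("Mars", "planet"), ("Pluto", "planet"), ("Ceres", "other")]
# ...and the same bodies labeled afterwards.
labels_2010 = [("Mars", "planet"), ("Pluto", "other"), ("Ceres", "other")]

model_old = train_lookup(labels_2005)
model_new = train_lookup(labels_2010)

print(model_old["Pluto"])  # planet
print(model_new["Pluto"])  # other
```

A real classifier generalizes rather than memorizes, but the dependence on the labelers' context is the same: the model can only ever reflect the definitions in force when its dataset was made.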

*A Venezuelan Troupial in Aruba*

Wrapping Up

200 years from now, historians will undoubtedly have access to massive amounts of data to study, but they will face the same historical biases and misinformation that plague historians today. In the meantime, we can focus on protecting our own online privacy and addressing biases and misinformation in our data to make future historians’ jobs just a little easier.

Thank you for reading!


  • https://www.cell.com/current-biology/fulltext/S0960-9822(21)00624-2
  • https://www.nationalgeographic.com/history/article/131016-otzi-ice-man-mummy-five-facts
  • https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/read-the-belmont-report/index.html
  • https://hbr.org/2013/04/the-hidden-biases-in-big-data
  • https://gdpr.eu/right-to-be-forgotten/
  • https://www.ftc.gov/tips-advice/business-center/privacy-and-security/children%27s-privacy
  • https://www.wired.com/story/replika-open-source/
  • https://excavating.ai/

AI Bias: Where Does It Come From and What Can We Do About It?

By Scott Gatzemeier | June 18, 2021

Artificial Intelligence (AI) bias is not a new topic, but it is certainly a heavily debated and hot topic right now. AI can be an incredibly powerful tool that provides tremendous business value, from automating or accelerating routine tasks to discovering insights not otherwise possible. We are in the big data era, and most companies are working to take advantage of these new technologies. However, there are several examples of poor AI implementations that let biases infiltrate the system and undermine the purpose of using AI in the first place. A simple image search on DuckDuckGo for ‘professional haircut’ vs ‘unprofessional haircut’ depicts a very clear gender and racial bias.

*DuckDuckGo image search results for ‘professional haircut’ vs ‘unprofessional haircut’*

In this case, a picture is truly worth a thousand words. This gender and racial bias was not maliciously hard-coded into the algorithm by its developers. Rather, it is a reflection of the word-to-picture associations that the algorithm picked up from the authors of the web commentary. The AI is simply reflecting historical societal biases back at us in the images it returns. If these biases are left unchecked by AI developers, they are perpetuated. Such perpetuated biases have proven especially harmful in several cases, such as Amazon’s sexist hiring algorithm, which inadvertently favored male candidates, and the racist criminological software COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), under which black defendants were 45% more likely than white defendants to be assigned higher risk scores.

Where does AI Bias Come From?

There are several potential sources of AI bias. First, AI will inherit the biases present in its training data. (Training data is a collection of labeled information used to build a machine learning (ML) model. Through training data, an AI model learns to perform its task at a high level of accuracy.) Garbage in, garbage out: AI reflects the views of the data it is built on and can only be as objective as that data. Any historical data carries the societal biases of the time it was generated. When used to build predictive AI, for example, this can perpetuate stereotypes that influence decisions with real consequences and harms.

Next, most ML algorithms are built on statistical math and make decisions based on distributions of data and key features that can separate data points into categories or associate items together. Outliers that don’t fit the primary model tend to be weighted lower, especially when the focus is solely on model accuracy. When working with people-focused data, the outlier data points often belong to an already marginalized group. This is how biased AI can come from good, clean, unbiased data. AI can only learn about different groups (by race, gender, etc.) if each group appears frequently enough in the dataset. The training dataset must contain an adequate sample size for each group; otherwise this statistical bias can further perpetuate marginalization.
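This statistical effect can be shown with a minimal sketch using made-up numbers: a "model" that always predicts the majority group's label scores 95% overall accuracy while being wrong for every member of the small minority group.

```python
# Hypothetical illustration: optimizing only for overall accuracy can
# completely ignore a small minority group in the data.

# 95 samples from a majority group (label 0), 5 from a minority group (label 1)
labels = [0] * 95 + [1] * 5

# A degenerate "model" that always predicts the majority label
predictions = [0] * 100

overall_accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
minority_accuracy = sum(
    p == y for p, y in zip(predictions, labels) if y == 1
) / labels.count(1)

print(overall_accuracy)   # 0.95 -- looks like a strong model
print(minority_accuracy)  # 0.0  -- wrong for every minority sample
```

This is why aggregate accuracy alone is a misleading metric; per-group evaluation is needed to surface this kind of bias.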

Finally, most AI algorithms are built on correlation with the training data, and as we know, correlation doesn’t always equal causation. The algorithm doesn’t understand what any of its inputs mean in context. For example, suppose you get a few candidates from a particular school but don’t hire them because a position freeze is in place due to business conditions. The fact that they weren’t hired gets added to the training data. The AI would start to correlate that school with bad candidates and potentially stop recommending candidates from that school, even great ones, because it doesn’t know the real reason they weren’t selected.
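The hiring-freeze scenario can be sketched with hypothetical data. The school names and records below are invented; the point is that a purely frequency-based view of the history cannot distinguish "rejected for quality" from "rejected during a freeze".

```python
# Hypothetical sketch: candidates from "School X" were rejected during a
# hiring freeze, not for quality, but the data records only the outcome.

# (school, hired?) records; all School X rejections fell during the freeze.
history = [
    ("School X", False), ("School X", False), ("School X", False),
    ("School Y", True), ("School Y", False), ("School Y", True),
]

# A naive correlation-based score: hire rate per school.
hire_rate = {}
for school in {s for s, _ in history}:
    outcomes = [hired for s, hired in history if s == school]
    hire_rate[school] = sum(outcomes) / len(outcomes)

print(hire_rate["School X"])  # 0.0 -- the freeze now looks like bad candidates
```

A causal view would condition on *why* each rejection happened (the freeze), but that confounder was never recorded, so no amount of training on this data recovers it.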

What can we do about AI Bias?

Before applying AI to a problem, we need to ask what level of AI is appropriate. What should the role of AI be, given the sensitivity and impact of the decision on people’s lives? Should it be an independent decision maker, a recommender system, or not used at all? Some companies apply AI even when it is not at all suited to the task in question and other means would be more appropriate. So there is a moral decision that needs to be made prior to implementing AI. Obed Louissaint, IBM’s Senior Vice President of Transformation and Culture, talks about “Augmented Intelligence”: leveraging AI algorithms as “colleagues” that help company leaders make better decisions and reason more effectively, rather than replacing human decision making.

We also need to focus on the technical aspects of AI development and work to build models that are more robust against bias and bias propagation. Developers need to focus on explainable, auditable, and transparent algorithms. When major decisions are made by humans, the reasoning behind the decision is expected and there is accountability; algorithms should be subject to the same expectations, regardless of IP protection. Visualization tools that help explain how AI works and the ‘why’ behind its conclusions continue to be a major area of focus and opportunity.

In addition to AI transparency, there are emerging AI technologies such as Generative Adversarial Networks (GAN) that can be used to create synthetic unbiased training data based on parameters defined by the developer. Causal AI is another promising area that is building momentum and could provide cause and effect understanding to the algorithm. This could give AI some ‘common sense’ and prevent several of these issues.

AI is being adopted rapidly, and the world is just beginning to capitalize on its potential. As data scientists, it is increasingly important that we understand the sources of AI bias and continue to develop fair AI that prevents the social and discriminatory harms that arise from that bias.


  • https://www.inc.com/guadalupe-gonzalez/amazon-artificial-intelligence-ai-hiring-tool-hr.html
  • https://hbr.org/2020/10/ai-fairness-isnt-just-an-ethical-issue
  • https://www.logically.ai/articles/5-examples-of-biased-ai
  • https://towardsdatascience.com/why-your-ai-might-be-racist-and-what-to-do-about-it-c081288f600a
  • https://medium.com/ai-for-people/the-ethics-of-algorithmic-fairness-aa394e12dc43
  • https://towardsdatascience.com/survey-d4f168791e57
  • https://techcrunch.com/2020/06/24/biased-ai-perpetuates-racial-injustice/
  • https://towardsdatascience.com/reducing-ai-bias-with-synthetic-data-7bddc39f290d
  • https://towardsdatascience.com/ai-is-flawed-heres-why-3a7e90c48878

Are we truly anonymous in public health databases?

By Anonymous | June 18, 2021

Source: Iron Mountain

Privacy in healthcare data seems like the baseline when speaking of personal data and privacy issues, even for those who hold a more relaxed attitude toward privacy policy issues. We may suspect that social media, tech, and financial companies are selling user data for profit, but people tend to trust hospitals, healthcare institutions, and pharmaceutical companies to at least try to keep their users’ data safe. Is that really true? How safe is our healthcare data? Can we really be anonymous in public databases?

As users and patients, we have to share a lot of personal and sensitive information when we see a doctor or healthcare practitioner so that they can provide precise and useful healthcare services. Doctors might know our blood type, potential genetic risks or diseases in our family, pregnancy history, and more. Beyond that, the health institutions behind the doctors also keep records of our insurance information, home address, ZIP code, and payment information. Healthcare institutions might hold more comprehensive sensitive and private information about you than any other organization that tries to retain information about you.

What kind of information do healthcare providers collect and share with third parties? In fact, most healthcare providers must follow HIPAA’s privacy guidance. For example, I noticed that Sutter Health says it follows, or refers to, HIPAA privacy rules in its user agreement.

Sutter’s privacy policy, for example, describes its usage of your healthcare data. It states: “We can use your health information and share it with other professionals who are treating you. We may use your health information to provide you with medical care in our facilities or in your home. We may also share your health information with others who provide care to you such as hospitals, nursing homes, doctors, nurses or others involved in your care.” Those uses seem reasonable to me.

In addition to the expected uses above, Sutter also states that it is allowed to share your information in “ways to contribute to public good, such as public health and research”. Those ways include “preventing disease, helping with product recalls, reporting adverse reactions on medications, reported suspected abuse, …”. One concern arises from one of these uses, public health and research: can we really be anonymous in a public database?

In fact, the answer is no. Most healthcare records can be de-anonymized through information matching! “Only a very small amount of data is needed to uniquely identify an individual. Sixty three percent of the population can be uniquely identified by the combination of their gender, date of birth, and ZIP code alone”, according to a post in the Georgetown Law Technology Review published in April 2017. Thus, it is entirely possible for people with good intentions, such as research teams and data scientists who aim to serve the public good, or people with bad intentions, such as hackers, to legally or illegally obtain healthcare information from multiple sources and aggregate it. They can then de-anonymize the data, especially with the help of modern computing resources, algorithms, and machine learning.
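The matching attack described above takes only a few lines to sketch. The records and names below are entirely fabricated; the point is that joining two datasets on gender, date of birth, and ZIP code can re-attach a name to an "anonymized" health record.

```python
# Hypothetical sketch of re-identification by matching quasi-identifiers.
# An "anonymized" health record still carries gender, birth date, and ZIP;
# joining it with a public roster on those three fields names the patient.

health_records = [  # no names, but quasi-identifiers remain
    {"gender": "F", "dob": "1985-03-14", "zip": "94720", "diagnosis": "..."},
]

public_roster = [  # e.g. a voter registration list, which includes names
    {"name": "Alice Smith", "gender": "F", "dob": "1985-03-14", "zip": "94720"},
    {"name": "Bob Jones", "gender": "M", "dob": "1990-07-01", "zip": "94110"},
]

keys = ("gender", "dob", "zip")
for record in health_records:
    matches = [p["name"] for p in public_roster
               if all(p[k] == record[k] for k in keys)]
    if len(matches) == 1:  # a unique match re-identifies the patient
        print(matches[0], "is linked to diagnosis", record["diagnosis"])
```

This is essentially the linkage attack Latanya Sweeney demonstrated against hospital discharge data in the 1990s, scaled down to two toy tables.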

So do companies that hold your healthcare information have to follow some kind of privacy framework? Are there laws out there to regulate companies that have your sensitive healthcare information and protect the vulnerable public like you and me? One regulation most healthcare providers must follow is the Health Insurance Portability and Accountability Act (HIPAA), which became effective in 1996. The act states who has the right to access what kinds of health information, what information is protected, and how information should be protected. It also specifies who must follow these laws, including health plans, most healthcare providers, healthcare clearinghouses, and health insurance companies. Organizations that could have your health information but do not have to follow these laws include life insurers, most schools and school districts, state agencies like child protective services, law enforcement agencies, and municipal offices.

Ordinary people like you and me are vulnerable. We don’t have the knowledge, patience, or time to understand every term in long, jargon-filled user agreements and privacy policies. But what we can and should do is advocate for strong protection of our personal information, especially sensitive healthcare data. Government and policymakers should also establish and enforce more comprehensive privacy policies that protect everyone and limit the scope of healthcare data sharing, thus preventing de-anonymization from happening.


1. Stanford Medicine. Terms and Conditions of Use. Stanford Medicine. https://med.stanford.edu/covid19/covid-counter/terms-of-use.html.
2. Stanford Medicine. Datasets. Stanford Medicine. https://med.stanford.edu/sdsr/research.html.
3. Stanford Medicine. Medical Record. Stanford Medicine. https://stanfordhealthcare.org/for-patients-visitors/medical-records.html.
4. Sutter Health. Terms and Conditions. Sutter Health. https://mho.sutterhealth.org/myhealthonline/terms-and-conditions.html.
5. Sutter Health. HIPAA and Privacy Practices. Sutter Health. https://www.sutterhealth.org/privacy/hipaa-privacy.
6. Wikipedia (14 May 2021). Health Insurance Portability and Accountability Act. Wikipedia. https://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act.
7. Your Rights Under HIPAA. HHS. https://www.hhs.gov/hipaa/for-individuals/guidance-materials-for-consumers/index.html.
8. Adam Tanner (1 February, 2016). How Data Brokers Make Money Off Your Medical Records. Scientific American. https://www.scientificamerican.com/article/how-data-brokers-make-money-off-your-medical-records/.

What You Should Know Before Joining an Employee Wellness Program

By Ashley Moss | June 18, 2021

Weigh-in time at NBC’s The Office

In 2019 more than 80% of large employers offered a workplace wellness program as healthcare trends in America turn toward disease prevention. Some wellness programs focus on behavioral changes like smoking cessation, weight loss, or stress management. Participants might complete a health survey or undergo biometric tests in a lab. Many employers offer big financial incentives for participating and/or reaching target biometric values. While on the surface this may seem like a win-win for employers and employees, this article takes a closer look at the potential downsides of privacy compromise, unfairness, and questionable effectiveness.

Laws and regulations that normally protect your health data may not apply to your workplace wellness program. The federal government’s HIPAA laws cover doctor’s offices and insurance companies who use healthcare data. These laws limit information sharing to protect your privacy and require security measures to ensure the safety of electronic health information.

If a workplace wellness program is offered through an insurance plan, HIPAA applies. However, if it is offered directly through the employer, HIPAA does not apply. This means the program is not legally required to follow HIPAA’s anti-hacking standards. It also means employee health data can be sold or shared without legal repercussions. Experts warn that an employer with direct access to health data could use it to discriminate against, or even lay off, those with a high cost of care.

Although these programs claim to be voluntary, employers provide a financial incentive averaging hundreds of dollars. It’s unclear how much pressure this places on each employee, especially because the dollar penalty a person can afford really depends on their financial situation. There is some concern that employers have a hidden agenda: Wellness programs shift the burden of medical costs away from the employer and toward unhealthy or non-participating employees.

Wellness programs may be unfair to certain groups of people. Research shows that programs penalize lower wage workers more often, contributing to a “poor get poorer” cycle of poverty. Wellness programs may also overlook entire categories of people who have a good reason not to join, such as people with disabilities. In one case, a woman with a double mastectomy faced possible fines for missing mammograms until she provided multiple explanations to her employer’s wellness program.

Experts question the effectiveness of workplace wellness programs, since evidence shows little impact to key outcomes. Two randomized studies followed participants for 12+ months and found no improvement to medical spending or health outcomes. Wellness programs do not require physician oversight, so the interventions may not be supported by scientific evidence. For example, Body Mass Index (BMI) has fallen out of favor in the medical community but persists in wellness programs.

Wellness programs may focus on reducing sedentary time or tracking steps.

 Before joining an employee wellness program, do your homework to understand more about the potential downsides. Remember that certain activities are safer than others: are you disclosing lab results or simply attending a lecture on healthy eating? If you are asked to share information, get answers first: What data will be collected? Which companies can see it? Can it be used to make employment decisions? Lastly, understand that these programs may not be effective and cannot replace the advice of a trained physician.


The Price of a Free Salad

By Anonymous | June 18, 2021

How do you celebrate your birthday? If you’re anything like me, you prioritize friends, family, and cake, but you let some of your favorite corporations in on the mix, too. Each year, I check my email on my birthday to find myself fêted by brands like American Eagle, Nintendo, and even KLM. Their emails come in like clockwork, bearing coupon codes and product samples for birthday gifts. 

Birthday email promotions

Most of these emails linger unopened in my promotions tab, but my best friend makes an annual odyssey of redeeming her offers. One year I spent the day with her as we did a birthday coupon crawl, stopping by Target for a shopping spree and a free Starbucks coffee, Sephora for some fun makeup samples, and ending with complimentary nachos at a trendy Mexican restaurant downtown.

I used to want to be savvy like her and make the most of these deals, but lately, I’ve been feeling better about missing out. The work of Dr. Latanya Sweeney, a pioneering researcher in the field of data privacy, has taught me what powerful information my birthday is.

In her paper “Simple Demographics Often Identify People Uniquely”, Sweeney summarizes experiments showing that combinations of seemingly benign personal details, such as birthday, gender, and ZIP code, often provide enough information to identify a single person. She and her team even developed this tool to demonstrate the fact.
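A rough back-of-the-envelope sketch shows why this particular combination is so identifying; the population figure below is an illustrative assumption, not a census number.

```python
# Why (birth date, gender, ZIP) is so revealing: there are far more possible
# combinations than people in a typical ZIP code. Numbers are illustrative.

possible_birthdates = 365 * 80          # roughly 80 years of birth dates
genders = 2
combinations_per_zip = possible_birthdates * genders  # 58,400 combinations

typical_zip_population = 10_000         # hypothetical ZIP code population

# With ~6x more combinations than residents, most residents are likely to be
# the only person in their ZIP code with their exact birth date and gender.
print(combinations_per_zip > typical_zip_population)  # True
```

The real uniqueness rate depends on how birth dates cluster and how ZIP populations vary, which is exactly what Sweeney's experiments measured; the pigeonhole-style arithmetic just shows why a high rate is unsurprising.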

Sample results from the uniqueness tool

I keep trying this website out with friends and family, and I have yet to find anyone who isn’t singled out by the combination of their birthday, gender, and zip code. 

But what does this mean for our beloved birthday deals? Let’s think a bit about how this would work if you were shopping at, say, Target. You don’t have to shop at Target very often to see promotions of its 5% birthday shopping deal. 

Target Circle birthday discount

Attentive readers may be wondering: “If Target knows who I am already, why do I care if they can identify me with my birthday?” This is a fair question. The truth is that, in our online, tech-enabled world, even brick and mortar shopping is no longer a simple cash-for-products exchange.

Target and other retailers are hungry to know more about you so that they can sell you more products. There are a lot of different ways that companies get information about you, but most fall into three main categories. 

1. They ask you questions that you choose to answer. 

2. They track your behavior online via your IP address, cookies, and other methods. 

3. They get data about you from other sources and match it up with the records that they have been keeping on you.

This last method makes your birthday tantalizing to retailers. Target already has access to data about you that is incomplete or anonymized, whether that’s from your online shopping habits or by tracking your credit card purchases in stores. Your birthday may just be the missing link that helps it get even closer to a full picture of who you are, which it will use to motivate you to spend more money.

Data exchange works both ways, so Target may decide to monetize the detailed profile it has of you and your behavior by sharing it with other companies. In fact, Target’s privacy policy says it might do just that: “We may share non-identifiable or aggregate information with third parties for lawful purposes.”

The more Target knows about you, the easier it would be for another company to identify you within a “non-identifiable” dataset. And even if companies like Target are diligent about removing birthdates from databases before sharing them, they are vulnerable to security breaches. Birthdays are often used by websites to verify identities, so if your birthday is part of the information included in a security breach, the odds increase that you will be targeted for identity theft and fraud.

After I started writing this post, I found an email from Sweetgreen that reminded me that somehow, despite years of using their app regularly, I still haven’t given up my birthday.

A Sweetgreen marketing email promising a discounted salad in exchange for my birthday.

I’ve always loved a good deal, and I have a soft spot for $16 salads. I wonder, if I’m already being tracked, if my activity is being monitored, if my history is fragmented into databases and data leaks, why not get some joy out of it? Why not share free nachos with friends and pile a Sweetgreen salad with subsidized avocado and goat cheese? 

Ultimately, I still can’t justify it. My privacy and security is not for sale. At the very least, it’s worth much more than $10. 

Ring: Security vs. Privacy

By Anonymous | June 18, 2021

A little over 50 years ago, Marie Van Brittan Brown invented the first home security system (timeline): a closed-circuit set of cameras and televisions with a panic button to contact the police. In the years since, as with most household technology, advances have culminated in sleek, modern, “cost effective” smart cameras such as the Amazon-owned Ring products. Ring’s mission, displayed on its website, is to “make neighborhoods safer”, proposing that connected communities lead to safer neighborhoods. Ring has also come under scrutiny for partnering with police forces (WaPo) and, like most ‘smart’ devices and mobile apps, collects quite a bit of information from its users. While the first modern security systems of the 1960s also relied on collaboration with the police, each individual household had its own closed-circuit system, with explicit choice over when, and how much, to share with law enforcement. When Ring sets out to make neighborhoods safer, for whom are they making it safe? What is Ring’s idea of a safe neighborhood?

Purple Camera

Ring cameras have surged in popularity over the past few years, likely in some part due to the increase in home package deliveries bringing about an increase in package theft. With convenient alerts and audio/video accessible from a mobile device, the benefits of an affordable, accessible, and stylish security system loom large. 

Ring, as a company, collects each user’s name, address, the geolocation of each device, and any audio/video content. Closed-circuit predecessors produced a data environment where each household had exclusive access to its own security footage. Thus, the sum of all surveillance was scattered among the many users and the separate local law enforcement groups with whom users shared footage. Under Nissenbaum’s contextual integrity framework, trespassers on private property expect to be surveilled, and the owners of the security systems have full agency over the transmission principles, or constraints on the flow of information. Owners can choose, at any time, to share any portion of their security data with the police.

Ring, by contrast, owns all of the audio and video content of its millions of users, allowing all of the data to be centralized and accessible. About 10% of police departments in the US have been granted access, by request, to users’ security footage. Users often purchase Ring products expecting the service to check on packages and see who is at the door, as advertised. Along with this comes the agreement that users no longer have the authority or autonomy to prevent their data from being shared with law enforcement.

*Police City Eyeball*

Under Ring's privacy policy, the company can also keep any deleted user data for any amount of time. As is the case with many data-focused companies, Ring also reserves the right to change its terms and conditions at any time without notice. One tenet of responsible privacy policy is to limit the secondary use of any data without express informed consent. Given that Ring has been aggressively partnering with police and providing LAPD officers with free equipment to market its products, it is not unreasonable to expect that all current and former audio/video footage and other data will be accessible to law enforcement without the need for a warrant.






Drones, Deliveries, and Data

Drones, Deliveries, and Data
—Estimated Arrival – Now. Are We Ready?—

By Anonymous | June 18, 2021

It's no secret that automated robotics has the ability to propel our nation into a future of prosperity. Agriculture, infrastructure, defense, medicine, transportation: the list goes on. The opportunities for Unmanned Aerial Vehicles, in particular, to transform our world are plentiful, and by the day we fly closer and closer to the sun. However, the inherent privacy-related and ethical concerns surrounding this technology need to be addressed before these vehicles take to the skies. Before we tackle this, let's set some context.

An Unmanned Aerial Vehicle (commonly known as a UAV or Drone) can come in many shapes and sizes, all without an onboard human pilot and with a variety of configurations tailored to its specific use. You may have seen a drone like the one below (left) taking photos at your local park or have heard of its larger cousin the Predator B (right) which is a part of the United States Air Force UAV fleet.

*DJI and Predator Drone*

The use of drone technology in United States foreign affairs is a heavily debated topic that won’t be discussed today. Instead, let’s focus on a UAV application closer to home: Drone delivery.

Several companies have ascended the ranks of the autonomous drone delivery race, namely Amazon, UPS, Zipline, and the SF-based startup Xwing, which hopes to deliver not just packages but even people to their desired destinations. Aerial delivery isn't constrained by land traffic and can therefore take the shortest route between two points. If implemented correctly, the resulting increase in transportation efficiency could be revolutionary. As recently as August 2020, legislation was passed allowing for special use of UAVs beyond line-of-sight flight, the previous FAA restriction. This exposes the first issue that drone delivery brings. If not controlled by a human in a line-of-sight context, the drone necessarily must use GPS, visual, thermal, and ultrasonic sensors to navigate the airspace safely. According to Amazon Prime Air, their drone contains a multitude of sensors that allow "…stereo vision in parallel with sophisticated AI algorithms trained to detect people and animals from above." Although it's an impressive technological feat, any application where people are identified using camera vision needs to be handled with the utmost care. Consider the situation where a person turns off location services on their device as a personal privacy choice. A delivery drone has the capability to identify that person without their knowledge, and combined with onboard GPS data, that person has been localized without the use of their personal device or consent. This possible future could be waiting for us if we do not create strong legislation with clear language regarding the collection of data beyond what is necessary for a delivery.
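To make the localization concern concrete, here is a simplified, hypothetical sketch of how a detection in a downward-facing camera frame could be georeferenced using the drone's own GPS fix. The field-of-view and resolution numbers are invented, not Prime Air specifications, and the flat-ground, nadir-camera math is a rough approximation.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def localize_detection(drone_lat, drone_lon, altitude_m,
                       px_offset_east, px_offset_north,
                       fov_deg=60.0, image_width_px=1920):
    """Estimate the ground position of a person detected in a nadir frame.

    Flat-ground approximation: a pixel offset from the image centre maps
    to a metre offset proportional to altitude, which is then converted
    to a latitude/longitude offset from the drone's GPS fix.
    """
    # Metres of ground spanned by the image width at this altitude.
    ground_width_m = 2 * altitude_m * math.tan(math.radians(fov_deg / 2))
    m_per_px = ground_width_m / image_width_px
    east_m = px_offset_east * m_per_px
    north_m = px_offset_north * m_per_px
    # Convert metre offsets to degrees.
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(drone_lat)))
    )
    return drone_lat + dlat, drone_lon + dlon

# A detection 100 px east of centre while the drone hovers at 30 m:
print(localize_detection(37.7749, -122.4194, 30, 100, 0))
```

The point is not the geometry itself but what it implies: nothing in this calculation requires the person's phone or consent; altitude, a GPS fix, and a camera are enough.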

There's another shortcoming of UAV delivery that has nothing to do with privacy: our existing infrastructure. 5G cellular networks are growing in size and robustness around the nation, which is promising for the future of autonomous delivery, as more data can be transferred to and from the UAV. However, this reveals a potential for exclusion: the lack of 5G coverage may leave areas of the nation unreachable by drone, whether because the UAV would be flying blind or because it would run out of power. According to Amazon, the current iteration of the Prime Air drone has a 15-mile range, which leaves the question, "Is this technology really helping those who need it?"

*Current 5G Coverage*

It's not all bad, however: drone deliveries have the potential to create real, positive change in our world, especially in light of the ongoing COVID-19 pandemic. Privacy-forward drone tech would help reduce emissions, both by using electric motors and by allowing people to order items from the comfort of their own homes in a timely manner, negating a drive to the store. It will be exciting to see what the future holds for UAV technology, and we must stay vigilant to ensure our privacy rights aren't thrown to the wind.

AI, 5G, MEC and more: New technology is fueling the future of drone delivery. https://www.verizon.com/about/news/new-technology-fueling-drone-delivery

Drones Of All Shapes And Sizes Will Be Common In Our Sky By 2030, Here’s Why, With Ben Marcus Of AirMap. https://www.forbes.com/sites/michaelgale/2021/06/16/drones-of-all-shapes-and-sizes-will-be-common-in-our-sky-by-2030-heres-why-with-ben-marcus-of-airmap/

A drone program taking flight. https://www.aboutamazon.com/news/transportation/a-drone-program-taking-flight

Federal Aviation Administration. https://www.faa.gov/

China’s Scary But Robust Surveillance System

China’s Scary But Robust Surveillance System
By Anonymous | June 18, 2021

Introducing the Problem

In 2014, the Chinese government introduced an idea that would allow it to keep track of its citizens and score their behavior. The government envisioned a world in which its people are literally constantly monitored: where they shop, how they pay their bills, even the type of content they watch. In many ways, it is what major US companies like Google and Facebook are doing with data collection, but on steroids, and at least the Chinese government tells you that your every move is being watched. On top of that, you are being judged and given a score based on your interactions and lifestyle. A high "citizen score" grants people rewards such as faster internet service, while actions such as posting social media content that contradicts the Chinese government can decrease a person's score. Private companies in China are constantly working with the government to gather data through social media and other behaviors on the internet.

A key potential issue is that the government will be technically capable of considering the behavior of a Chinese citizen's friends and family in determining his or her score. For example, it is possible that a friend's anti-government political post could lower your own score. Thus, this type of scoring mechanic can have implications for relationships among an individual's friends and family. The Chinese government is taking this project seriously, and privileges that one may take for granted in the US may be in jeopardy based on a person's score: obtaining a visa to travel abroad, for instance, or even the right to travel by train or plane within the country. People understand the risks and dangers this poses; as one internet privacy expert says, "What China is doing here is selectively breeding its population to select against the trait of critical, independent thinking." However, because lack of trust is a serious problem in China, many Chinese actually welcome this potential system. Relating this back to the US, I wonder whether this type of system could ever exist in our country and, if so, what it would look like. Is it ethical for private companies to assist in massive surveillance and turn over their data to the government? Chinese companies are now required to assist in government spying while U.S. companies are not, but what happens when Amazon or Facebook are in the positions that Alibaba and Tencent are in now?

A key benefit for China of having so many cameras and so much surveillance set up throughout its major cities is that it helps identify criminals and keep track of crime. For example, in Chongqing, which has more surveillance cameras for its population than any other city in the world, the surveillance system scans the facial features of people on the streets from frames of video footage in real time. The scans can then be compared against data that already exists in a police database, such as photos of criminals. If a match passes a threshold, typically 60% or higher, police officers are notified. One could argue that this massive surveillance system is beneficial for society, but if law officials are not transparent and do not enforce good practices, then there is an issue.
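The matching step described above amounts to comparing a face captured on the street against stored records and flagging anything that clears a similarity threshold. Below is a minimal, hypothetical sketch of that logic; the embedding vectors, database, and 128-dimension size are invented for illustration and are not details of Chongqing's actual system.

```python
import numpy as np

MATCH_THRESHOLD = 0.60  # the reported bar for notifying officers

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two face-embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_against_database(frame_embedding: np.ndarray, database: dict) -> list:
    """Return (person_id, score) pairs that clear the threshold, best first.

    `database` maps a person ID to a stored embedding, e.g. one derived
    from a photo already held in a police database.
    """
    hits = [
        (person_id, cosine_similarity(frame_embedding, stored))
        for person_id, stored in database.items()
    ]
    hits = [h for h in hits if h[1] >= MATCH_THRESHOLD]
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Toy demo: random vectors stand in for a real face model's embeddings.
rng = np.random.default_rng(0)
db = {"suspect_a": rng.normal(size=128), "suspect_b": rng.normal(size=128)}
probe = db["suspect_a"] + rng.normal(scale=0.1, size=128)  # noisy re-capture
print(match_against_database(probe, db))
```

Note that a fixed 60% threshold trades false alarms against misses; where that bar is set, and who audits it, is exactly the transparency question raised above.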


Got Venmo? Protect Your Privacy

Got Venmo? Protect Your Privacy
By Anonymous | June 18, 2021

*Phones With Venmo*

Last month, BuzzFeed News discovered President Joe Biden's Venmo account and his public friends list. President Joe Biden's and First Lady Jill Biden's Venmo accounts were removed the day the news broke. The news prompted Venmo to implement a new feature that allows users to hide their friends list. However, the default option for a user's friends list is public, so users will be able to see others' friends unless those users manually choose to hide the list. The incident with President Joe Biden's Venmo account and Venmo's new feature have renewed concerns about Venmo's privacy. Here are answers to some commonly asked questions about Venmo and its privacy policy.

What Data Does Venmo Collect?

Currently, according to its privacy policy, Venmo collects a host of personal data, including your name, address, email, telephone number, information about the device you use to access Venmo, financial information (your bank account information), SSN (or other government-issued verification numbers), geolocation information (your location), and social media information if you decide to connect your Venmo account with social media such as Twitter, FourSquare, and Facebook.

When you register for a personal Venmo account, you must verify your phone number, your email, and your bank account.

Why Should You Care About Venmo’s Privacy?

A lot of Venmo users view Venmo as a fun social media platform where they can share their transactions with accompanying notes and descriptions. They figure they're not doing anything wrong, so why should they care if their transactions are public? They don't have anything to hide. But it is not just about hiding bad information, although this may be some users' goal; it is also about protecting good information from others. What do I mean by this?

According to Venmo's privacy policy, "public information may also be seen, accessed, reshared or downloaded through Venmo's APIs or third-party services that integrate with our products," meaning that all of your public transactions and associated comments are available to the public. Even non-Venmo users can discover your data by accessing the API.

In 2018, Mozilla Fellow Hang Do Thi Duc released "Public By Default," an analysis of all 207,984,218 public Venmo transactions from 2017. Through these transactions, she was able to identify drug dealers, breakups, and the routine life of a married couple. She discovered where the married couple shopped and what days they usually went to the grocery store, what gas station they used, and what restaurants they usually ate at. She identified a drug dealer, and where he lived, based on his public transaction comments and the fact that his Facebook account was linked to his Venmo. Thus, Venmo transactions can act as a map of your daily activities. It can be quite easy to learn about an individual through both their transactions and their friends list.
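To see how public transactions become a map of daily activity, here is a small, hypothetical sketch in the spirit of Public By Default. The records below are fabricated, but they have the same basic shape (actor, note, timestamp) as the data the public feed exposed; grouping by note and weekday is enough to surface a routine.

```python
from collections import Counter
from datetime import datetime

# Fabricated public records: (actor, note, ISO timestamp).
transactions = [
    ("alice", "groceries", "2017-01-07T10:12:00"),
    ("alice", "gas",       "2017-01-10T18:05:00"),
    ("alice", "groceries", "2017-01-14T10:40:00"),
    ("alice", "groceries", "2017-01-21T09:55:00"),
    ("bob",   "rent",      "2017-01-01T09:00:00"),
]

def routine(actor: str):
    """Count how often an actor pays for each category on each weekday."""
    counts = Counter()
    for who, note, ts in transactions:
        if who == actor:
            weekday = datetime.fromisoformat(ts).strftime("%A")
            counts[(note, weekday)] += 1
    return counts.most_common(1)

# Three grocery payments, all on Saturdays, reveal a weekly routine.
print(routine("alice"))
```

A few dozen lines like these, run over hundreds of millions of real records, is all it took to reconstruct shopping schedules and home neighborhoods.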

*Image of Venmo API*

Your personal data may become more publicly available if you connect your account to third parties such as social media platforms. According to Venmo's privacy policy, data shared with a "third-party based on an account connection will be used and disclosed in accordance with the third-party's privacy practices" and "may in turn be shared with certain other parties, including the general public, depending on the account's or platform's privacy practices." This means that if you connect your account with a third party, Venmo and the third party will exchange personally identifiable information about you. The information Venmo shares about you with the third party is subject to the third party's privacy policy, meaning that data is no longer protected by Venmo's privacy policy. If the third party's privacy policy states that personal information can be shared publicly, private information you have shared with Venmo can then become public.

How Can You Protect Your Privacy?

You can protect your data by making both your transactions and your friends list private; both are public by default. You can also make your past transactions private, and you can prevent Venmo from collecting some of your location data by turning off location services for Venmo on your mobile device. An article on how to do both of these is here. This should prevent anyone from publicly accessing your Venmo transactions or friends list and prevent some geolocation tracking, although Venmo may still be able to view your location.

*Venmo Privacy Settings*

Also be sure to read a firm's privacy policy before you decide to connect your account with it in any way. Before connecting with any social media apps, if you haven't already, read the platform's privacy policy to see whether its privacy practices match what you would feel comfortable sharing. The same goes for any other third party that asks to connect with your account in the future.

You should also be cautious about your Venmo profile picture. You may figure that if you regret a past Venmo profile picture, you can just delete the photo and post a new one. However, this is not the case: it is still possible to recover a user's old Venmo profile picture after it has been replaced, simply by changing the photo's URL. Try to post photos that you do not mind being public for the foreseeable future.

In summary, privacy matters especially when it concerns financial data that reveals patterns about your lifestyle. Set your transactions and friends list to private, turn off location services, be wary of connecting your account to third parties, and post profile pictures that you do not mind being public.


Duc Do Thi, Hang. (2018). Public By Default [Project]. https://publicbydefault.fyi/

How to Sign Up for a personal Venmo account. Venmo. (2021). https://help.venmo.com/hc/en-us/articles/209690068-How-to-Sign-Up-for-a-personal-Venmo-account.

Mac, R., McDonald, L., Notopoulos, K., & Brooks, R. (2021, May 15). We Found Joe Biden On Venmo. Here’s Why That’s A Privacy Nightmare For Everyone. BuzzFeed News. https://www.buzzfeednews.com/article/ryanmac/we-found-joe-bidens-secret-venmo.

Mozilla Foundation. (2019, August 28). Venmo, Are You Listening? Make User Privacy the Default. Mozilla . https://foundation.mozilla.org/en/blog/venmo-are-you-listening-make-user-privacy-default/

Notopoulos, K. (2021, May 19). Venmo Exposes All The Old Profile Photos You Thought Were Gone. BuzzFeed News. https://www.buzzfeednews.com/article/katienotopoulos/paypals-venmo-exposes-old-photos?ref=bfnsplash.

Payment Activity & Privacy. Venmo. (2021). https://help.venmo.com/hc/en-us/articles/210413717.

Perelli, A. (2021, May 30). Venmo added new privacy options after President Joe Biden’s account was discovered. Business Insider. https://www.businessinsider.in/tech/news/venmo-added-new-privacy-options-after-president-joe-bidens-account-was-discovered/articleshow/83074180.cms.


Neurotechnologies, Privacy, and the need for a revised Belmont Report

Neurotechnologies, Privacy, and the need for a revised Belmont Report
By Filippos Papapolyzos | June 18, 2021

Ever since Elon Musk launched his futuristic venture Neuralink in 2016, the popularity of the field of neurotechnology, which sits at the intersection of neuroscience and engineering, has skyrocketed. Neuralink's goal has been to develop skull-implantable chips that interface one's brain with one's devices, a technology that the founder said, in 2017, is 8 to 10 years away. This technology would have tremendous potential for individuals with disabilities and brain-related disorders, such as allowing speech-impaired individuals to regain their voice or paraplegics to control prosthetic limbs. The company's founder, however, has largely focused on advertising a relatively more commercial version of the technology that would allow everyday users to control their smartphones or even communicate with each other using just their thoughts. To this day, the project is still very far from reality; MIT Technology Review has dubbed it neuroscience theatre aimed at stirring excitement and attracting engineers, while other experts have called it bad science fiction. Regardless of the eventual success of brain-interfacing technologies, the truth remains that we are still very underprepared from a legal and ethical standpoint, given their immense privacy implications as well as the philosophical questions they pose.

*The Link*

Implications for Privacy

Given that neurotechnologies would have direct, real-time access to one's most internal thoughts, many consider them the last frontier of privacy. All our thoughts, including our deepest secrets and perhaps even ideas we are not consciously aware of, would be digitized and transmitted to our smart devices or the cloud for processing. Not unlike other data we casually share today, this data could be passed on to third parties such as advertisers and law enforcement. Processing and storing this information in the cloud would expose it to all sorts of cybersecurity risks that would put individuals' most personal information, and even their dignity, at risk. If data breaches today expose proxies of our thoughts, i.e. the data we produce, data breaches of neural data would expose our innermost selves. Law enforcement could surveil and arrest individuals simply for thinking of committing a crime, and malicious hackers could make us think and do things against our will or extort money for our thoughts.

A slippery slope argument often made about such data sharing is that, in some ways, it already happens. Smartphones already function as extensions of our cognition, and through them we disclose all sorts of information, through social media posts or apps that monitor our fitness, for instance. A key difference between neurotechnologies and smartphones, however, is the voluntariness of our data sharing today. A social media post, for instance, constitutes an action, i.e. a thought that has been manifested, and is both consensual and voluntary in the clear majority of cases. Instagram may process our every like or engagement, but we still maintain the option of not performing that action. Neuralink would be tapping into thoughts that have not yet been manifested into action, because we have not yet applied our decision-making skills to judge the appropriateness of performing said action. Another key difference is the precision of these technologies. Neurotechnologies would not be humanity's first attempt at influencing one's course of action, but they would definitely be the most refined. Lastly, the mechanics of how the human brain functions remain vastly unexplored, and what we call consciousness may simply be the tip of the iceberg. If Neuralink were to expose what lies underneath, we would likely not be positively surprised.

*Brain Privacy*

Challenges to The Belmont Report

Since 1979, The Belmont Report has been a milestone for ethics in research and is still widely consulted by ethics committees, such as Institutional Review Boards in the context of academic research, having set out the principles of Respect for Persons, Beneficence, and Justice. With the rise of neurotechnologies, it is evident that these principles are not sufficient. The challenges brain-interfacing poses have led many experts to plead for a globally coordinated effort to draft a new set of rules guiding neurotechnologies.

The principle of Respect for Persons is rooted in the idea that individuals act as autonomous agents, which is an essential requirement for informed consent to take place. Individuals with diminished autonomy, such as children or prisoners, are entitled to special protections to ensure they are not taken advantage of. Neurotechnologies could potentially influence one's thoughts and actions, thereby undermining one's agency and, as a result, one's autonomy. The authenticity of any form of consent provided after an individual has interfaced with neurotechnology would be subject to doubt; we would not be able to judge the extent to which third parties might have participated in one's decision making.

From this point onwards, the Beneficence principle would also not be guaranteed. Human decision-making is not essentially rational, which may often lead to one’s detriment, but when it is voluntary it is seen as part of one’s self. When harm is not voluntary, it is inflicted. When consent is contested, autonomy is put under question. This means that any potential harm suffered by a person using brain-interfacing technologies could be seen as a product of said technologies and therefore as a form of inflicted harm. Since Beneficence is founded on the “do not harm” maxim, neurotechnologies pose a serious challenge to this principle.

Lastly, given that a private company would, realistically, sell the technology at a high price point, it would be accessible only to those who can afford it. If the device were to augment users' mental or physical capacities, this would severely magnify pre-existing inequalities on the basis of income, as well as willingness to use the technology. This poses a challenge to the Justice principle, as non-users could bear an asymmetric burden as a result of the benefits received by users.

In the frequently quoted masterpiece 1984, George Orwell speaks of a dystopian future where “nothing was your own except the few cubic centimeters inside your skull”. Will this future soon be past?


Can privacy coexist with technology that reads and changes brain activity?