Genetic profiling – Gaming the “lottery of life”
By Anonymous | November 20, 2019

Background: Genetics for healthier children

Large-scale genome-wide association studies have deepened our understanding of the interplay between genetic and environmental factors and individual traits. A 2015 meta-analysis of heritability studies showed that 40% of individual differences in personality are due to genetic factors, while 60% are due to environmental influences [1].

Individually, genetic differences have little consequence, but their combined effects can be large. Single nucleotide polymorphism (SNP) profiling looks for combinations of differences that research has shown to correlate with diseases such as cancer, diabetes and heart disease. Common procedures test almost one million SNPs. SNP profiling is already used to enhance desired attributes in livestock, so we can assume it will also work on people.
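
To make the idea of combining many small genetic effects concrete, here is a minimal sketch of a polygenic-style risk score computed as a weighted sum of SNP genotypes. The SNP identifiers, effect weights, and genotypes are invented for illustration and do not come from any real study.

```python
# Toy polygenic risk score: a weighted sum of SNP genotypes (0, 1, or 2
# copies of the risk allele). SNP IDs and effect weights are made up.
effect_weights = {"rs0000001": 0.12, "rs0000002": -0.05, "rs0000003": 0.30}

def polygenic_score(genotypes: dict) -> float:
    """Sum each SNP's risk-allele count times its effect weight."""
    return sum(effect_weights[snp] * count
               for snp, count in genotypes.items()
               if snp in effect_weights)

# One hypothetical individual's genotypes at the three SNPs.
person = {"rs0000001": 2, "rs0000002": 1, "rs0000003": 0}
print(polygenic_score(person))  # 2*0.12 + 1*(-0.05) + 0*0.30 = 0.19
```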

SNP profiling can be applied together with in-vitro fertilisation (IVF) to select which of a group of viable embryos should be implanted and brought to term. SNP-profiling companies such as Genomic Prediction are already offering analysis services to prospective parents.

Optimising genetic profiles promises healthier children. SNP profiling can be used to draw up lifetime risk profiles for various medical conditions, such as coronary artery disease or breast cancer [2]. The image below shows that a British woman’s average 10-year risk of developing breast cancer at the age of 47 is 2.6%. Women in the top 3% risk group reach this level already at the age of 40, while the bottom 3% do not cross this threshold until they are 80.

SNP profiling can also enable the early identification of single-gene disorders which might have severe consequences when passed on from parents to children. Furthermore, SNP profiling does not involve the dangers of gene editing, i.e. experimenting with new genetic variants.

Concerns and ethical considerations

SNP profiling raises numerous ethical questions. The technology may provide a way to influence other factors that are not directly linked to health, e.g. height and intelligence. SNP profiling has been found to be predictive of abstract characteristics such as higher-level cognitive functions (e.g. problem-solving), television-viewing habits or the likelihood of being bullied at school [2].
Profiling for non-medical attributes is not yet offered by SNP-profiling companies, but it is likely, as the science progresses, that it will be, since there are no legal obstacles restricting such services.

Private laboratories offering SNP profiling have developed intellectual property around their tests and procedures, so they may consider the resulting information proprietary. It is unclear whether prospective users of the service are sufficiently informed [3]. This might raise issues of prior consent and due process.

There is also the question of fairness: SNP profiling and IVF are costly procedures, so access is currently restricted to those with the money to pay for them. The technique could be applied from generation to generation, further improving a family’s genetic profiles over time. In this way, SNP profiling replicates the effects of “assortative mating”, where rich, successful people seek each other out as partners. But SNP profiling accelerates and strengthens this effect, thereby aggravating the privileges that rich people enjoy [2]. In a worst-case scenario, it might create a genetic elite. It might also put undue pressure on people who want to leave genetic selection up to “nature’s lottery”.

Last but not least, there is the concern of representativeness. The SNP information is gained from gene banks in industrialized countries, mainly in Europe. It is reasonable to assume that some ethnic groups (e.g. from developing countries) are under-represented in the dataset. Conclusions about preferable genetic combinations could therefore be biased and wrong.

In summary, genetic profiling may have positive health effects on future generations of children, but it is necessary to put limits on the ways that humans override natural selection mechanisms so that the technology can serve society as a whole. Direct-to-consumer medical testing needs more attention in the public debate, so that the above-mentioned ethical concerns can be properly addressed.

Content references
[1] Vukasović, T. & Bratko, D. (2015) “Heritability of personality: A meta-analysis of behavior genetic studies”, Psychol Bull. 2015 Jul;141(4):769-85
[2] Economist (2019) “Gee whizz – The new genetics”, Nov 9th 2019
[3] Beaudet, A. (2010) “Ethical issues raised by common copy number variants and single nucleotide polymorphisms of certain and uncertain significance in general medical practice”, Genome Med. 2010; 2(7): 42

Image references
[1] https://pixabay.com/illustrations/dna-biology-medicine-gene-163466/
[2] https://www.economist.com/science-and-technology/2019/11/07/modern-genetics-will-improve-health-and-usher-in-designer-children

The “Objective” Interviewer: Algorithms in the Hiring Process with HireVue
By TK Truong | November 1, 2019

Background

Across industries, Human Resource (HR) departments are relying more
heavily on algorithmically powered tools throughout the recruiting and
hiring process. One of the growing trends is the usage of video
interviewing to produce candidate assessments. Among many competitors,
HireVue is gaining ground with over 700 customers globally from various
industries such as finance, hospitality, and retail. Simultaneously,
they have also been subject to the most scrutiny over the ethics and
ramifications of using algorithms to assess individuals.

While the implementation may vary by company, HireVue generally acts as
an intermediate step in the hiring process. After a company receives job
applications, recruiters can select some individuals for HireVue
interviews, in which they video-record answers to screening questions.
The videos are then processed and analyzed by HireVue’s algorithms,
which were developed based on leading industrial-organizational
psychology research. They consider potentially tens of thousands of data
points from facial movement, vocal tone, and linguistic elements to
determine candidate scoring. From there, the company can use the
assessments to help decide which candidates move on to the next
stage, likely an in-person interview.
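
HireVue’s actual model is proprietary, so the following is only a hypothetical sketch of the general shape of such scoring: a handful of interview-derived features (all names, values, and weights invented) are normalized and combined into a single number that recruiters could rank on.

```python
# Hypothetical illustration only: HireVue's real model is proprietary.
# This sketch combines normalized interview features into one score.
FEATURE_WEIGHTS = {
    "speech_rate": 0.2,        # e.g. words per second, scaled to [0, 1]
    "positive_language": 0.5,  # e.g. fraction of positive-sentiment phrases
    "eye_contact": 0.3,        # e.g. fraction of frames facing the camera
}

def candidate_score(features: dict) -> float:
    """Weighted sum of normalized features (all assumed to lie in [0, 1])."""
    return sum(FEATURE_WEIGHTS[name] * value for name, value in features.items())

print(candidate_score({"speech_rate": 0.7, "positive_language": 0.6, "eye_contact": 0.9}))
# 0.2*0.7 + 0.5*0.6 + 0.3*0.9 = 0.71
```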

Is There Merit?

Implementing HireVue has some clear advantages for companies, especially
in terms of efficiency. For example, the hotel chain Hilton
International claims that HireVue helped it reduce its average hiring
time from six weeks to five days. Typically, recruiters must
devote a lot of time and resources to coordinate phone screenings with
candidates. These hurdles are removed with HireVue video interviews
since candidates can complete them at any time before a certain
deadline. The resulting assessments also allow recruiters to filter out
candidates much more quickly, which increases productivity.

Arguably, the typical hiring process already contains degrees of
implicit bias, but HireVue claims that their technology can introduce
impartiality. Recruiters and hiring managers can find it difficult to
explain why they prefer a particular candidate, making them the
“ultimate black box”, according to HireVue. On the other hand, HireVue utilizes
research to make their algorithms more objective and less prone to the
implicit bias that comes from the “gut feelings” that personnel may use
to make hiring decisions. People may not be aware of their prejudices,
whereas such bias can be detected and removed from HireVue’s algorithms.

But what are the concerns?

Despite good intentions to increase efficiency and reduce bias, HireVue
is ultimately open to much criticism. First, some experts cast doubt
over the validity of the algorithms since they rely on detecting emotion
and sentiment–still an emerging area of research. Aside from that, the
biggest concern is that the algorithms may perpetuate bias because so
many data points are related to verbal communication and facial
expression, which likely skews towards Western ideals and norms. This
suggests that the algorithms may be biased against non-native English
speakers and potentially, those from non-Western cultures. If such
discrimination can be proven, companies using HireVue may inadvertently
violate employment discrimination law. To combat these concerns, HireVue
cites their diverse training data, which draws from various cultures,
and their rigorous process to check for bias and revalidate their
models.
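
HireVue does not publish its bias-testing procedure, so we cannot say what it checks. One common fairness check in employment selection, which may or may not be part of their process, is the EEOC “four-fifths rule”; the sketch below applies it to invented selection counts.

```python
# The EEOC "four-fifths rule": a group's selection rate should be at least
# 80% of the highest group's rate. The counts below are invented.
def adverse_impact_ratios(selected: dict, applicants: dict) -> dict:
    """Each group's selection rate divided by the best-performing group's rate."""
    rates = {g: selected[g] / applicants[g] for g in applicants}
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}

ratios = adverse_impact_ratios(
    selected={"group_a": 40, "group_b": 20},
    applicants={"group_a": 100, "group_b": 80},
)
for group, ratio in ratios.items():
    flag = "OK" if ratio >= 0.8 else "possible adverse impact"
    print(group, round(ratio, 2), flag)
# group_a: rate 0.40 -> ratio 1.0 (OK); group_b: rate 0.25 -> ratio 0.62 (flagged)
```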

However, since HireVue’s algorithms are proprietary, we cannot verify
their precautionary measures. To promote transparency and fairness,
HireVue should provide further details such as which biases they check
for and how they work with companies to test this. Although HireVue has
an advisory board, more expertise from various disciplines is needed to
weigh in, given that they may end up enabling large scale employment
discrimination if unchecked. With close collaboration and algorithmic
maintenance, it may be possible to use HireVue to reduce bias in the
hiring process, but the lack of transparency or independent audit casts
doubt on the product and company itself.

Content References

[1] Bias, AI Ethics, and the HireVue Approach. (n.d.). Retrieved from <https://www.hirevue.com/why-hirevue/ethical-ai>.
[2] Harwell, D. (2019, October 25). A face-scanning algorithm
increasingly decides whether you deserve the job. Retrieved from <https://www.washingtonpost.com/technology/2019/10/22/ai-hiring-face-scanning-algorithm-increasingly-decides-whether-you-deserve-job/>.
[3] Larsen, L. HireVue Assessments and Preventing Algorithmic Bias.
Retrieved from <https://www.hirevue.com/blog/hirevue-assessments-and-preventing-algorithmic-bias>.

Image References

[1] <https://techcrunch.com/2013/10/02/video-interviewing-platform-hirevue-grabs-25-million-from-sequoia-for-deeper-push-into-hr/>
[2] <https://www.hirevue.com/products/hirevue-platform>

Facebook Fined $5 Billion for Data Privacy Violations – Was It Enough?
By Anonymous | November 1, 2019

On July 24, 2019 the U.S. Federal Trade Commission (FTC) announced a $5 billion penalty levied on Facebook for consumer data privacy violations. It was by far the largest privacy-related fine imposed by any entity.


Figure 1: Relative Penalties in Privacy Enforcement Actions

The FTC’s announcement explained that the penalty was in response to Facebook deceiving their platform users about users’ ability to control the privacy of their personal information, a direct violation of a 2012 FTC order against Facebook.

The latest FTC probe of Facebook was precipitated by the highly-publicized Cambridge Analytica scandal, which was widely exposed in March 2018 and involved the exploitation of personal data from up to 87 million Facebook users. The Cambridge Analytica case illustrated the extent to which Facebook deceived its users regarding the control and protection of the users’ personal information, by undermining users’ privacy preferences and failing to prevent third-party applications and data partners from misusing users’ data. Facebook was aware of the policy violations, and these tactics allowed Facebook to share users’ personal information with third-party apps that were downloaded by users’ Facebook “friends.” The FTC also flagged other Facebook failures, such as misleading tens of millions of users about their ability to control facial recognition within their accounts, despite assurances to the contrary in Facebook’s April 2018 data policy. Facebook also failed to disclose it would use users’ phone numbers for advertising purposes when users were told it needed their phone numbers for two-factor authentication.

Despite the unprecedented amount of the fine, which represented about 9% of Facebook’s 2018 revenue, many thought it did not go far enough. The FTC committee in charge of this case was split 3-2 along party lines, with the two Democrats on the committee stating that the penalty should have been larger. Others characterized the fine as a “slap on the wrist”. The fact that the stock price ticked up immediately after the announcement indicated that the market was expecting a larger penalty than what was imposed.

In addition to the monetary penalty, the FTC order placed certain restrictions on Facebook’s business operations and created new compliance requirements, all in an attempt to change the company’s consumer privacy culture and prevent future failures. The FTC required Facebook to “establish an independent privacy committee of Facebook’s board of directors, removing unfettered control by Facebook’s CEO Mark Zuckerberg over decisions affecting user privacy.” Removing members of this committee requires a supermajority of the Facebook board of directors.


Figure 2: Schematic of Facebook’s Privacy Compliance System

Facebook was also required to designate compliance officers who are responsible for Facebook’s privacy program; these individuals are subject to the approval of the board privacy committee and can be removed only by that committee. The compliance officers and Facebook CEO Mark Zuckerberg must independently submit quarterly certifications to the FTC to show that the company is complying with the mandated privacy program, which applies to Facebook, Instagram, and WhatsApp. Misrepresenting certification status can result in civil and criminal penalties for these individuals.

The other cog in the compliance system is an outside independent third-party assessor, approved by the FTC, who evaluates the effectiveness of Facebook’s privacy program and identifies gaps.

Additionally, Facebook must “conduct a privacy review of every new or modified product, service, or practice before it is implemented, and document its decisions about user privacy”, as well as document any incident where data of 500 or more users has been compromised within 30 days of discovery. Other new requirements stipulated by the FTC include increased oversight of third-party apps, utilizing user phone numbers only for security protocols, express user consent for enabling facial recognition, a comprehensive data security program, and encrypting passwords.

But will all of this be enough to coerce Facebook to respect the privacy of its users and abide by the FTC’s orders? The stakes are huge, not only for consumer privacy, but more broadly for issues as diverse as election integrity and monetary policy. Brittany Kaiser, one of the Cambridge Analytica whistleblowers, said “Facebook is actually the biggest threat to our democracy, not just foreign actors… I say that specifically because it was only two weeks ago that Mark Zuckerberg decided that Facebook is not going to censor politicians that are spreading disinformation, weaponizing racial hatred and even using voter suppression tactics”. During Zuckerberg’s October 23 congressional inquiry regarding Libra cryptocurrency, he was again questioned about Cambridge Analytica. It’s an issue that will not go away.

Ultimately, time will tell whether Facebook has learned its lessons from the recent FTC actions. If it hasn’t, everyone should hope that the FTC vigorously enforces its July order, and reacts even more strongly to prevent future harms.

AI-Powered Checkout
By Anonymous | November 1, 2019

AI-powered checkout is on the rise. Amazon plans to open close to 3,000 new Amazon Go cashierless stores in the next few years [1]. Sensing the opportunity to disrupt a multi-billion-dollar industry, a slew of competitors from Standard Cognition to Mighty AI have arisen to arm retailers with similar AI-powered camera checkout technology. The value is clear: for customers, a better, more convenient and quicker retail experience; for retailers, potentially higher sales, less theft, and lower costs. However, while most press around the technology centers on convenience and upside, one topic missing from the conversation is the potential data protection and privacy issues that may arise as computers track our appearance and movements in stores and use this massive treasure trove of data to subtly influence our daily lives.


Image Source: https://www.theverge.com/2016/12/5/13842592/amazon-go-new-cashier-less-convenience-store

First, the technology. Using sensors and cameras, merchants can track what customers take off shelves and out of the store. Using a combination of computer vision, sensor fusion and deep learning, this “just walk out” technology can detect when products are taken from or returned to the shelves and keeps track of them in your virtual cart. When you leave the store with your goods, your account is charged, and you are sent a receipt.
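
The real “just walk out” pipeline combines computer vision and sensor fusion and is not public; the sketch below only illustrates the virtual-cart bookkeeping that such shelf events would drive, with a made-up catalog and event stream.

```python
# Simplified "virtual cart" bookkeeping: shelf sensors emit take/return
# events, the cart accumulates them, and the shopper is charged on exit.
from collections import Counter

PRICES = {"milk": 2.49, "cheese": 4.99}  # hypothetical catalog

class VirtualCart:
    def __init__(self):
        self.items = Counter()

    def on_event(self, action: str, product: str):
        if action == "take":
            self.items[product] += 1
        elif action == "return" and self.items[product] > 0:
            self.items[product] -= 1

    def checkout(self) -> float:
        return sum(PRICES[p] * n for p, n in self.items.items())

cart = VirtualCart()
for action, product in [("take", "milk"), ("take", "cheese"), ("return", "cheese")]:
    cart.on_event(action, product)
print(f"charged: ${cart.checkout():.2f}")  # charged: $2.49
```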


Image source: https://techcrunch.com/2018/07/17/standard-cognition-raises-another-5-5m-to-create-a-cashier-less-checkout-experience/

While the benefits are clear, there are several potential privacy concerns arising from stores tracking and profiling customers. Recall the infamous incident of Target figuring out that a teen girl was pregnant before her father was aware [2]. With this technology, retailers will now have access to significantly more granular behavioral information on customers than when the Target incident occurred, including their appearance, their movement patterns, how long they peruse aisles before making a purchase decision, and at what time and on which days they shop. Retailers and brands could use this information to target consumers in ways that might appear intrusive. For example, you could go to a store to buy milk and cheese, only to have the store combine your appearance with your age and gender and target you with a notification to buy skincare products for a skin ailment. And in a world with rising retail data breaches [3], adding visual information to the large treasure trove of data retailers already have on consumers could make the consequences of future data breaches far more severe.

There are, however, a few simple steps that retailers can proactively take to reduce potential future harm to users. Retailers can anonymize or remove from their databases all personally identifiable information not needed for business operations and transactions, reducing the potential harm of data leaks or hacks and making it more difficult for hackers or other platforms to use the gathered data against their customers. They can also post signs outside stores notifying customers that this information is being gathered, so shoppers are at least aware and can choose not to enter, and offer direct discounts in exchange for targeted promotions so that customers receive a tangible benefit. Customers should also be given the option to opt in, and to have their data deleted if they choose to opt out.
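
As a rough illustration of the anonymization step suggested above, the sketch below drops or hashes direct identifiers before a visit record is stored for analytics; the field names and salting scheme are hypothetical.

```python
# Minimal pseudonymization sketch: strip or hash direct identifiers before
# storing a visit record for analytics. Field names are hypothetical.
import hashlib
import secrets

SALT = secrets.token_bytes(16)  # kept separate from the analytics store

def pseudonymize(record: dict) -> dict:
    cleaned = dict(record)
    cleaned.pop("face_image", None)  # drop data analytics doesn't need
    cleaned["shopper_id"] = hashlib.sha256(
        SALT + record["shopper_id"].encode()
    ).hexdigest()                    # stable pseudonym, not directly reversible
    return cleaned

visit = {"shopper_id": "user-1234", "face_image": b"...", "aisle_time_sec": 312}
print(pseudonymize(visit))  # no raw ID, no image, only behavioral fields
```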

AI-powered checkout and, more broadly, internet of things (IoT) technologies are in their infancy, and the new data they gather could have large impacts on people’s lives in the future. New technologies today are developed so quickly that legal guidelines often lag, as they take time to form and be passed into law. The general public therefore needs to push lawmakers to create new regulations early in the development of these technologies, so that customers can continue to benefit while also being protected from potential future harms.

Works Cited
[1] https://www.bloomberg.com/news/articles/2018-09-19/amazon-is-said-to-plan-up-to-3-000-cashierless-stores-by-2021
[2] https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
[3] https://www.retaildive.com/news/beyond-the-data-breach-how-retail-is-addressing-cybersecurity/555563/

Which smart city initiatives are appropriate?
By Anonymous | November 1, 2019

In countries with high-speed internet connections, the Internet of Things (IoT) is becoming more integrated into everyday life. Statista, a database company, states that revenue in the US smart home market is expected to reach $23,577m in 2019, a $3,931m (20.0%) increase from 2018. Now, more urban areas have been engaging in smart city initiatives.

Smart city initiatives can include a wide range of tools and services. PredPol in LA, which used predictive analytics to identify individuals likely to commit crimes based on data, and the banning of facial recognition for government use in four US cities are two examples of IoT and analytics tools that raise concerns: personal data collection and potential bias in the tools can bring more harm than good.

However, IoT in general has been a growing initiative in many cities around the world to increase efficiency and safety. For example, in July 2018 the International Data Corporation (IDC) forecast smart city spending of $158 billion in 2022, and in July 2019 IDC forecast $189 billion of spending on smart city initiatives in 2023. These smart city initiatives usually include some of the following areas: energy, transportation, internet access, real-time information, etc. Amsterdam, Barcelona, Singapore, South Korea, and the United States have all invested resources in smart city projects.

Amsterdam, Netherlands has been engaged in smart city efforts since before 2010 and continues to this day. Government-led projects as well as public-private platforms, notably the Amsterdam Smart City (ASC) initiative, helped transform Amsterdam through a combination of technological and data-driven changes. In 2011 the city saw a 13% energy savings on average across 16 projects, which suggests the potential for a “green” environment when the number of projects is scaled up. Additional projects include analyzing crowd movements to improve crowd control, observing areas of rainfall to measure the impact of flooding on traffic, and assessing social media posts to improve tourists’ and residents’ experience in Amsterdam by providing status updates on lines or transportation delays.

On March 13, 2019, the government of Seoul, South Korea discussed 18 projects that will use big data and information and communication technology (ICT) to turn Seoul into a smart city. Planned IoT devices include 50,000 sensors to collect fine-dust information; 500 shared parking spaces with sensors in 2019 (3,000 spaces by 2022) and applications for parking payment; and 17,820 intelligent closed-circuit TV cameras (CCTV) to alert police or firefighters in real time. In this case, the CCTVs will constantly check for incidents such as physical altercations or arson and alert firefighters or police to any dangerous situations. Rather than identifying individuals through biometric data, the intention is to detect dangerous situations.

New York City took steps to become a smart city by enabling LinkNYC, free kiosks around the city that provide not only free WiFi hotspots but also phone calls, device charging ports, and city maps and directions on tablets, all free of charge. LinkNYC is funded through advertising revenue, so it does not cost taxpayers or the government any extra money. NYC also has Transit Wireless offering free WiFi in underground subway stations for the many individuals who use public transportation.

There are many smart city initiatives that can benefit the general public and save costs for the government while respecting individual data privacy, protecting individuals or groups from filter bubbles, and providing unbiased results from the use of IoT. As long as government officials can identify the risks and benefits of IoT, smart cities can be a success story.

References

[1] https://www.link.nyc/
[2] https://k-erc.eu/korea-rd-research-trends-and-results/seoul-municipal-govt-unveils-smart-city-plan/
[3] https://www.statista.com/outlook/279/109/smart-home/united-states
[4] https://www.smartcitiesworld.net/smart-cities-news/smart-cities-news/building-the-smart-cities-of-the-future-think-long-term-and-local-3948
[5] https://www.forbes.com/sites/jamesellsmoor/2019/05/19/smart-cities-the-future-of-urban-development/#105191742f90
[6] https://sloanreview.mit.edu/case-study/data-driven-city-management/

Advocacy to End Fake or Paid News
By Ramakrishna Rangavajjula | November 1, 2019

Fake news in India, also called paid news, is the use of money or cash payments to journalists and media organizations by certain individuals or organizations so that they appear in news articles, ensuring perceived positive coverage in the eyes of the general public.

This practice is an old one, but it has lately become widespread and, more worryingly, an organized activity conducted through formal contracts with widely circulated newspapers and publications in India.

The news financially benefits individual journalists and media organizations such as newspapers, magazines and television channels. These individuals and organizations are typically paid by politicians, organizations (for-profit and non-profit) and celebrities who seek to improve their public image by increasing favorable coverage and minimizing unfavorable information about themselves.

The widespread fake news practice in India has been criticized by several advocacy groups, since it diverts coverage to whoever is willing to pay and selectively presents information to favor the payer, instead of what is significant, complete and necessary to inform the public.

Fake news corrupts information and deceives newspaper readers and television audiences, particularly given the Indian practice of not making it clear that a news item has been paid for.

As part of their campaign against the misuse of newspapers, magazines and television channels, several advocacy groups started a drive, based on learning material offered by media houses, to create awareness among news readers and television audiences about the issue of fake news.

A prominent campaign is being run by the Centre for Development of Advanced Computing (CDAC) on the portal www.infosecawareness.in. It is targeted at school children and government officials and seeks to create awareness of the dos and don’ts of media houses’ publication policies.

The advocacy groups are working on various other initiatives to make publication platforms more accountable, and efforts are underway to increase awareness of fake news among citizens through focused awareness drives.

Facebook has drawn flak from the advocacy groups over fake news on its platform that has led to multiple incidents of mob lynching across the country. Notices were issued to Facebook, warning that the digital platform will be treated as an abettor of rumor propagation, with legal consequences to follow, if adequate checks are not put in place.

Subsequently, Facebook claimed that it made significant changes to its platform to handle fake news.

Advocacy groups plan to engage with governments to emphasize the role they can play in sensitizing students and teachers to fake news in publications.

The Advocacy groups also plan to leverage the governments’ network of Common Service Centers to promote such campaigns, especially in rural areas where there has been an explosion in data services.

The content placed on digital platforms has been prepared in line with guidelines provided by the advocacy groups, which ask users to check information before publishing it online or in newspapers.

References:

1. https://www.americanpressinstitute.org/publications/reports/survey-research/paying-for-news/
2. https://timesofindia.indiatimes.com/What-is-paid-news/articleshow/6801559.cms
3. https://www.prsindia.org/report-summaries/issues-related-paid-news
4. https://www.aljazeera.com/programmes/listeningpost/2018/06/caught-camera-indian-media-outlets-paid-news-180602112100172.html

Dataception: Discussing the Metadata of Your Data
By Gurdit Chahal | November 1, 2019

Many of us have heard the adage that a picture is worth a thousand words. However, few of us may realize how a digital picture literally carries a thousand words as we share content across social media, email, and the internet in general. And no, I don’t mean your visible posts or hashtags. Instead, consider the metadata, the trail of breadcrumbs attached to your electronic and digital content that government intelligence agencies, hackers, advertisers, and others make use of to gain knowledge into private areas of our lives.

Aptly named, metadata is the data about your data. Its function is to help tag, organize, find, and work with the data it describes. For a digital image or other visual digital media, the metadata usually comes as EXIF data embedded in the file, with details regarding the make of the camera, whether flash was used, the date it was taken, GPS coordinates, etc. This information is usually generated and shipped with the image itself automatically. Other digital objects like documents can carry metadata as well. For another example showcasing the potential level of detail, check out the anatomy of metadata for a Twitter tweet in the image below.
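
For readers who want to see this for themselves, the snippet below reads the EXIF tags embedded in a photo using the Pillow library; “photo.jpg” is a placeholder path, and GPS coordinates, when present, appear under the GPSInfo tag.

```python
# Dump the EXIF metadata embedded in an image file (requires Pillow).
from PIL import Image, ExifTags

exif = Image.open("photo.jpg").getexif()      # "photo.jpg" is a placeholder
for tag_id, value in exif.items():
    name = ExifTags.TAGS.get(tag_id, tag_id)  # map numeric tag IDs to names
    print(name, value)                        # e.g. Make, DateTime, GPSInfo, ...
```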


Caption: Metadata Diagram for a Twitter Tweet

Exploring examples of metadata-based applications can help give a sense of how much we potentially expose ourselves in the digital world. A seemingly silly yet concerning demonstration of the capability is given by the self-described “data visualization experiment” at iknowwhereyourcatlives.com. To showcase the exploitability of personal information through public content, the creators collected millions of cat pictures available on the internet and pinned them to Google Maps using GPS coordinates provided in the images’ metadata, which are accurate to within about twenty feet of where each picture was taken. Basically, by posting a picture of your cat (or anything or anyone else) at home, someone can figure out where you live and be off by about three cars parked in a line.


Caption: Metadata Can Be Found With Basic Tools

Taking the “Big Brother” vibe up a notch is an example from 2009. German politician Malte Spitz took his telecommunications provider Deutsche Telekom to court in order to get access to the data, particularly the metadata, it had collected on him. With the help of journalists, Spitz was able to produce an interactive map covering six months of his life based purely on his metadata. It included where he went, where he lived, whom he talked with and for how long, and his phone contacts, among other details. On top of that, when the metadata was combined with data related to his political life, such as Twitter feeds, blogs, and other content available online, the resulting map showed not only where he was going and with whom he spoke, but also likely what he was talking about or doing throughout the day (such as rallies, flights, lunch breaks). The map is here: https://www.zeit.de/datenschutz/malte-spitz-data-retention.


Caption: Sample Frame of Malte Spitz’s Data Story

In terms of research, there have been studies that try to quantify how vulnerable we are depending on the metadata available. A 2018 study showed that, given Twitter metadata and nothing about users’ actual historical content, a machine-learning algorithm could attribute a new tweet to its author out of a group of 10,000 identified individuals with about 96% accuracy. Moreover, even after attempts to confuse the model with data obfuscation and randomization techniques, standard ways of adding noise and hiding information, performance was still near 95%. A rough analogy would be to take a local phone book, be told that one of the people listed said “I like turtles” at 2pm, and be able to use that list and some phone bill information to pinpoint who it was.
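
The cited study used far richer Twitter metadata and a much larger pool of users than can be shown here; the sketch below, run on randomly generated “metadata-style” features, only illustrates the general approach of training a classifier to attribute a new post to one of many known users.

```python
# Illustration only: attribute a post to a user from metadata-style features.
# The data here is synthetic; the cited study used real Twitter metadata.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_users, posts_per_user = 20, 50
X, y = [], []
for user in range(n_users):
    habits = rng.normal(size=3)  # each user's posting "style" (3 toy features)
    X.append(habits + rng.normal(scale=0.3, size=(posts_per_user, 3)))
    y += [user] * posts_per_user
X = np.vstack(X)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
new_post = X[0] + rng.normal(scale=0.3, size=3)  # a fresh post by user 0
print("predicted author:", clf.predict([new_post])[0])
```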

On the legal end, scholars like Helen Nissenbaum have pointed out that metadata often sits in a gray zone with respect to Fourth Amendment protections from search and seizure, since those protections hinge on what is “expected” to be private. While Europe has the GDPR, California offers one of the few American parallels, with its Electronic Communications Privacy Act requiring a warrant and its Consumer Privacy Act of 2018 giving residents (starting in 2020) the right to see what data and metadata companies collect about them and to have it deleted.

Having provided a survey-level perspective on metadata and its potential, I hope that we can be mindful not only of the worth of the picture, but also of the thousand words used to describe it.

References
[1] iknowwhereyourcatlives.com
[2] https://www.law.nyu.edu/centers/ili/metadataproject
[3] https://arxiv.org/pdf/1803.10133.pdf
[4] https://www.zeit.de/datenschutz/malte-spitz-data-retention
[5] https://opendatasecurity.io/what-is-metadata-and-what-does-it-reveal/
[6] https://www.perspectiverisk.com/metadata-and-the-risks-to-your-security/
[7] https://www.digitalcitizen.life/what-file-s-metadata-and-how-edit-it
[8] https://www.eckerson.com/articles/if-data-is-the-new-oil-metadata-is-the-new-gold
[9] https://www.datasciencecentral.com/profiles/blogs/importance-of-metadata-in-a-big-data-world
[10] https://whatis.techtarget.com/definition/image-metadata

Keeping Data Separate: Not All Personally Identifiable Information are Equal
By Francis Leung | November 1, 2019

Two-factor authentication (2FA) is a security process in which the user provides two different authentication factors to verify their identity. The first factor is commonly a password, while the second factor could be a security token or pin that is generated and sent to the user’s pre-registered email address or phone as they try to log in. 2FA is easy to implement and improves security: simply obtaining a password is no longer sufficient for attackers to access an account, and the personal nature of the second factor makes it more difficult to obtain. As a result, many online platforms offer 2FA to their users.
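
As a minimal sketch of the SMS- or email-based second factor described above (assuming a six-digit code and a five-minute expiry, which are common but not universal choices), a service might generate, deliver, and verify one-time codes roughly as follows.

```python
# Minimal one-time-code flow for a second factor: generate a short-lived
# code, deliver it out of band (SMS/email), then verify it at login.
import secrets
import time

CODE_TTL_SECONDS = 300  # assumed five-minute expiry
pending = {}            # username -> (code, issued_at); a real system persists this

def issue_code(username: str) -> str:
    code = f"{secrets.randbelow(10**6):06d}"  # random six-digit pin
    pending[username] = (code, time.time())
    return code  # in practice, sent via SMS or email rather than returned

def verify_code(username: str, attempt: str) -> bool:
    code, issued = pending.get(username, (None, 0.0))
    fresh = time.time() - issued < CODE_TTL_SECONDS
    return code is not None and attempt == code and fresh

sent = issue_code("alice")
print(verify_code("alice", sent))  # True if entered within the expiry window
```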

On October 8, 2019, Twitter disclosed that phone numbers and email addresses provided by users to set up 2FA on their accounts had inadvertently been used for targeted advertising. Twitter offers a product called Tailored Audiences, which allows advertisers to target ads to customers based on the advertiser’s own mailing lists. The advertiser uploads their marketing list to Twitter, which matches the list against the Twitter user base and generates the targeted audience for the advertiser. Twitter admitted that the personal information collected for 2FA was accessed by the Tailored Audiences product.

Similarly, a year earlier in September 2018, researchers from Northwestern University had discovered that Facebook was using phone numbers shared by users for two-factor authentication purposes for targeted marketing. When the researchers confronted Facebook, the company defended its move, saying “With regard to 2-fac specifically, we’re clear with people that we use the information people provide to offer a more personalized experience, including showing more relevant ads. So when someone adds a phone number to their account for example, at sign up, on their profile, or during the two-factor authentication signup — we use this information for the same purposes.” In addition, the only way to prevent 2FA data from being used for personalization was to remove the 2FA security feature from the user’s account.

While Facebook openly admitted it was co-mingling personally identifiable data for security and for advertising, Twitter has claimed that it was a mistake on their part. Nevertheless, these two incidents are highly concerning because they constitute both a deceptive and unfair practice and a breach of trust. Users provided their contact information for the purpose of securing their accounts, only to have it used for a completely different purpose. This is incredibly hypocritical, especially as security is supposed to prevent access to data, yet the very means of security (phone numbers and emails in this case) became the enablers of data access for advertising. To add insult to injury, one reason users sign up for 2FA in the first place is security lapses at both Twitter and Facebook, including the hacking of numerous politicians’ and public figures’ profiles. Despite Facebook’s claim that it had been clear about the use of personal information for personalized services, it is also likely that many users who signed up for 2FA were not aware that their contact information would be used for anything other than security purposes.

Although cognizant that personalized recommendations are what make platforms like Twitter and Facebook highly engaging, we nonetheless adamantly believe that data obtained for security applications should be kept separate from data used for other purposes such as personalization, even if the user is willing to provide the same personal data for both uses. This would be technologically easy to implement, and it is surprising that it is not common industry practice. In addition, if the user does want to provide their phone number or email address for both marketing features and 2FA, online platforms should ensure that this data is entered by the user separately, even at the cost of customer convenience. This way there can be no mistakes in determining which data was provided for which purpose, and it reduces the risk of data being co-mingled.
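
One minimal way to realize this separation, sketched below with hypothetical store names and fields, is to write contact details into purpose-specific stores and let the ad-audience matching read only the marketing store.

```python
# Sketch of purpose-separated storage: 2FA contact details and marketing
# contact details live in different stores, and the ads pipeline only
# ever sees the marketing store. Names and fields are hypothetical.
security_contacts = {}   # used exclusively by the 2FA flow
marketing_contacts = {}  # populated only when a user explicitly opts in

def save_contact(user_id: str, phone: str, purpose: str):
    if purpose == "2fa":
        security_contacts[user_id] = phone
    elif purpose == "marketing":
        marketing_contacts[user_id] = phone
    else:
        raise ValueError(f"unknown purpose: {purpose}")

def build_ad_audience(advertiser_numbers: set) -> set:
    # Matching runs against the marketing store only; 2FA numbers never enter it.
    return {uid for uid, phone in marketing_contacts.items() if phone in advertiser_numbers}

save_contact("u1", "+15551234567", "2fa")        # security only
save_contact("u2", "+15557654321", "marketing")  # opted in to ads
print(build_ad_audience({"+15551234567", "+15557654321"}))  # {'u2'}
```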

Given that companies like Facebook have voluntarily shared data between their different products, we also believe that regulation is needed to ensure the separation of data for different uses. For its part, Facebook was fined $5 billion by the Federal Trade Commission for various privacy lapses (including the 2FA issue) and was explicitly prohibited from using telephone numbers obtained for security for advertising. However, Twitter’s incident a year later shows that the Facebook settlement was a one-off case that did not encourage other companies to enact similar protections for 2FA personal data. Hence, given the widespread use of 2FA today, we recommend that regulators include such provisions in the latest data protection regulations to ensure that all other companies that collect personal data for security purposes also effectively protect this data.

Works Cited

Coldewey, Devin, and Natasha Lomas. “Facebook Settles with FTC: $5 Billion and New Privacy Guarantees.” TechCrunch, TechCrunch, 24 July 2019, techcrunch.com/2019/07/24/facebook-settles-with-ftc-5-billion-and-new-privacy-guarantees/.

Gesenhues, Amy. “Facebook Targets Ads with Data Users Didn’t Share with the Platform.” Marketing Land, 21 Mar. 2019, marketingland.com/facebook-targets-ads-with-data-users-didnt-share-with-the-platform-249136.

Goodin, Dan. “Twitter Transgression Proves Why Its Flawed 2FA System Is Such a Privacy Trap.” Ars Technica, 9 Oct. 2019, arstechnica.com/information-technology/2019/10/twitter-used-phone-numbers-provided-for-2fa-to-match-users-to-advertisers/.

Lomas, Natasha. “Yes Facebook Is Using Your 2FA Phone Number to Target You with Ads.” TechCrunch, TechCrunch, 27 Sept. 2018, techcrunch.com/2018/09/27/yes-facebook-is-using-your-2fa-phone-number-to-target-you-with-ads/.

Newman, Lily Hay. “Twitter Puts Profit Ahead of User Privacy-Just Like Facebook Did Before.” Wired, Conde Nast, 10 Oct. 2019, www.wired.com/story/twitter-two-factor-advertising/.

“Personal Information and Ads on Twitter.” Twitter, 8 Oct. 2019, help.twitter.com/en/information-and-ads#10-08-2019.

Rouse, Margaret, et al. “What Is Two-Factor Authentication (2FA)? – Definition from WhatIs.com.” SearchSecurity, searchsecurity.techtarget.com/definition/two-factor-authentication.

Whittaker, Zack. “Twitter Admits It Used Two-Factor Phone Numbers and Emails for Serving Targeted Ads.” TechCrunch, TechCrunch, 8 Oct. 2019, techcrunch.com/2019/10/08/twitter-admits-it-used-two-factor-phone-numbers-and-emails-for-targeted-advertising/.