The Looming Revolution of Online Advertising
By Anonymous, October 30, 2020

In the era of the internet, advertising is getting creepily accurate and powerful. Large ad networks like Google and Facebook collect huge amounts of data, from which they can infer a wide range of user characteristics, from basic demographics like age, gender, education, and parental status to broader interest categories like purchase plans, lifestyle, beliefs, and personality. With such powerful ad networks out there, users often feel they are being spied on and chased around by ads.


Image credit: privateinternetaccess.com

How is this possible?
How did we leak so much data to these companies? The answer is through cross-site and app tracking. When you surf the internet, going from one page to another, trackers collect data on where you have been and what you do. According to one Wall Street Journal study, the top fifty internet sites, from CNN to Yahoo to MSN, install an average of 64 trackers[1]. The tracking can be done by scripts, cookies, widgets, or invisible image pixels embedded in the sites you visit. You have probably seen social media sharing buttons like the ones below. Those buttons, whether or not you click them, can record your visits and send data back to the social platform.


Image credit: pcdn.co
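To make the mechanism concrete, here is a minimal sketch of the server side of an "invisible pixel" tracker, written in Python with Flask (an assumption; any web framework would do, and the domain, cookie name, and logging are invented for illustration). A publisher embeds a tag like `<img src="https://tracker.example/p.gif?page=/article">`, and every page view then reports the visitor's tracking cookie back to the tracker.

```python
# Minimal sketch of an invisible-pixel tracker (illustrative only).
# Assumes Flask is installed; domain and cookie names are hypothetical.
from flask import Flask, request, make_response

app = Flask(__name__)

# A valid 1x1 transparent GIF, the classic "tracking pixel".
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
         b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01"
         b"\x00\x00\x02\x02D\x01\x00;")

@app.route("/p.gif")
def pixel():
    visitor = request.cookies.get("vid", "new-visitor")
    page = request.args.get("page", "unknown")
    # A real tracker would append this to a cross-site browsing profile.
    app.logger.info("visitor %s viewed %s", visitor, page)
    resp = make_response(PIXEL)
    resp.headers["Content-Type"] = "image/gif"
    resp.set_cookie("vid", "12345", max_age=10**8)  # persistent third-party ID
    return resp

if __name__ == "__main__":
    app.run(port=8080)
```

Because the same pixel is embedded on many unrelated sites, the same cookie comes back from all of them, which is exactly what stitches your visits together into one profile.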

A similar story is happening in mobile apps. App developers often link in SDKs from other companies, through which they gain analytics insights or show ads. As you can imagine, those SDKs also report data back to their companies and track your activities across apps.

Why is it problematic?
Cross-site and app tracking poses serious privacy concerns. First, the whole tracking process happens behind the scenes. Most users are not aware of it until they see some creepily accurate ads, and even those who are aware often have no idea how the data is collected and used, or who owns it. Second, only very technically sophisticated people know how to prevent this tracking, which can involve tedious configuration or even installing other software. To make things worse, even if we can prevent future tracking, there is no clear way to wipe out the data already collected.

In general, cross-site and app activities are collected, sold, and monetized in various ways with very limited user transparency and control. GDPR and CCPA have significantly improved this: big trackers like Google and Facebook now provide dedicated ad settings pages, which allow users to delete or correct their data, choose how they want to be tracked, and so on. Still, though GDPR and CCPA gave users more control, most users stay with the default options, and cross-site tracking remains prevalent.

The looming revolution
With growing concerns over user privacy, Apple took radical action to kill cross-site and app tracking. Over the past couple of years, Apple gradually rolled out Safari Intelligent Tracking Prevention (ITP)[2], which curtailed companies' ability to set third-party cookies. With Apple taking the lead, the Firefox and Chrome browsers are launching similar features. With the release of iOS 14, Apple brought an ITP-like feature to the app world.


Image credit: clearcode.com

While at first glance this may sound like a long-overdue change to safeguard user privacy, on a deeper look it could create backlash. First, internet companies collect data in exchange for their free services: products like Gmail, Maps, and Facebook are free to use. According to one study cited by Vox, in an ad-free internet the average user would need to pay $35 every month to compensate for lost ad revenue[3]. Some publishers have even threatened to proactively stop working on Apple devices. Second, Apple's ITP solution doesn't give users much chance to participate. Cross-site tracking can enable more personalized services, more accurate search results, better recommendations, and so on, and some users may choose to opt in to cross-site tracking for those benefits. Third, ITP only disables third-party cookies, and there are many other ways to continue tracking: for example, ad platforms can switch to device IDs or "fingerprint" users by combining signals such as IP address and geolocation.
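The fingerprinting workaround mentioned above is easy to sketch. Below is a minimal illustration (the signal values are invented): hashing a handful of passive signals yields a stable identifier with no cookie involved, so blocking third-party cookies alone does not stop it.

```python
# Minimal sketch of browser "fingerprinting": deriving a stable
# pseudo-identifier from passive signals. Values are hypothetical.
import hashlib

def fingerprint(ip: str, user_agent: str, geo: str) -> str:
    """Hash passive signals into a short, stable identifier."""
    raw = "|".join([ip, user_agent, geo])
    return hashlib.sha256(raw.encode()).hexdigest()[:16]

print(fingerprint("203.0.113.7",
                  "Mozilla/5.0 (iPhone; CPU iPhone OS 14_0 ...)",
                  "37.77,-122.42"))
# The same signals on a later visit produce the same ID, so tracking
# survives even after third-party cookies are blocked.
```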

Other radical solutions have also been proposed, such as Andrew Yang's Data Dividend Project. With many ethical questions in play and the whole ads industry at stake, it will be very interesting to see how things unfold and what other alternatives emerge around cross-site and app tracking.

 

References

We see only shadows
By David Linnard Wheeler, October 30, 2020

After the space shuttle Challenger disaster (Figure 1) on January 28th, 1986, most people agreed on the cause of the incident: the O-rings that sealed the joints on the right solid rocket booster failed under cold conditions (Lewis, 1988). What most people failed to recognize, however, was a more fundamental problem. The casual disregard of outliers, in this case in a data set used by the scientists and engineers involved in the flight to justify the launch in cold conditions, can yield catastrophic consequences. The purpose of this essay is to show that a routine procedure for analysts and scientists, outlier removal, not only introduces biases but, under some circumstances, can actually have lethal repercussions. This observation raises important moral questions for data scientists.

Figure 1. Space shuttle Challenger disaster. Source: U.S. NEWS & WORLD REPORT

The night before the launch of the space shuttle Challenger, executives and engineers from NASA and Morton Thiokol, the manufacturer of the solid rocket boosters, met to discuss the scheduled launch over a teleconference call (Dalal et al. 1989). The subject of conversation was the sensitivity of O-rings (Figure 2) on the solid rocket boosters to the cold temperatures forecasted for the next morning.

Figure 2. Space shuttle Challenger O-rings on solid rocket boosters. Source: https://medium.com/rocket-science-falcon-9-and-spacex/space-shuttle-challenger-disaster-1986-7e05fbb03e43

Some of the engineers at Thiokol opposed the planned launch. The performance of the O-rings during the previous 23 test flights, they argued, suggested that temperature was influential (Table 1). When temperatures were low, for example between 53 and 65°F, more O-rings failed than when temperatures were higher.

Table 1: Previous flight number, temperature, pressure, number of failed O-rings, and number of total O-rings

Some personnel at both organizations did not see this trend. They focused only on the flights where at least one O-ring had failed. That is, they ignored the outlying cases where no O-rings failed because, from their perspective, those flights contributed no information (Presidential Commission on the Space Shuttle Challenger Accident, 1986). Their conclusion, upon inspection of the data in Figure 3, was that "temperature data [are] not conclusive on predicting primary O-ring blowby" (Presidential Commission on the Space Shuttle Challenger Accident, 1986). Hence, they asked Thiokol for an official recommendation to launch. It was granted.

Figure 3. O-ring failure as a function of temperature

The next morning the Challenger launched and seven people died.

After the incident, President Reagan ordered William Rogers, former Secretary of State, to lead a commission to determine the cause of the explosion. The O-rings, the Commission found, became stiff and brittle in cold temperatures and were thereby unable to maintain the seal between the joints of the solid rocket boosters. The case was solved. But a more fundamental lesson was missed.

Outliers and their removal from data sets can introduce consequential biases. Although this may seem obvious, it is not. Some practitioners of data science essentially promote cavalier removal of observations that are different from the rest. They focus instead on the biases that can be introduced when certain outliers are included in analyses.
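The statistical failure mode is easy to reproduce. Here is a minimal synthetic sketch (the numbers are invented and are not the actual flight record): looking only at flights that had a failure, temperature appears inconclusive, while restoring the zero-failure flights makes the effect unmistakable.

```python
import numpy as np

# Synthetic illustration, NOT the real Challenger data: failure
# probability rises as temperature drops.
rng = np.random.default_rng(0)
temp = rng.uniform(50, 85, size=200)            # launch temperature, °F
p_fail = 1 / (1 + np.exp(0.25 * (temp - 60)))   # colder -> more likely to fail
failed = rng.binomial(1, p_fail).astype(bool)   # at least one O-ring failure?

# View 1: failures only, as in the pre-launch teleconference.
# Failures occur across a wide range of temperatures, so the trend
# looks "inconclusive".
print("temperatures of failure flights:",
      np.round(np.sort(temp[failed])[:10], 1), "...")

# View 2: all flights, including the zero-failure "outliers".
cold, warm = temp < 65, temp >= 65
print(f"failure rate below 65°F:    {failed[cold].mean():.0%}")
print(f"failure rate at/above 65°F: {failed[warm].mean():.0%}")
```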

This practice is hubristic for at least one reason. We, as observers, do not, in most cases, completely understand the processes by which the data we collect are generated. To use Plato's allegory of the cave, we see only the shadows, not the actual objects. Indeed, this is one motivation for collecting data. To remove data without defensible justification (e.g., measurement or execution error) is to claim, even if implicitly, that we already know how the data should be distributed. If that were true, why collect data at all?

To be clear, I am not arguing that outlier removal is indefensible under any condition. Instead, I am arguing that we should exercise caution and awareness of the consequences of our actions, both when classifying observations as outliers and ignoring or removing them. This point was acknowledged by the Rogers Commission in the statement: “a careful analysis of the flight history of O-ring performance would have revealed the correlation in O-ring performance in low temperature[s]” (Presidential Commission on the space shuttle Challenger Accident, 1986).

Unlike other issues in fields like data science, the solution here may not be technical. That is, a new diagnostic technique or test will likely not emancipate us from our moral obligations to others. Instead, we may need to iteratively update our philosophies of data analysis to maximize benefits, minimize harms, and satisfy our fiduciary responsibilities to society.

 

References:

  • Dalal, S.R., Fowlkes, E.B., Hoadley, B. 1989. Risk analysis of the space shuttle: Pre-Challenger prediction of failure. Journal of the American Statistical Association.
  • Lewis, S. R. 1988. Challenger: The Final Voyage. New York: Columbia University Press.
  • United States. 1986. Report to the President. Washington, D.C.: Presidential Commission on the Space Shuttle Challenger Accident.

A Short Case for a Data Marketplace
By Linda Dong, October 23, 2020

In today’s digital, internet age, data is power. Using data, Netflix can generate recommendations, Facebook can tailor advertisements, and Visa can detect fraud. Google can predict your search phrase, Alexa can prompt you to restock household products, and Wealthfront can create your personalized retirement path, taking into account individual savings, spending, and investment goals.

Not only are data products powerful, they also tend to be lucrative. Margins are high because the cost of goods sold is so low: companies generally do not pay users for the data they collect. Whether companies channel these gains into customer savings (by making other services free) or amass them purely as profit, the central question remains: should data collection be free?

– – – –


Image Source: Robinhood

Just like oil, labor, and water, data is a commodity[1]. True, it happens to be a non-finite commodity that humans can create; however, it is also a raw material used to create sold products. Just as a bar of chocolate is made from many cacao beans, so is a web marketing analytics insight crafted from many individual browser interactions.

If you’re a chocolate maker, you’ll likely have a handful of cacao suppliers. If you’re a web analytics company, you’ll likely have millions of users providing a little data each. However, the simple facts that your suppliers are (i) distributed and (ii) orders of magnitude more numerous do not constitute adequate justification for not compensating them.

The logistics might be simpler than you think. The idea of web-based microtransactions is not new; little known to most people, the HTTP status code 402[2] has long been reserved for “Payment Required” use cases. While this was meant to power the opposite flow (a requestor presenting payment to access content, rather than a content provider paying a visitor for data gathered during an interaction), it nevertheless brings us one step closer to a future where browsers might contain native wallets that can handle hundreds of microtransactions per hour.
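As a toy illustration of the requestor-pays flow that 402 was reserved for, here is a minimal sketch using only Python's standard library. The `X-Payment-Token` header and the wallet behavior are invented for illustration; 402 has no standardized semantics today.

```python
# Minimal sketch of HTTP 402 "Payment Required" (illustrative only;
# the payment header is hypothetical, as 402 semantics are unstandardized).
from http.server import BaseHTTPRequestHandler, HTTPServer

class PaywalledHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("X-Payment-Token"):  # hypothetical wallet header
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Here is the article.")
        else:
            self.send_response(402)  # Payment Required
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Micropayment required to view this content.")

if __name__ == "__main__":
    HTTPServer(("localhost", 8402), PaywalledHandler).serve_forever()
```

A browser with a native wallet could respond to the 402 automatically, retrying the request with payment attached; the reverse flow imagined here (a site paying the visitor for data) would need new plumbing, but the status code shows the protocol has long anticipated money changing hands per request.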


Image Source: Mozilla Foundation

– – – –

Regulation lags behind innovation. While privacy concerns have culminated in new statutes regulating how entities should collect and use data, most protections today concern only data subjects’ rights and obligations. They have not yet evolved to address questions of compensation and profit-sharing.

Some of this is due to a lack of pressure from the general public, which in turn results from a lack of awareness of the value of data and from opacity regarding how companies collect and use it. Some is due to coercive user policies that foist consent to data collection on users. And some is due to the lack of a clear solution and path forward.

What if we reimagined the concept of privacy in an economic, rather than rights-based, context? Could browsers compete for users by providing more sophisticated privacy customizations? Could they better enable user control to select and disclose limited and specific data in exchange for monetary earnings? Could they auto-respond to pesky cookie preference pop-ups? Could they broker a new type of data marketplace between companies who want to buy data and users who want to sell data? Are these features valuable enough for them to charge users a fee, and would the public pay?

I, for one, would.

 

[1] https://learn.robinhood.com/articles/626haurrOd1BFJ3CkfH7xq/what-is-a-commodity/
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/402

All about Grandma
By Anonymous, October 23, 2020

My grandma Diane lives in Tulsa, OK on a small farm with one of my aunts, Heather, my uncle Carl, my two cousins Carl III and Toby, and my uncle Carl’s mom Bethanne. They raise goats and fowl, and have a couple of house dogs and some cats that come and go as they are wont to do. The farm has a pond that the dogs swim in sometimes. These are things I know because they’re my family. I’ve spent countless Thanksgivings and Christmases and been to several weddings with them.

What I didn’t know until today was that grandma is a registered Republican and Heather and Carl are registered Democrats. I didn’t set out to find this information. Rather, with the 2020 election on my mind and news media covering early voting, I decided to do a cursory search into what voting information exists in the public domain. It took less than a minute to stumble onto grandma’s voter registration on the data aggregator voterrecords.com, where voter registration records are available in searchable form for 16 states, Oklahoma included.

 

Of course, voter registration records have been public for a long time, but before sites like voterrecords.com it took real effort to go peruse voter rolls. While the process differed from state to state, you typically had to go to the local county office or the secretary of state’s office to formally request access. These barriers meant only the most interested of actors, like political parties or investigative journalists, took the time to do it. Now, this information is available almost accidentally to anyone with an internet connection anywhere in the world.

While the internet makes access to voter records fundamentally different than in the past, what makes it concerning now is the degree to which political affiliation has become enmeshed with personal identity, particularly for more extreme actors on both ends of the political spectrum, some of whom threaten violence.

To make matters much worse, voterrecords.com connects voter registration information to sites that conduct extensive background searches – truthfinder.com and beenverified.com – all without transparently labeling that its prominently displayed buttons will trigger a background search.

Truthfinder conducts a search of property records, criminal records, bankruptcy records, social media accounts, and more. While Truthfinder exploits public-records databases for much of this information, its site is also set up to use visitors’ interactions to reinforce algorithmic conclusions about which records belong to the actual person in question. Follow-on questions are framed so that most users will think the site is trying to isolate a particular individual’s records; in fact, the questions ask users to confirm or deny algorithmically generated relationships with other records the site has come across, thereby strengthening the person-matching algorithms that form the core of these sites.
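A toy sketch shows the kind of person-matching at work (the names and attributes are invented, and real record linkage is far more elaborate): fuzzy name similarity plus simple attribute agreement, with each user click supplying a free training label.

```python
# Toy sketch of person-matching / record linkage (illustrative only;
# names and attributes are invented).
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Rough 0-1 similarity between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

target = {"name": "Diane Smith", "state": "OK", "age": 78}
candidates = [
    {"name": "Diane Smith",  "state": "OK", "age": 78},   # likely the same person
    {"name": "Dianne Smith", "state": "KS", "age": 52},   # similar name, wrong person
]

for c in candidates:
    score = name_similarity(target["name"], c["name"])
    same_state = target["state"] == c["state"]
    # When a visitor clicks "yes, these records are related", that pair
    # becomes a confirmed match: labeled training data the site gets free.
    print(f"{c['name']}: similarity={score:.2f}, same_state={same_state}")
```

A near-miss like the second candidate is exactly how a stranger's record in a neighboring state can end up attached, with unwarranted confidence, to someone else's report.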

After asking several such questions, the site prompts users to search for more people, including people with whom the searcher likely has no personal connection, such as celebrities. Truthfinder charges for its services, and its model invites people to conduct ‘unlimited’ searches over a month rather than purchase individual reports. Furthermore, the generated report contains information not just about the person you’ve gone down a rabbit hole searching for but also about several people Truthfinder has determined are related to them.

It is through this that I learned, despite having known grandma all my life, that a lien was put on the farm last year, that she received her Social Security number and card around the time she turned 18 rather than at birth, and the VIN of her Toyota Sequoia. While she doesn’t have a criminal record, several people in neighboring states with similar names do. I know those people aren’t her, but someone who doesn’t know her as well may not, and might mistakenly conclude that my grandma has a problem with shoplifting. Truthfinder’s presentation makes this outcome more likely: it overstates its confidence and never discloses that the information may be linked to the wrong person, as happened in this case. This is all in addition to a litany of phone numbers, email addresses, social media accounts, Amazon wish lists, and the addresses she has lived at or co-signed for going back decades. A couple more clicks yields similar information about all of my Oklahoma relatives over the age of 18.

While voter registration records, and for that matter the other sets of public records used by these sites, may historically have had valid reasons for being in the public domain, the internet has enabled aggregation across these datasets such that it literally takes less than ten minutes to stumble unintentionally from a person’s voter record to some of the most personal aspects of their life, like bankruptcy and criminal records, and not much longer to unearth similar information about nearly everyone they are related to.

This is made all the more troubling by the devolution of public discourse and the increase in othering, as personal identities of all sorts and stripes increasingly coalesce into constellations around bipolar political affiliations, all paired with increasing rhetoric of political violence. Americans should consider carefully what information is put into the public domain, and should advocate to their state legislatures to curtail the publication and aggregation of such data sources.

To Broadcast, Promote, and Prepare: Facebook’s Alleged Culpability in the Kenosha Shootings
By Matt Kawa | October 9, 2020

The night of August 25, 2020 saw Kenosha, WI engulfed by peaceful protests, riots, arson, looting, and killing in the wake of the police shooting of Jacob Blake. In many ways Kenosha was not unlike cities all around the country facing protests, both peaceful and violent, sparked by the killing of George Floyd and others by police. However, Kenosha distinguishes itself by the fact that in the midst of the response, more people were killed: two protestors were shot and killed, and another injured, by seventeen-year-old Antioch, IL resident Kyle Rittenhouse.

Rittenhouse was compelled and mobilized to cross state lines, illegally (as a minor) in possession of a firearm, to “take up arms and defend out City [sic] from the evil thugs” who would be protesting, as posted by a local vigilante militia calling itself the Kenosha Guard. The Kenosha Guard set up a Facebook event (pictured below) entitled “Armed Citizens to Protect our Lives and Property,” in which the administrators posted the aforementioned quote (also pictured).

In addition to the egregious proliferation of racist and antisemitic rhetoric, the administrators of these Facebook groups blatantly promoted acts of violence against protestors and rioters, not only via the groups themselves but on their personal accounts as well.

On September 22, a complaint and demand for jury trial was filed with the United States District Court for the Eastern District of Wisconsin by the life partner of one of Rittenhouse’s victims and three other Kenosha residents. It names shooter Kyle Rittenhouse; Kevin Mathewson, “commander” of the Kenosha Guard; co-conspirator Ryan Balch, a member of a similar violent organization called the “Boogaloo Bois”; both organizations themselves; and, most surprisingly, Facebook, Inc.

The complaint effectively alleges negligence on the part of Facebook for allowing the vigilantes to coordinate their violent presence unchecked. The claim states that Facebook “provides the platform and tools for the Kenosha Guard, Boogaloo Bois, and other right-wing militias to recruit members and plan events.” In anticipation of a defense of ignorance, the complaint then notes that over four hundred reports were filed by users about the Kenosha Guard group and event page, expressing concern that members would seek to cause violence, intimidation, and injury, concerns which, as the complaint summarizes, ultimately did transpire.

While Facebook CEO Mark Zuckerberg did eventually apologize for his platform’s role in the incident, calling it an “operational mistake” and removing the Kenosha Guard page, the complaint claims that, as part of an observable pattern of similar behavior, Facebook “failed to act to prevent harm to Plaintiffs and other protestors” by ignoring the many reports attempting to warn it.

Ultimately, the Plaintiffs’ case rests on the Wisconsin legal principle that, “A duty consists of the obligation of due care to refrain from any act which will cause foreseeable harm to others . . . . A defendant’s duty is established when it can be said that it was foreseeable that [the] act or omission to act may cause harm to someone.” Or, simply put, Facebook had a duty to “stop the violent and terroristic threats that were made using its tools and platform,” including through inaction.

Inevitably, defenses will be made on First Amendment grounds, claiming that the Kenosha Guard and Boogaloo Bois, and their leaders and members, were simply exercising their right to freedom of speech, a right Facebook ought to afford its users. However, the Supreme Court has interpreted numerous exceptions into the First Amendment, most prominently the prohibition of incitement to violence. Whether Facebook has a moral obligation to adjudicate First Amendment claims is less clear cut. But a decision must be made, in the modern, rapidly evolving world of social media, about the platform’s role in society and what ought or ought not to be permissible enforcement of standards across the board.

The full text of the complaint can be found here.

Facing Security and Privacy Risks in the Age of Telehealth
By Anonymous | October 9, 2020

As the world grapples with the coronavirus pandemic, more healthcare providers and patients are turning to telehealth visits–visits where the patient is remote and communicates with her provider through a phone call or video conference. While telehealth visits will continue to facilitate great strides forward in terms of patient access, there are privacy risks that need to be mitigated to secure the success of remote visits.


Image: National Science Foundation

Participating in a remote visit opens a patient up to many potential touchpoints of security risk. For example, ordinary data transmissions from a mobile application or medical device, such as an insulin pump, may be inadvertently shared with a third party based on the permissions granted to applications on a patient’s mobile device. Additionally, devices that stream recordings of customer statements, such as Amazon’s Alexa, may record sensitive information communicated over the course of a remote patient visit. In some cases, a patient may have trouble using a HIPAA (Health Insurance Portability and Accountability Act) compliant telemedicine service such as Updox, and the patient and provider might instead turn to an ordinary, non-compliant Zoom call to complete their visit. How does one make the tradeoff between patient privacy and patient access?

There are steps that both patients and providers can take to mitigate the security risks that surround telehealth visits. Patients can limit the permissions of the mobile applications they use to reduce the risk of sharing sensitive information with third parties. Patients may also briefly turn off any devices that record activity in their homes. Medical professionals can ensure that only the current patient’s lab results and records are open on their laptops, to avoid inadvertently screen-sharing another patient’s data. Additionally, medical professionals and patients can work to become familiar with HIPAA-compliant telemedicine services, ensuring improved security and seamless telehealth visits.


Image: Forbes

Beyond the actions of patients and providers, patient privacy is often addressed through regulatory institutions such as the U.S. Department of Health and Human Services (HHS) with acts such as HIPAA. The HHS has recognized the need for telehealth visits during the coronavirus pandemic, stating that its Office for Civil Rights (OCR) “will not impose penalties for noncompliance with the regulatory requirements under the HIPAA Rules against covered health care providers in connection with the good faith provision of telehealth during the COVID-19 nationwide public health emergency”. As a supplement, the HHS has stated that only non-public facing telecommunication products should be used in telemedicine visits. While when the world will start to recover from the COVID-19 pandemic remains to be seen, protecting patient privacy through improved regulatory guidelines around telehealth should become a higher priority.

Further regulatory control around patient privacy with respect to telehealth will help to ensure its success. The potential benefits of remote visits are great and are quickly becoming realized. Patients with autoimmune diseases can speak to their providers from home, alleviating their higher-than-average risk of COVID-19 complications. Rural patients who once had to travel hours to see the right provider can participate in lab work and testing closer to home and discuss results and steps forward with talented healthcare providers across the country. Providers may be able to see more patients than before. Patients and providers alike can look forward to a world where telemedicine is more easily integrated into daily life, but steps should be taken to ensure patient privacy.

References

  • Germain, T. (2020, April 14). Medical Privacy Gets Complicated as Doctors Turn to Videochats. Retrieved October 05, 2020, from https://www.consumerreports.org/health-privacy/medical-privacy-gets-complicated-video-chats-with-doctors-coronavirus/
  • Hall, J. L., & McGraw, D. (2014, February 01). For Telehealth To Succeed, Privacy And Security Risks Must Be Identified And Addressed. Retrieved October 05, 2020, from https://www.healthaffairs.org/doi/full/10.1377/hlthaff.2013.0997
  • McDougall, J., Ferucci, E., Glover, J., & Fraenkel, L. (2017, October). Telerheumatology: A Systematic Review. Retrieved October 06, 2020, from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5436947/
  • Notification of Enforcement Discretion for Telehealth. (2020, March 30). Retrieved October 07, 2020, from https://www.hhs.gov/hipaa/for-professionals/special-topics/emergency-preparedness/notification-enforcement-discretion-telehealth/index.html
  • Schwab, K. (2020, August 21). Telehealth has a hidden downside. Retrieved October 07, 2020, from https://www.fastcompany.com/90542626/telehealth-has-a-hidden-downside

The TikTok Hubbub: What’s Different This Time Around?
By Anonymous | September 25, 2020

Barely three years since its creation, TikTok is the latest juggernaut to emerge in the social media landscape. With over two billion downloads (over 600 million of which occurred just this year), the short-video sharing app that lets users lip sync and share viral dances finds itself among the likes of Facebook, Twitter, and Instagram in both the size of its user base and its ubiquity in popular culture. Along with this popularity has come a firestorm of criticism related to privacy concerns, as well as powerful players in the U.S. government categorizing the app as a national security threat.


Image from: https://analyticsindiamag.com/is-tiktok-really-security-risk-or-america-being-paranoid/

Censorship
The largest reason TikTok garners such scrutiny is that the app’s parent company, ByteDance, is a Chinese company and as such is governed by Chinese law. Early criticisms of the company noted possible examples of censorship, including the removal of the account of a teen who was critical of human rights abuses by the Chinese government, and a German study that found TikTok hid posts made by LGBTQ users and those with disabilities. Exclusion of these viewpoints from the platform certainly raises censorship concerns. It is worth noting that TikTok is not actually available in China, and the company maintains that it does “not remove content based on sensitivities related to China”.

Data Collection
Like many of its counterparts, TikTok collects a vast amount of data from its users, including location, IP addresses, and browsing history. In the context of social media apps, this is the norm. It is the question of where this data might ultimately flow that garners the most criticism. The Wall Street Journal notes “concerns grow that Beijing could tap the social-media platform’s information to gather data on Americans.” The idea that this personal information could be shared with a foreign government is indeed alarming, but it might leave one wondering why regulators have been fairly easy on U.S.-based companies like Facebook, whose role in 2016’s election interference is still up for debate, or why citizens do not find it more problematic that the U.S. government frequently requests user information from Facebook and Google. In contrast to the U.S. government, the European Union has been at the forefront of protecting user privacy, taking preemptive steps by implementing the GDPR so that foreign companies, such as Facebook, could not misuse user data without consequence. It seems evident that control of personal data is a global concern, but one the U.S. takes seriously only selectively, when it stems from a foreign company.


Image from: https://www.usnews.com/news/technology/articles/2020-03-04/us-senator-wants-to-ban-federal-workers-from-using-chinese-video-app-tik-tok

The Backlash
In November 2019, with bipartisan support, a U.S. national security probe of TikTok was initiated over concerns of user data collection, content censorship, and the possibility of foreign influence campaigns. In September 2020, President Trump went so far as to implement a ban on TikTok in the U.S. Currently, it appears that Oracle has become TikTok’s “trusted tech partner” in the United States, possibly allaying some fears of where data is stored and processed for the application, and under whose authority, providing a path for TikTok to keep operating within the U.S.

For its part, TikTok is attempting to navigate very tricky geopolitical demands (the app has also been banned in India, and Japan and others may follow), even establishing a Transparency Center to “evaluate [their] moderation systems, processes and policies in a holistic manner”. Whether its actions will assuage both the public’s and the government’s misgivings is anyone’s guess, and it can also be argued that where the data it collects is purportedly stored and who owns the company are largely irrelevant to the issues raised.

As the saga over TikTok’s platform and policies continues to play out, hopefully the public and lawmakers will not miss the broader issues raised over privacy practices and user data. It is somewhat convenient to scrutinize a company from a nation with which the U.S. has substantive human rights, political, and trade disagreements. While TikTok’s policies should indeed raise concern, we would do well to ask many of the same questions of the applications we use, regardless of where they were founded.

Steps to Protect Your Online Data Privacy
By Andrew Dively | September 25, 2020

Some individuals, when asked why they don’t take more steps to protect their privacy, respond with something along the lines of “I don’t have anything to hide.” But if I asked those same individuals to send me the usernames and passwords to their email accounts, very few would grant me permission. When there is a lot of personal information about us on the internet, it can harm us in ways we never intended: future employers scouring social media for red flags, past connections searching for our physical addresses on Google, or potential litigators looking up our employer and job title on LinkedIn to determine whether we’re worth suing. This guide covers the various ways our data and lives are exposed on the web and how we can protect ourselves.

Social media is by far the worst offender when it comes to data privacy, not only because of the companies’ practices but also because of the information people willingly give up, which can be purchased by virtually any third party. I’d encourage you to Google yourself to see what comes up. If you see your page from any networking sites like LinkedIn or Facebook, there are settings to remove these from public search engines; you then have to file a request with Google to remove the links once they no longer work. Next, within the same Google results page, go to images and see what comes up. These can usually be removed as well. I would recommend removing as much Personally Identifiable Information (PII) as possible from these pages, such as current city, employers, spouses, birth dates, age, gender, pet names, or anything else that could potentially compromise your identity. Then go through your contacts and remove individuals you don’t know: even the highest security settings, which I recommend using, can be circumvented if someone makes a fake account and sends you a friend request. Each of these social media sites has a method, under privacy settings, to view your page from the perspective of an outsider; nothing should be visible other than your name and profile picture. Next we will move on to protecting your physical privacy.

If I walked up to most individuals, they wouldn’t give me their physical address either, yet it takes only five seconds to find it on Google. If you scroll down further on the page where you searched your name, you will see other sites like BeenVerified.com, Whitepages.com, and MyLife.com. All it takes for someone to find where you live on these sites is your full name, age range, and the state you live in. These sites aggregate personal information from public records and other sources and sell it to companies and individuals who may be interested. You will find your current address and the places you’ve lived for the past ten years, all of your relatives and their information, net worth, birth date, age, credit score, criminal history, and so on. The good news is that you can wipe your information from most of these sites by searching for the “opt out” form, which they are required by law to honor. If you want to take a further step, you can set up a third-party mail service or P.O. Box that gives you a physical mailing address for less than $10 per month, so you never have to give out your home address. Most people aren’t aware that even entities such as the Department of Motor Vehicles sell individuals’ address information, which gets aggregated by these companies. Protecting your physical address and other vital details can go a long way toward protecting your privacy.

The key takeaway from all of this is to think about how your data could be compromised and to take steps to protect it before something happens. There are many more potential harms out there beyond identity theft. Rather than relying on the government to regulate data privacy in the US, we as individuals can take steps to reclaim our personal privacy and freedom.

Private Surveillance versus Public Monitoring
By Anonymous | September 18, 2020

In an era where digital privacy is regarded highly, we put ourselves in a contradictory position when we embed digital devices into every aspect of our lives.

One such device that has a large fan club is the Ring doorbell, a product sold by Ring, an Amazon company. It serves the purpose of a traditional doorbell, but combined with its associated phone application, it can record audio and video and monitor motion detected within five to thirty feet of the fixture. Neighbors can then share their footage with each other for alerting and safety purposes.


Ring Video Doorbell from shop.ring.com

However, as most users of smart devices can anticipate, the data our devices generate rarely remains solely ours. Our data can enter the free market for alternate uses, analysis, and even sale. One of the main concerns that has surfaced for these nifty devices is behind-the-scenes access to the device’s data. Ring has been in partnership with law enforcement agencies across the United States. While the intentions of this partnership are broadcast as a way to increase efficiency in solving crime, it raises a larger question. The Washington Post’s Drew Harwell points out that “Ring users consent to the company giving recorded video to ‘law enforcement authorities, government officials and/or third parties’ if the company thinks it’s necessary to comply with ‘legal process or reasonable government request,’ according to its terms of service. The company says it can also store footage deleted by the user to comply with legal obligations.” This raises a larger ethical question of whether such policies infringe on an individual consumer’s autonomy per the Belmont principle of Respect for Persons. If we can’t control what our devices record and store, or what that data is used for, who should have that power?

What began as a product to protect personal property has garnered the power to become a tool for nationwide monitoring, voluntarily or involuntarily. A product intended for private-property surveillance can become a tool for public surveillance, given the authority law enforcement has to access device data. While the discussion of the power given to law enforcement agencies is larger in scope, in the context of the Ring device it leaves us wondering whether one product has garnered a beastly capability to become a tool for mass surveillance. This creates a direct link to the Fair Information Practice Principles. Per the Collection Limitation principle, the collection of personal information should be limited and obtained by consent. Ring devices blur the definition of personal information here. Is the recording of when you leave and enter your home your personal information? If your neighbor captures your movements via their device, does their consent to police requests for access compromise your personal autonomy, since it is your activity being shared?

In contrast to this, an ethical dilemma also arises. If neighbors sharing their device data with law enforcement can catch a dangerous individual (as the Ring terms and conditions contemplate), is there a moral obligation to share that data despite lacking the consent of the recorded individual? This is the blurry line between informed consent and public protection.

Lastly, as law enforcement comes to rely more easily on devices like Ring, a larger question of protection equity arises. With a base cost of approximately $200 and a monthly subscription of approximately $15 to maintain the device’s monitoring, there is a possibility of protection disparity. Will the areas where people can afford these devices inherently receive better protection from local law enforcement because it is faster and easier to solve those crimes? Per the Belmont principle of justice, the burden of civilian protection should be evenly distributed across all local law enforcement agencies. Would it be equitable if police relied on devices like this as a precursor to offering aid in resolving crimes? On the contrary, these devices can also hinder law enforcement by giving a potential suspect early warning of police searches. Is that a fair advantage?


Police officer facing Ring doorbell

These foggy implications leave once crime-cautious citizens wondering whether these devices are treading the line between data privacy and ethics concerns, and even contributing to a larger digital dystopia.

Data collection – Is it ethical?
By Sarah Wang | September 18, 2020

Companies’ data collection is growing rapidly and is projected to continue growing. Data collection refers to companies’ use of a cornucopia of collection methods and sources to capture customer data on a wide range of metrics. The type of data collected ranges from personal data, such as Social Security numbers and other identifiable information, to attitudinal data, such as consumer satisfaction, product desirability, and more.

Consumer data is collected for business purposes. For example, companies often analyze customer service records to understand, on a grand scale, which interaction methods worked well, which did not, and how customers responded. Furthermore, it is also common for companies to sell customer data to third parties for profit or business collaboration. But this process is almost never clearly disclosed to customers.

Why are companies collecting customer data?
Targeted advertising is the main driver behind customer data collection. Targeted advertising is directed toward audiences with certain traits, based on the product the advertiser is promoting. Contextualized data can help companies understand customers’ behavior and personalize marketing campaigns. As a result, the increased likelihood of a purchase transaction raises companies’ return on investment.

Concerns about data collection
Data privacy and breaches are the major concerns about data collection. Last year, major corporations such as Facebook, Google, and Uber experienced data breaches that put tens of millions of personal records into the hands of criminals. These breaches are only the “tip of the iceberg” when it comes to hacked accounts and stolen data. Consumers are beginning to take notice: in research conducted by PwC, 69% of consumers said they believe companies are vulnerable to hacks and cyberattacks.

Over time, this has caused consumers to lose trust in companies that hold their personal information. Only 10% of consumers feel they have complete control over their personal information. If customers don’t trust a business to protect their sensitive data and use it responsibly, the company will get nowhere in harnessing the value of that data to offer customers a better experience.

Last but not least, another downside of data-driven business is that it is subject to model bias. One example is how Amazon’s recruiting algorithm favored male candidates and penalized resumes that included the word “women,” because the algorithm was trained on a sample heavily biased toward male employees.
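To see how such bias arises mechanically, here is a minimal synthetic sketch (the six “resumes” and hiring outcomes are invented; this is not Amazon’s system, just the failure mode in miniature): a classifier trained on historically biased decisions learns a negative weight on the token “women”.

```python
# Synthetic sketch of training-data bias (illustrative only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "captain of chess club",           # historically hired
    "led robotics team",               # historically hired
    "built compilers at university",   # historically hired
    "captain of women's chess club",   # historically rejected
    "led women's robotics team",       # historically rejected
    "women in engineering mentor",     # historically rejected
]
hired = [1, 1, 1, 0, 0, 0]  # biased historical outcomes

vec = CountVectorizer()
X = vec.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# The learned weight on the token "women" is negative: the model has
# faithfully encoded the bias present in its training labels.
idx = vec.vocabulary_["women"]
print(f"weight on 'women': {model.coef_[0][idx]:.2f}")
```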

How to reassure customers that their data is being protected?
First and foremost, companies need to demonstrate respect for their customers by providing full transparency about what data is collected, how it will be used, and when, if ever, it will be purged.

Secondly, companies need to give customers the option of not having their data collected. Each individual should be treated as an autonomous person capable of making decisions for themselves. Behind this idea is the principle that data should be owned by customers: individuals may consent to companies using their data, but only within certain boundaries and conditions.

BIBLIOGRAPHY
1. Consumer Intelligence Series: Protect.me. PwC, September 2017
https://www.pwc.com/us/en/advisory-services/publications/consumer-intelligence-series/protect-me/cis-protect-me-findings.pdf

2. Targeted Advertising, Wikipedia, January 2017
https://en.wikipedia.org/wiki/Targeted_advertising