March 2020 – Data Science W231 | Behind the Data: Humans and Values

March 31, 2020

Online Security in the Age of Shelter-In-Place

By Percival Chen | March 31, 2020

With shelter-in-place orders in effect around the country, the economy has continued to tank. One industry bucking the trend is video conferencing, and nowhere is this more apparent than with Zoom Video Communications, better known simply as Zoom.

According to the New York Times, “[e]ven as the stock market has plummeted, shares of Zoom have more than doubled since the beginning of the year” [1]. Some people might say business is “zoom”-ing for the videoconferencing app, but with the rise of Zoom come many concerns, especially with regard to its privacy and security practices.

One practice that has gained particular notoriety is “Zoombombing,” in which attackers hijack a meeting and show viewers whatever content they want [2]. This often leads to entire meetings getting shut down because, once an attacker has started sharing their screen, the host has no way to remove them immediately.

Another issue that Zoom has had to deal with is the “recent and sudden surge in both the volume and sensitivity of data being passed through its network” [1]. Millions of Americans, including displaced students across the nation, are shifting to this new online reality and now rely on services such as Zoom to carry on with their jobs and with school as before. This sudden surge has raised concern about potential vulnerabilities in Zoom’s current security practices, a sentiment shared even by the New York attorney general’s office. Communication is vital to the operations of businesses, corporations, schools, and our daily lives, and now that all of that traffic is being funneled through a few providers, it is little wonder that this is becoming more of a concern, especially as more sensitive information is passed along virtually.

What kind of information is particularly at risk? Personally identifiable information (PII): data that can be used to identify a specific individual. Zoom collects this basic information, as do many other companies. It includes your name, home address, email, and phone number, but also your Facebook profile information, credit or debit card (if you use it to pay for Zoom), information about your job such as your title and employer, general information about your product and service preferences, and information about your device, network, and internet connectivity [3]. With this much data in one place, an information leak or hack could lead to identity theft and other serious consequences. I conducted some UserTesting research about a month ago, and some participants said this potential issue was serious enough for them to investigate further personally, and perhaps even to look for a different service than Zoom, given that Zoom had access to a trifecta of a user’s personal, social, and financial information.

Even as this blog post is released, I am sure that Zoom’s privacy policy will continue to evolve as events unfold. It has already been updated several times since February. And while I don’t foresee video communications being shut down, given the essential role they play in the corporate scene, I do expect many (many) new tweaks to the current system, and maybe after the pandemic is over, our world will be better prepared to deal with the shifting landscape of data privacy and security.

Sources:
1. New York Attorney General Looks Into Zoom’s Privacy Practices – The New York Times. https://www.nytimes.com/2020/03/30/technology/new-york-attorney-general-zoom-privacy.html. Accessed 31 Mar. 2020.
2. Lorenz, Taylor. “‘Zoombombing’: When Video Conferences Go Wrong.” The New York Times, 20 Mar. 2020. NYTimes.com, https://www.nytimes.com/2020/03/20/style/zoombombing-zoom-trolling.html.
3. ZOOM PRIVACY POLICY – Zoom. https://zoom.us/privacy

March 30, 2020

Data Ethics in the Face of a Global Pandemic

By Natalie Wang | March 27, 2020

Since the first recorded case of Covid-19 four months ago, there have been more than 500,000 reported cases worldwide. Every day thousands of new cases of this infectious disease are being reported. Early studies show it affects people of all ages (although it appears to be particularly fatal in the elderly population) and is most often transmitted through contact with infected people. Due to its highly infectious nature, many countries have implemented international travel restrictions. Additionally, in the United States, as of March 24, more than 167 million people in 17 states, 18 counties, and 10 cities have received shelter-in-place orders from their state officials. The government is also attempting to head off the spread of Covid-19 with data.

Source: https://coronavirus.jhu.edu/map.html

As a disease progresses, there are different ways data can be used for public good. At the beginning of an outbreak, it is useful for public health officials to know some private information about infected individuals, such as exactly where they have been and who they have been in contact with. This can help determine how the virus was transmitted and whom to warn to be on the lookout for early symptoms. Other demographic and health data may also provide clues about which populations will be particularly susceptible to the virus. However, while this data is extremely useful to public health officials, especially early on when less is known about a disease, that does not mean it is useful to the public or should be made available to them. Personal privacy and the risks associated with health data still need to be taken into account when sharing this kind of data. For example, knowing the names of the individuals with the first cases of Covid-19 would not help you protect yourself any better against the virus.

Once a disease has spread through a community, specific personal information may not be as useful because an individual could have picked up the virus in many places. At this point, to prevent further spread of Covid-19, it may be useful to know where groups of people are gathering and what general transportation patterns look like. Currently, Facebook, Google, and some other tech companies are discussing sharing aggregated and anonymized user location data with the US government to analyze how effective social distancing measures are and how transportation patterns might be affecting the spread of Covid-19. Judging from the general sentiment on Twitter, people are extremely unhappy with this idea. The main argument is that providing the government with more data now will lead to further erosion of privacy down the line.
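To make “aggregated and anonymized” a little more concrete, here is a minimal sketch of one way location pings could be coarsened and counted. This is an illustration only, not a description of what Facebook or Google actually do: the grid size, the `MIN_USERS_PER_CELL` threshold, and the function names are all assumptions made for this example.

```python
# Hypothetical threshold: cells with fewer distinct users are suppressed
# so that small groups cannot be singled out.
MIN_USERS_PER_CELL = 10


def to_cell(lat, lon, precision=1):
    """Coarsen a coordinate to a grid cell by rounding.

    One decimal place of latitude is roughly 11 km.
    """
    return (round(lat, precision), round(lon, precision))


def aggregate_visits(pings, min_users=MIN_USERS_PER_CELL):
    """Count distinct users per coarse cell and drop small cells.

    `pings` is an iterable of (user_id, lat, lon) tuples. User IDs are used
    only to de-duplicate within a cell and never appear in the output.
    """
    users_per_cell = {}
    for user_id, lat, lon in pings:
        users_per_cell.setdefault(to_cell(lat, lon), set()).add(user_id)
    return {
        cell: len(users)
        for cell, users in users_per_cell.items()
        if len(users) >= min_users
    }
```

Even with coarsening and small-cell suppression, aggregated location data can still carry some re-identification risk, which is part of why the choice of what to share matters.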

Source: https://www.wired.com/story/value-ethics-using-phone-data-monitor-covid-19/

To limit the privacy risk this additional digital surveillance poses to the population, tech companies should only collect and provide data that is needed and will lead to valuable insights on Covid-19. Obviously, this is easier said than done. How do we know how much data is enough, or what will be useful? The best way to address this would be to consult experts from different fields and to keep users informed during the process. In my opinion, Covid-19 is too important an issue not to attempt to use all the resources we have. During public health emergencies, people cannot have the same level of personal privacy they have at other times. However, there should still be safeguards in place to protect a reasonable amount of privacy while also furthering public health.

As Al Gidari, the director of privacy at Stanford’s Law School, tweeted, “The balance between privacy and pandemic policy is a delicate one… Technology can save lives, but if the implementation unreasonably threatens privacy, more lives may be at risk.” As a society, we are in a situation we have never been in before; Covid-19 is a dangerous global pandemic that needs to be addressed with all the technology we have. However, because the potential for misuse of personal data is so high, there needs to be more transparency about what data is being shared and how it is being used. Additionally, people should be careful and stay informed about what is going on.

Sources:
https://www.cdc.gov/coronavirus/2019-ncov/prepare/transmission.html
https://www.nytimes.com/interactive/2020/us/coronavirus-stay-at-home-order.html
https://www.theverge.com/2020/3/12/21177129/personal-privacy-pandemic-ethics-public-health-coronavirus
https://www.wired.com/story/value-ethics-using-phone-data-monitor-covid-19/
https://www.washingtonpost.com/technology/2020/03/17/white-house-location-data-coronavirus/

March 30, 2020

Streamlining US Customs during a state of emergency

By Anonymous | March 27, 2020

I was one of those people traveling internationally when the US government started closing borders in response to the recent COVID-19 pandemic. Apple News was bombarding my phone with dramatic headlines about flight cancelations and horrendous lines at US Customs and Border Protection (CBP) crossings, interwoven with constant reminders to avoid crowded areas. My airline and every other business I deal with had sent emails about disruptions in service and overwhelmed customer service hotlines.

Privacy is important to me, so I tend not to install unnecessary apps on my phone. In the anxiety-filled 24 hours leading up to my return flight I ignored my inhibitions and loaded every app that might help me along the way: airlines, banking, cellular provider, video downloads. I turned on additional push notifications (a feature I rarely enable) from Apple News and email accounts. The additional information blasts from news outlets, service providers, and concerned relatives did little to calm my nerves.

As the plane taxied to the runway and everyone finished cleaning their surroundings with anti-bacterial wipes, a lovely stranger told me about the Mobile Passport app that I could download to skip some lines at customs. Upon landing I downloaded the app, blindly accepted the Terms of Service, and proceeded (with a cringe) to enter some of my most personally identifiable information: name, date of birth, sex, citizenship, and a clear photo of my face. I answered the basic customs declaration questionnaire on my phone and sent it off to CBP before deboarding.

WHAT IS MOBILE PASSPORT CONTROL?
Mobile Passport Control (MPC) is the first process utilizing authorized apps to streamline a traveler’s arrival into the United States. It is currently available to U.S. citizens and Canadian visitors. Eligible travelers voluntarily submit their passport information and answers to inspection-related questions to CBP via a smartphone or tablet app prior to inspection.

I noticed that the app developer was not a government organization, and at the time I didn’t care. I wanted any advantage to make my connecting flight.

When I arrived at US customs I felt that much of my anxiety was unfounded. There were no significant lines and nobody in hazmat suits checking travelers’ temperatures. There was a dedicated line for Mobile Passport users, allowing me to avoid the clusters of touch-screen kiosks where most people filled out their declaration forms. After ten days quarantined without symptoms, I remain thankful that I did not have to touch those kiosks.

Having reviewed the Mobile Passport privacy policy after accepting it, I remain comfortable using this service on an as-needed basis. That is to say: I have deleted it from my phone, and will likely re-install it for future travel. Thanks to a clear and concise policy that includes GDPR-specific requirements, a few important observations stand out:

1. The service provider uses de-identified information for targeted advertising from third party providers, yet offers a paid version that is ad-free.
2. Personally identifiable information (PII) is transmitted according to the respective requirements of CBP and TSA.
3. Transaction logs are pseudonymized. The customer’s name and passport number are not retained in association with form submission data.
4. Personal account data can be deleted upon request; however, the pseudonymized access logs cannot be deleted because they have already been de-identified.
5. Device identifiers, which are commonly used for cross-app tracking, are not stored by the service provider.
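
The policy does not say how the transaction logs are pseudonymized, so the following is only a sketch of one common approach: replacing the identifier with a keyed hash (HMAC) and leaving the name and passport number out of the log record. The field names (`port_of_entry`, `submitted_at`, `traveler_ref`) and the functions are hypothetical and are not taken from the Mobile Passport app.

```python
import hashlib
import hmac


def pseudonymize(passport_number: str, secret_key: bytes) -> str:
    """Return a keyed hash that stands in for the passport number."""
    return hmac.new(
        secret_key, passport_number.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def to_log_entry(submission: dict, secret_key: bytes) -> dict:
    """Build a log record that omits the name and passport number.

    Only the pseudonym and non-identifying form metadata are kept.
    """
    return {
        "traveler_ref": pseudonymize(submission["passport_number"], secret_key),
        "port_of_entry": submission["port_of_entry"],
        "submitted_at": submission["submitted_at"],
    }
```

In a scheme like this, whoever holds the secret key could still link log entries back to a known passport number, which is why such logs are described as pseudonymized rather than anonymous.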

While I’m not a fan of targeted advertising, I am willing to endure it for the convenience afforded by Mobile Passport. I assume that means Google and Facebook will learn a bit more about me from advertising analytics, but the fact that the app is not collecting device identifiers suggests that they will not be using my information in an aggregated manner. The fact that they acknowledge their inability to remove pseudonymized data provides a bit of comfort.

I’ve considered looking into the CBP and TSA policies to see what they have to say about my personal information, but going down the rabbit hole of the Department of Homeland Security’s (DHS) Data Management Hub didn’t seem worth the stress. It was refreshing to see that DHS clearly publishes privacy impact assessments for their various programs, so depending on how long it takes for this pandemic to subside I may still end up jumping into that hole.

March 30, 2020

AI Applications in the Military

By Stephen Holtz | March 27, 2020

In recent years countries across the world have started developing applications for artificial intelligence and machine learning for their militaries. Seven key countries lead in military applications of AI: the United States, China, Russia, the United Kingdom, France, Israel, and South Korea, with each one developing and researching weapons systems with greater autonomy.

This development indicates there is a need for leaders to examine the legal and ethical implications of this technology. Potential applications range from optimization routines for logistics planning to autonomous weapons systems that can identify and attack targets with little or no intervention from humans.

The debate has begun on a number of fronts. For example, “[t]he Canadian Armed Forces is committed to maintaining appropriate human involvement in the use of military capabilities that can exert lethal forces.” Unfortunately for the discussion at hand, the Canadian Armed Forces does not define what ‘appropriate human involvement’ means.

China has proposed a ban on the use of AI in offensive weapons, but appears to want to keep the capability for defensive weapons.

Austria has openly called for a ban on weapons that don’t have meaningful human control over critical functions, including the selection and engagement of a target.

South Korea has deployed the Super aEgis II machine gun which can identify, track, and destroy moving targets at a range of 4 kilometers. This technology can theoretically operate without human intervention and has been in use with human control since 2010.

Russia has perhaps been the most aggressive in its thinking about AI applications in the military, having proposed concepts such as AI-guided missiles that can switch targets mid-flight, autonomous AI operation systems that provide UAVs with the ability to ‘swarm’, autonomous and semi-autonomous combat systems that can make their own judgements without human intervention, unmanned tanks and torpedo boats, robot soldiers, and ‘drone submarines.’

While the United States has multiple AI combat programs in development, including an autonomous warship, the US Department of Defense has put in place a directive that requires a human operator to be kept in the loop when autonomous weapons systems take human life. This directive implies that the same rules of engagement that apply to conventional warfare also apply to autonomous systems.

Similar thinking is applied by the United Kingdom government in opposing a ban on lethal autonomous weapons, stating that current international humanitarian law already provides sufficient regulation for this area. The UK armed forces also exert human oversight and control over all weapons they employ.

The challenge with the development of ethical and legal systems to manage the development of autonomous weapons systems is that game theory is at play and the debate is not simply about what is right and wrong, but about who can exert power and influence over others. Vladimir Putin is quoted as having said in 2017 that “Artificial Intelligence is the future, not only for Russia but for all humankind… Whoever becomes the leader in this sphere will become the ruler of the world.” With the severity of the problem so succinctly put by Russia’s President, players need to evaluate the game theory before deciding on their next move.

Clearly, in a world where all parties cooperate and can be trusted to abide by the agreed rules in deed and intent, the optimal solution is for each country to devise methods of reducing waste, strengthening their borders, and learning from each other’s solutions. In this world, the basic ethical principles of the Belmont Report are useful for directing the research and development of military applications of AI. Respect for Persons would lead militaries to reduce waste through optimization, and to build defensive weapons systems. Beneficence and Justice would lead militaries to focus on the disaster response functions that they are all too often called upon to fulfill. Unfortunately, we do not always live in this world.

Should nations assess that collaborating with other nations exposes them to exploitation and domination by bad actors, they will start to develop a combination of defensive, offensive, and counter-AI measures that would breach the principles of the Belmont Report.
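
The game theory alluded to above can be sketched as a simple two-player game. The payoff numbers below are entirely hypothetical, chosen only to illustrate the dynamic described here: if each nation fears exploitation, arming becomes the best response regardless of what the other does, even though mutual restraint would leave both better off. The strategy names and payoffs are assumptions for illustration, not figures from any of the cited sources.

```python
from itertools import product

STRATEGIES = ("restrain", "arm")

# Hypothetical payoffs as (row player, column player); higher is better.
PAYOFFS = {
    ("restrain", "restrain"): (3, 3),  # mutual cooperation
    ("restrain", "arm"): (0, 4),       # row player is exploited
    ("arm", "restrain"): (4, 0),       # column player is exploited
    ("arm", "arm"): (1, 1),            # arms race
}


def pure_nash_equilibria(payoffs, strategies=STRATEGIES):
    """Return strategy pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(strategies, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_cannot_improve = all(payoffs[(alt, b)][0] <= pay_a for alt in strategies)
        b_cannot_improve = all(payoffs[(a, alt)][1] <= pay_b for alt in strategies)
        if a_cannot_improve and b_cannot_improve:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFFS))  # [('arm', 'arm')]
```

Under these assumed payoffs the only stable outcome is the arms race, which is the trap that enforceable agreements and verification are meant to escape.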

PORTLAND, Ore. (Apr. 7, 2016) Sea Hunter, an entirely new class of unmanned ocean-going vessel, gets underway on the Willamette River following a christening ceremony in Portland, Ore. Sea Hunter is part of the Defense Advanced Research Projects Agency (DARPA)’s Anti-Submarine Warfare Continuous Trail Unmanned Vessel (ACTUV) program, which, in conjunction with the Office of Naval Research (ONR), is working to fully test the capabilities of the vessel and several innovative payloads, with the goal of transitioning the technology to Navy operational use once fully proven. (U.S. Navy photo by John F. Williams/Released) 160407-N-PO203-598

Perhaps the most disturbing possibilities in autonomous weapons systems are those that involve genocide committed by a faceless, nameless machine that has been disowned by all nations and private individuals. Consider the ‘little green men’ that have fought in the Donbass region of Ukraine since 2014. Consider also the genocides that have occurred in the Balkans, Rwanda, Sudan, and elsewhere in the last fifty years. Now combine the two stories, so that groups are targeted and killed and there is no apparent human who can be tied to the killings. Scenarios like these should lead the world to broader regulatory systems whereby the humans who are capable of developing such systems are identified, registered, and subject to codes of ethics. Further, these scenarios call for a global response force to combat autonomous weapons systems should they be put to their worst uses. Finally, the global response force that identifies and responds to rogue or disowned autonomous weapons systems must develop the capability to conduct forensic investigations of those systems to determine the responsible party and to hold it to account.

Works Cited

https://mwi.usma.edu/augmented-intelligence-warrior-artificial-intelligence-machine-learning-roadmap-military/

https://business.financialpost.com/pmn/business-pmn/two-of-canadas-ai-gurus-warn-of-war-by-algorithm-as-they-win-tech-nobel-prize

https://ploughshares.ca/2019/05/more-clarity-on-canadas-views-on-military-applications-of-artificial-intelligence-needed/

https://www.researchgate.net/publication/335422076_Militarization_of_AI_from_a_Russian_Perspective

https://futureoflife.org/ai-policy-russia/?cn-reloaded=1

Russian AI-Enabled Combat: Coming to a City Near You?

https://media.defense.gov/2019/Oct/31/2002204458/-1/-1/0/DIB_AI_PRINCIPLES_PRIMARY_DOCUMENT.PDF

https://smallwarsjournal.com/jrnl/art/emerging-capability-military-applications-artificial-intelligence-and-machine-learning

https://www.cfc.forces.gc.ca/259/290/405/192/elmasry.pdf

https://www.cfc.forces.gc.ca/259/290/308/192/macdonald.pdf

The Army Needs Full-Stack Data Scientists and Analytics Translators

https://www.eda.europa.eu/webzine/issue14/cover-story/big-data-analytics-for-defence

https://en.wikipedia.org/wiki/Artificial_intelligence_arms_race

March 30, 2020

Privacy Implications for the First Wave of Ed Tech in a COVID-19 World

By Daniel Lee | March 30, 2020

Educational institutions have been among the first to implement sweeping operational changes to adjust to and combat the realities of COVID-19. As of this writing, over 100 colleges and universities have shifted their courses from physical to online classrooms to mitigate the dangers and spread of coronavirus, with a growing number of primary and secondary (K-12) school districts also rapidly adopting various distance learning models. Instead of calling on raised hands, passing out homework sheets, or filling lecture halls, instructors are now fielding homework questions via Twitter, disseminating study guides via YouTube, and leveraging online collaboration platforms such as Zoom or Google Hangouts to host synchronous lectures. This has spawned a mad rush to understand how to live and learn in this new world: instructors scrambling to digitize curriculum and content, CIO offices frantically approving new learning technologies and tools, and students and teachers alike adjusting to a wholly new learning environment.

While these changes have been largely disruptive operationally, they are remarkably less disruptive in the transformative sense compared with how we might traditionally evaluate education technology initiatives. Instead, “first wave” education technology responses to COVID-19, such as virtualizing the delivery of learning content and resources, largely focus on the substitution of physical tools or processes with virtual ones, with little to no functional change from the status quo. Dr. Ruben Puentedura’s SAMR model provides a helpful framework for thinking about the varying degrees of technology integration in teaching. Using the SAMR model, we broadly categorize most first wave initiatives as “substitution” given the primary objective of integrating technology to enable students to participate in class without being physically co-located with teachers, and not necessarily to redefine education practices or promote a higher standard for learning efficacy overall. If we can convincingly conclude that many first wave initiatives are largely focused on the substitution of existing classroom functions with digital ones, this also provides a really focused point of comparison for analyzing the legal, ethical, and privacy implications of these first wave education technologies against their analog counterparts.

Frameworks like Nissenbaum’s contextual integrity are particularly helpful here to compare and contrast the information flows and inherent privacy implications of physical tools with those of their virtual counterparts, especially for simple scenarios such as taking attendance, participating in lecture, or submitting an assignment, where the only difference between physical and virtual is the tooling itself. Traditional classroom information flows between the data subject (the student) and the recipient (the instructor) are generally subject to the transmission principle that this information is secured and employed only within the boundaries and needs of the classroom or school itself. However, when virtual tools are employed for these activities, third-party entities emerge as additional recipients of this information and compromise the contextual integrity of physical classroom activities. We reach similar conclusions when considering Solove’s taxonomy. In most scenarios, data subjects remain mostly consistent across both the physical and virtual toolsets, but the presence of third-party entities in virtual alternatives extends who participates in information collection, data holding, information processing, and information dissemination of personal student and instructor information in these scenarios.
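
As a rough illustration of the comparison above, an information flow can be modeled as a record with a subject, a recipient, and the information exchanged, and then checked against the recipients that the classroom context allows. This is a deliberately crude simplification of contextual integrity, and the class, the set of allowed recipients, and the example flows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    subject: str      # whose information it is
    recipient: str    # who receives it
    information: str  # what is shared


# Hypothetical norm for the physical classroom: information stays
# within the classroom or school.
ALLOWED_RECIPIENTS = frozenset({"instructor", "school"})


def violates_context(flow: Flow, allowed=ALLOWED_RECIPIENTS) -> bool:
    """Flag flows whose recipient falls outside the classroom context."""
    return flow.recipient not in allowed


physical = [Flow("student", "instructor", "attendance")]
virtual = [
    Flow("student", "instructor", "attendance"),
    Flow("student", "video platform", "attendance"),  # new third-party recipient
]

print([f.recipient for f in virtual if violates_context(f)])  # ['video platform']
```

The same activity (taking attendance) produces an extra flow in the virtual case, which is the point at which the original transmission principle needs to be re-examined.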

The reason for these deviations, of course, is that many first wave tools are oriented towards making analog information digital and easily accessible by many, and the digitization of these educational tools and processes, by definition, requires the hosting and distribution of classroom content by third parties. As a result, when classroom information is shared broadly through online platforms like Twitter, YouTube, or Zoom, that information is exposed to a much broader ecosystem of service providers who traditionally are not in play at all, or who participate to varying but lesser degrees, in physical classroom environments. As the virtualization of analog information also brings the possibility of generating additional information on data subjects based on their attendance and participation in specific classroom events and their social relationships to other participants, we must also consider the various ways that this data can be fused or exploited to generate additional insights about data subjects that might not have been previously accessible.

As a result, we must carefully consider how the original transmission principles and contexts are either enabled or jeopardized by the involvement of third-party entities, and identify new processes or methods for protecting personal information in virtual classroom settings. While first wave initiatives have made it possible to continue learning activities virtually in the face of COVID-19, they add complexity to the classroom privacy landscape and expose new opportunities for inappropriate disclosure or misuse of personal information. We must dutifully consider the ethical, legal, and privacy implications of these first wave technologies and the accountability for safeguarding personal data, especially as many third-party technologies abide by their own governing policies on how that information is used, disseminated, and secured.

References:
1: https://www.npr.org/2020/03/13/814974088/the-coronavirus-outbreak-and-the-challenges-of-online-only-classes

2: http://www.hippasus.com/rrpweblog/archives/2014/06/29/LearningTechnologySAMRModel.pdf

3: https://s.abcnews.com/images/International/coronavirus-school-japan-ap-rc-200305_hpMain_16x9_992.jpg

March 9, 2020

Mobile Apps Know Too Much

By Annabelle Lee | March 6, 2020

These days, we are not surprised that the technologies we use every day collect our personal data to some extent. The data they collect could be what we click on a webpage or the keywords we search for. We understand that tech companies collect data to improve their technology and for advertising purposes. But do we know exactly what they are collecting? Do we give them consent at all?

Some apps, like Expedia, Hotels.com, and Air Canada, have reportedly worked with a customer experience analytics firm called Glassbox to capture users’ every tap and keyboard entry by recording the screen. Glassbox’s recording technology lets these companies analyze the data by replaying the screenshots and recordings. However, sensitive data like passport numbers, banking information, and passwords can be exposed in the screenshots as well. In one incident, Glassbox failed to encrypt the sensitive data, which exposed 20,000 files to anyone with access to the database.
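
To illustrate what protecting sensitive fields in a session recording might look like, here is a minimal sketch that redacts values from sensitive input fields before an event is stored. It is not Glassbox’s API or any real SDK; the event shape, the field names, and the `mask_event` function are assumptions made up for this example.

```python
# Hypothetical names of input fields that should never be recorded as typed.
SENSITIVE_FIELDS = frozenset({"passport_number", "card_number", "password"})

REDACTED = "[REDACTED]"


def mask_event(event: dict, sensitive=SENSITIVE_FIELDS) -> dict:
    """Return a copy of an input event with sensitive values replaced.

    A fixed placeholder is used so that the length of the original
    value is not revealed either.
    """
    if event.get("field") in sensitive:
        return {**event, "value": REDACTED}
    return dict(event)


events = [
    {"field": "destination", "value": "Toronto"},
    {"field": "card_number", "value": "4111111111111111"},
]
recorded = [mask_event(e) for e in events]
print(recorded[1]["value"])  # [REDACTED]
```

Masking at the point of capture means the sensitive value never reaches the recording, so a later storage or encryption failure has less to expose.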

We would assume that this is communicated to users before they download and start using the apps. However, the Terms of Service for Expedia, Hotels.com, and Air Canada do not mention screen recording at all. It seems the companies are purposely not being transparent and honest about the data collection process. Users would have no idea what the app is doing in the background from the document alone. Luckily, Apple discovered the issue and sent a notice to companies that conduct screen recording for analytics in iPhone apps. In an email, Apple told them to properly disclose the screen recording or remove the code immediately; otherwise, the app would be taken down from the App Store.

I found it rather ironic that the companies only started to take action because of a private company like Apple, not because of our government’s laws or policies. Why is Apple the one guarding our data privacy rather than our government? The issue seems to be that technology is moving too fast while the lawmaking process is moving too slowly. The law does not, and lawmakers do not know how to, regulate companies’ data collection processes. There is also no penalty for companies that are not honest or transparent about their data collection in the Terms of Service or other documents.

We enjoy the benefits and convenience of cutting-edge technologies every single day. However, our laws lag behind when it comes to protecting users’ privacy and forcing companies to be transparent and honest about their work. In comparison, the EU is doing a much better job of regulating companies and protecting citizens’ rights. Hopefully, in the near future, we can find a balance among technology, privacy, and security.

Works Cited

Many popular iPhone apps secretly record your screen without asking

Apple tells app developers to disclose or remove screen recording code

Image Source

Ahead of CES, Apple touts ‘what happens on your iPhone, stays on your iPhone’ with privacy billboard in Las Vegas

https://medium.com/@tsybinanatalia/ironhack-add-a-feature-project-expedia-app-3d7e1e3c1f8

March 9, 2020

What’s Changing after YouTube’s $170 Million Child Privacy Settlement

By Haihui Cao | March 6, 2020

YouTube, owned by Google, was fined $170 million in September 2019 for violating the Children’s Online Privacy Protection Act (COPPA). According to the Federal Trade Commission (FTC), the $170 million fine is the largest COPPA fine to date. COPPA is a law that requires websites and online services to provide notice and get parental consent before collecting information from kids under 13. YouTube was accused of violating COPPA by gathering children’s data and using it to target kids with advertisements without their parents’ consent.

Although YouTube’s terms of service exclude children under 13 and it claimed to be a general-audience site, it marketed itself to advertisers as a top platform for young children. In communications with toy companies such as Mattel, maker of Barbie and Monster High toys, and Hasbro, maker of My Little Pony and Play-Doh, YouTube and Google claimed that “YouTube was unanimously voted as the favorite website for kids 2-12” and that “93% of tweens visit YouTube to watch videos”. Yet when it came to complying with COPPA, YouTube told some advertising firms that they did not have to comply with the children’s privacy law because YouTube did not have viewers under 13. In practice, YouTube does not require a user to register to view videos, and most videos are not age appropriate. Anyone can view them, and millions of children under age 13, like my kids, do watch them. YouTube served targeted advertisements on these channels even though it knew the channels were directed to children and watched by children.

So, what’s next, and how is YouTube changing? Google and YouTube need to implement a number of changes to comply with COPPA. In January 2020, YouTube introduced a system that asks video creators and channel owners to label their content as “directed to children” where applicable; personalized ads and comments are then removed from that content. Creators also have to categorize each of their previously uploaded videos, and even their entire channels, as needed. YouTube says it is using artificial-intelligence algorithms to check the content labels and that it might override some settings “in cases of error or abuse.”

The big problem is that nobody is sure exactly what “child-directed content” means. Sometimes it is too difficult to tell child-directed content apart from everything else. The FTC provides only general rules of thumb about whether content is “directed to children” and thus subject to COPPA. Popular YouTube videos watched by both children and adults, such as gaming, toy reviews, and funny family videos, may fall into gray areas. Creators will be held liable if the FTC finds COPPA violations. Google specifically told the FTC that “the current ambiguity of the COPPA Rule makes it difficult for companies to feel confident that they have implemented COPPA correctly.” YouTube could only advise creators to consult a lawyer to help them work through COPPA’s impact on their own channels. YouTube’s new COPPA-compliance rules have a significant impact on some creators, who will lose a big part of their ad income.

As a parent of kids who watch videos on YouTube, I’m glad to see the FTC’s decision, which indicates that the FTC is taking the protection of children’s data seriously. I hope the FTC and other regulators work out clearer guidelines on “child-directed” content, and that YouTube uses its advanced technology to audit “child-directed” videos more strictly in the near future.

YouTube is also promoting YouTube Kids, a separate app dedicated to kids’ videos that it launched in 2015. The app filters out the grown-up content, funnels the kid content in, and removes many of the features that are available on the main site. As a parent who is concerned about kids’ privacy and protection, yet never took the time to read the privacy policies before, I recommend that parents read the privacy policies carefully from now on and explore YouTube Kids themselves. You can set up parental controls and features such as the timer, which lets you limit how long your kids spend on the app. Is YouTube Kids safe or right for kids? There is no one-size-fits-all answer. Every family is different, and I think parents need to work with their kids to figure out the best answer for their own family.

References:
https://www.ftc.gov/news-events/press-releases/2019/09/google-youtube-will-pay-record-170-million-alleged-violations
https://www.ftc.gov/news-events/blogs/business-blog/2019/11/youtube-channel-owners-your-content-directed-children
https://support.google.com/youtube/answer/9383587?hl=en

March 9, 2020

Amazon Go: A New Era in Data Collection

By Joshua Smith | March 6, 2020

If you haven’t heard of Amazon Go yet, it’s Amazon’s newest concept store. The stores are not especially common, with only twenty-six locations in the United States, and they stock the same items as a typical small convenience store. What makes them unique is their ‘just walk out’ technology. Essentially, this means you walk in, grab what you want, and walk out without any human interaction, lines, or checkout, which would be brazen theft in any other store.

Credit to forbes.com

According to the app and a patent, Amazon accomplishes this customer experience with an impressive array of technology, including computer vision, sensor element fusion, and deep learning. This necessarily means the store is equipped with ubiquitous cameras, sensing devices, and passive scanners that Amazon seemingly has taken great care to make low-profile. Further, the technologies, and by extension the store, are heavily data-driven and rely on that information to function. Make no mistake: we’re talking about lots and lots of data.

It’s safe to assume that everything you do will be recorded, analyzed, and likely stored in some form by the cameras, sensors, and algorithms that process the data. In fairness, Amazon would be crazy not to do this from a business perspective. Learning from data is the central way to improve their product. This is a technology-dependent store, after all. Furthermore, Amazon has developed something many other retailers will be interested in for cost savings, theft mitigation, and optimization. All of these secondary effects will be driven by that data.

So, what does this mean in practical terms? It likely means that when you walk into an Amazon Go store, you are walking into a bit of a laboratory. Amazon will have the potential to answer with high specificity how people shop, interact with products, and move through stores. They will also have the ability to analyze how people interact with each other and their broader environment. This data generates a growing network of questions. Are there machine-detectable physical behaviors that will tell us whether someone is about to buy toothpaste? That someone is about to start a conversation with a stranger? That someone is attracted to another person? If they are, do they both have the same dating app so they can be connected? Biomarkers and personal traits could be stored and used to track and evaluate individuals. Is this person losing or gaining weight? Does their gait indicate they are injured or have some other medical issue? What brand is that worn-out sweater, so we can send a link to purchase a new one?

This is speculative, of course. There isn’t an outward indication that Amazon is tackling these types of problems, but they could try. For now, the microcosm of a convenience store might be too small to answer some of these questions, but we would likely be surprised by what could be discovered. As this technology becomes more common and works its way into other areas of our lives, we can almost guarantee that some company will try to answer these questions. With data sharing, that ability would be augmented considerably.

Ubiquitous sensor and vision data that is processed in this way isn’t something we’ve confronted at scale yet. Phones don’t really provide a useful analogy, since they’re largely confined to a pocket or bag and typically require an interaction. More analogous technology from law enforcement and intelligence agencies has made headlines, such as facial recognition and Gorgon Stare, but the context isn’t the same. It’s not surveillance. It’s something that is being worked into the fabric of daily life, which is an important distinction. This type of data collection represents a shift toward highly personal data that is collected passively. Your data will be collected when you do nothing other than be in a public space. It’s not clear what to do about this. It naturally entangles categories of data with elevated protections, privacy rights, and questions regarding the proper application of technology. What is clear, however, is that this change requires careful consideration.

March 9, 2020

Uber can take you somewhere, could it also take advantage of your personal information?

By Yuze Chen | March 7, 2020

Nowadays, people are using technology to help them get around the city. Ride-sharing platforms such as Uber and Lyft are used by hundreds of thousands of people around the world to get from point A to point B. These platforms offer riders lower-than-taxi fares and remove the hassle of booking an appointment with a taxi company in advance. They also give drivers an opportunity to earn extra income using their own vehicles. While they put the convenience of getting around the city at people’s fingertips, ride-sharing companies are also known for their sketchy privacy policies.

Photo by Austin Distel on Unsplash

What Data Does Uber Collect

According to Uber’s Privacy Notice, Uber collects a variety of data that falls into three main categories: data provided by the user, such as username, email, and address; data created when using Uber services, such as the user’s location, app usage, and device data; and data from other sources, such as Uber partners who provide data to Uber. Overall, personally identifiable information (PII) is collected and stored by Uber. Beyond that, Uber also collects and stores information that can be used to reconstruct an individual’s daily life, such as location history. With such a wealth of PII collected and stored by Uber, users should be concerned about what Uber can do with the data.

Photo by Charles Deluvio on Unsplash

Solove’s Taxonomy Analysis

Solove’s Taxonomy provides a framework for analyzing the potential harm to the data subject. In this case, Uber users are the data subjects. When a user uses the Uber service, their personal data is transmitted to the data holder, which is Uber. The data then goes through the Information Processing step of Solove’s Taxonomy, which includes aggregation, identification, and so on. After processing comes the Information Dissemination step, in which Uber can take action on the user’s data.

Information Collection
Let’s look at the first step in Solove’s Taxonomy: Information Collection. As discussed above, Uber collects data in three ways. However, there is a strange clause in the Location data section (III. A. 2) of Uber’s privacy notice: “precise or approximate location data from a user’s mobile device if enabled by the user… when the Uber app is running in the foreground (app open and on-screen) or background (app open but not on-screen) of their mobile device”. This means Uber can keep collecting the user’s current location in the background even when the user isn’t actively using the app. The clause gives Uber an opportunity to monitor the user’s location continuously once the user has used Uber a single time. This creates potential harm to the user, because the user’s location may be collected by Uber at all times without meaningful consent.

Information Processing
In the Information Processing step of Solove’s Taxonomy, Uber has a couple of sketchy terms in its privacy policy. In the Transaction Information section (III. A. 2), Uber states that if a user refers a new user with a promo code, both users’ information will be associated in Uber’s database. This practice could harm both users’ privacy and could have unpredictable consequences. For example, many users simply post their promo code on a public blog or discussion board. Since the referrer does not know who might use the promo code, associating two strangers creates potential privacy and legal harm. If the referred user were under criminal investigation while using Uber, the referrer could be drawn into the legal matter too, even though the two don’t know each other.

Information Dissemination
According to Privacy Policy Section III. D, Uber may share user data with third parties such as its data analytics providers and its insurance and financing partners. It does not specify who those partners are or what they are able to do with the data. There are obvious potential harms to the user during this step. The user might not want their data disclosed to any third party other than Uber, which raises further privacy concerns.

Conclusion
In our analysis of Uber’s privacy policy, we found plenty of surprising terms that could jeopardize users’ privacy. Data collection, data processing, and data dissemination all have issues with user consent and user privacy protection. Some even pull users into potential legal issues. The wording of the Privacy Notice makes it easy for Uber to take advantage of user data legally, such as by selling it to its partners. There is a lot that Uber could do to better address these privacy issues in the Privacy Notice.

References
https://www.uber.com/global/en/privacy/notice/
https://wiki.openrightsgroup.org/wiki/A_Taxonomy_of_Privacy

March 2, 2020

Assessing Clearview AI through a Privacy Lens

By Jonathan Hilton | February 28, 2020

The media has been abuzz lately about the end of privacy, driven largely by the capabilities of Clearview AI. The New York Times drew attention to the startup in an article called ‘The Secretive Company That Might End Privacy as We Know It’, which details the secretive nature of a company with expansive facial recognition capabilities. Privacy is a complicated subject that requires extensive context to understand what is and what is not acceptable in a society. In fact, privacy does not have a universal meaning within society or within the academic community. What counts as a privacy violation in one context is not considered a violation in another. Since this ambiguity exists, we can deconstruct the context of vast facial recognition capabilities with Solove’s Taxonomy to better understand the privacy concerns with Clearview AI.

Prior to examining Clearview AI’s privacy concerns with Solove’s Taxonomy, we need to address what Clearview AI is doing with its facial recognition capability. Clearview AI’s CEO, Hoan Ton-That (see figure 1), started the company back in 2017. The basic premise of the company is that its software can take a photo and use a database of billions of photos to identify the person in the photo. How did they build this database without the consent of each person whose likeness is in it? Clearview AI has scraped pictures from Facebook, YouTube, Venmo, and many others to compile its massive database. The company claims the information is ‘public’ and thus does not violate any privacy laws. So who exactly uses this software? Clearview maintains that it works with law enforcement agencies and that its product is not meant for public consumption.

Clearview CEO Hoan Ton-That

More information about Clearview AI can also be found at their website: https://clearview.ai/

After the Times published the article bringing Clearview AI’s practices to light, there has been intense backlash from the public over privacy concerns. Solove’s Taxonomy should be able to help us decompose exactly what the privacy concerns are.

Solove’s Taxonomy was published by Daniel J. Solove in January 2006. The basic components of the Taxonomy can be found in figure 2 below.

Solove’s Taxonomy

In the case of Clearview AI, the entire public is the Data Subject, or at least those who have either posted videos or pictures or were part of videos or pictures posted by others. So even if a person never posted to social media or a video site, their likeness may be part of the database simply because someone else posted their picture or even caught them in the background of a photo. According to Solove’s Taxonomy, an obvious case of harm can come from surveillance. Locations, personal associations, activities, and preferences could be derived from running a person’s photo through the system and then viewing all of the results. While Clearview AI claims on its website that its product is meant to search and not surveil, it is very difficult to see how the data could not be used for surveillance. The distinction between search and surveillance is a hard one to draw, since searches can quickly produce results that achieve surveillance.

In terms of information processing, the entire purpose of the software is to aggregate data to attain personal identification. Great concerns arise over how this type of capability and data might be used. While Clearview maintains that its software serves the needs of law enforcement, the question of secondary use is deeply worrying. How long until law-enforcement-adjacent organizations, such as bounty hunters, private security, or private investigators, gain access to the software? What other government organizations will gain access to the same capability for purposes such as taxes, welfare disbursement, airport surveillance, or government employment checks? The road to secondary uses of data by the government is well paved by historical precedent, such as the use of genealogical DNA for criminal investigations or Social Security numbers becoming de facto government ID numbers. Furthermore, what happens when Clearview AI is purchased and used by governments to repress their people? More authoritarian regimes could use the software to fully track their citizens and use it against anyone advocating human rights or democracy.

Pictures can be used for much more than law enforcement

Information dissemination also poses a risk: Clearview AI’s results could be distributed to others who would use them for nefarious purposes. For instance, if a bad actor in a government agency distributes a search of a friend or co-worker to someone else, what type of harm could that cause the person whose images were searched? Additionally, what happens when the software is either hacked or reverse engineered so that the general public can run the same searches, or when searches are sold on the dark web? It is not a stretch to imagine a person being blackmailed over their search results. We have already seen hackers encrypt hard drives to hold people ransom for either money or explicit images of the computer owner, so this would be another extension of the same practice.

Finally, invasion is a real threat to all data subjects. A search of a person’s image that reveals not only who they are but also what they are doing, when they are doing it, and with whom could be used in many ways to affect the person’s day-to-day decision making and activities. For instance, if this information can be sold to retailers or marketers, a person may find that a targeted advertisement appears at just the right place and time to influence their decision. Furthermore, the aforementioned bad actor scenario could lead to direct stalking or invasion of a personal residence if significant details are known about a person. A few iterations of this technology could find the software’s capability mounted on an AR/VR IoT device that identifies people, and perhaps some level of personal information, as they are seen in public. So basically, a person could wear the device and see who each person is as they pass by in a mall or on a street.

In conclusion, Solove’s Taxonomy does help deconstruct the privacy concerns associated with Clearview AI’s facial recognition capability. The serious concerns with the software are unlikely to go away anytime soon, and unfortunately, once this type of technology is developed, it rarely goes away either. Society will have to grapple more and more with the growing power and capability of facial recognition.

References:

https://www.popularmechanics.com/technology/security/a30613488/clearview-ai-app/
https://en.wikipedia.org/wiki/Clearview_AI

Clearview AI CEO Defends Facial Recognition Software

Solove, Daniel J., ‘A Taxonomy of Privacy’, University of Pennsylvania Law Review, Jan 2006.