Operation Neptune Spear
By Chris Sanchez | March 3, 2019

Almost eight years ago, on May 1st, 2011, at 11:35pm Eastern Time, former President Barack Obama announced Operation NEPTUNE SPEAR to the world:

*“…the United States has conducted an operation that killed Osama bin Laden, the leader of al-Qaeda, and a terrorist who’s responsible for the murder of thousands of innocent men, women, and children.”*

Neptune Spear Command Center

At the time, the American public was aware that the US was engaged in combat operations in Afghanistan, but the whereabouts of Osama bin Laden, including whether he was alive or dead, were unknown. The announcement by President Obama (which, by the way, interrupted my viewing of America’s Funniest Home Videos) confirmed to the American public that Osama bin Laden:

  • Survived the US invasion of Afghanistan in 2001.
  • Had been hiding in Pakistan for several years.
  • Was killed in the raid by a highly trained (but undisclosed) US military unit.

President Obama’s announcement and subsequent reporting provided additional details about the raid and the decisions leading up to it, but the primary substance of the event can be neatly summarized in the above three bullet points. Yet much to my shock and dismay, over the following days I watched as news channels reported leaked details of the event, including classified information such as the identity of the military unit responsible for the raid, along with their call signs, identifying features, and deployment rotation cycle. None of this disclosed information materially altered the narrative of what had happened or provided any particularly useful insight into this classified military operation.

Secrecy and representative democracy have long had a tumultuous relationship, and that relationship is not likely to improve significantly in our Age of Information (On-Demand): there will always be a trade-off between government transparency and the desire to keep certain pieces of information hidden from the public in the name of national security, including economic, diplomatic, and physical security. And though it often takes major headline events (the Pentagon Papers in 1971, WikiLeaks in 2006, Edward Snowden in 2013) to jar the public consciousness, the public discussion surrounding these events often finds that the balance between transparency and secrecy is neither well monitored nor well understood by those who are elected or appointed to safeguard both the public trust and the public’s security.

Take, for instance, the Terrorist Screening Database (TSDB, commonly known as the “terrorist watchlist”). The TSDB is managed by the Terrorist Screening Center, a multi-agency organization created in 2003 by presidential directive in response to the lack of intelligence sharing across governmental agencies prior to the September 11 terrorist attacks. People, both US citizens and foreigners, who are known or suspected of having terrorist organization affiliations are placed into the TSDB, along with unique personal identifiers, including in some cases biometric information. This central repository of information is then exported across federal agencies (Department of State, Department of Homeland Security, Department of Defense, etc.) to aid in terrorist identification across passive and active channels.
TSDB Nomination Regimen

In the aftermath of the 9/11 attacks and subsequent domestic terror incidents, one would be hard pressed to argue that the TSDB is not a useful and necessary information-sharing tool for US law enforcement and other agencies responsible for domestic security. But like other instances of the government claiming the necessity of secrecy in the name of national security, there are indications that the secrecy/transparency balance is tilted in favor of unnecessary secrecy. A 2014 report from The Intercept, an award-winning news organization, presented evidence that 280,000 people in the TSDB (almost half the total number at the time) had no known terrorist group affiliation. How or why were these unaffiliated people placed into this federal database? The consequences of being placed in the TSDB are not trivial. Depending on the circumstances, individuals in the TSDB can find themselves on the “no-fly list”, have visas denied, be subjected to enhanced screenings at various checkpoints, and have their personal information (including biometric information) exposed across multiple organizations.

With an average of over 1,600 daily nominations to the TSDB, I am hard-pressed to believe that due diligence is conducted on all of those names, despite what the Federal Bureau of Investigation claims in the FAQ section of its Terrorist Screening Center website regarding the thoroughness of the TSDB nomination process. Furthermore, once nominated, it is very cumbersome for individuals to correct or remove records about them in the TSDB, in spite of a formal appeals procedure mandated by the Intelligence Reform and Terrorism Prevention Act of 2004. The Office of the Inspector General at the Department of Justice has criticized the maintainers of the TSDB for “…frequent errors and being slow to respond to complaints”. A 2007 Inspector General report found a 38% error rate in a 105-name sample from the TSDB.

As long as we live in a representative democracy that values individual privacy, free and open discussion of policy, and the applicability of Constitutional principles to all US citizens, there will always be “friction” at the nexus of government responsibility, public trust in governmental institutions, and secrecy. Trust in US governmental institutions has slowly eroded over time, due in large part to public access to previously hidden information that turned out to contradict, or be misleading relative to, what people had been told or led to believe. Experience has shown that publicly elected representatives are often not a sufficient check on the power of government agencies to strike an appropriate balance between secrecy and transparency. Fortunately, public advocacy organizations, academic institutions, investigative journalists, constitutional lawyers, and concerned citizens, while not perfect in their efforts to right perceived wrongs, have made much progress at this nexus point.

In my experience, which includes being on the front lines of the War on Terror from 2007 to 2013, the men and women who comprise the totality of “government institutions”, while imperfect, generally do have the best interests of the nation as a whole in mind when carrying out their responsibilities. Given the limitations of human decision making in times of both crisis and tranquility, there is a tendency to err on the side of secrecy in the name of security. Taken to extremes, however, this mentality can result in significant abuses of power, ranging from moderate invasions of privacy to severe abuses of personal freedoms. To compound the situation, the public erosion of trust in government creates a certain level of suspicion behind every governmental action that is not completely “above board”, even when there are very good reasons for non-public disclosure of information (such as the operational details described in the Operation Neptune Spear example cited at the beginning of this article). At the end of the day, the government will take those measures it deems necessary to secure the safety of its citizenry, even if such actions come at the expense of the rights of minority groups or those who do not find themselves in political power. I think it is our job as vigilant citizens to ensure that the balance of power is restored once the real or perceived crisis has passed.

How transparent does a government need to be? In a representative democracy it needs to be as transparent as possible without compromising public safety and security. How the US government and its citizens decide to strike that balance over the coming generations will be an interesting discussion indeed.

Primary Sources
1. en.wikisource.org/wiki/Remarks_by_the_President_on_Osama_bin_Laden
2. fas.org/sgp/crs/terror/R44678.pdf
3. theintercept.com/2014/08/05/watch-commander/
4. www.fbi.gov/file-repository/terrorist-screening-center-frequently-asked-questions.pdf/view

Data Privacy and the Chinese Social Credit System
“Keeping trust is glorious and breaking trust is disgraceful”
By Victoria Eastman | February 24, 2019

Recently, the Chinese Social Credit System has been featured on podcasts, blogs, and news articles in the United States, often highlighting the Orwellian feel of the imminent system China plans to use to encourage good behavior amongst its citizens. The broad scope of this program raises questions about data privacy, consent, algorithmic bias, and error correction.

What is the Chinese Social Credit System?

In 2014, the Chinese government released a document entitled “Planning Outline for the Construction of a Social Credit System”. The system uses a broad range of public and private data to rank each citizen on a scale from 0 to 800. Higher ratings offer citizens benefits like discounts on energy bills, more matches on dating websites, and lower interest rates. Low ratings incur punishments such as the inability to purchase plane or train tickets, banishment of you and your children from universities, and even pet confiscation in some provinces. The system has been undergoing testing in various provinces around the country with different implementations and properties, but the government plans to take the rating system nationwide in 2020.

The exact workings of the system have not been explicitly detailed by the Chinese government; however, details have emerged since the policy was announced. Data is collected from a number of private and public sources: chat and email data; online shopping history; loan and debt information; smart devices, including smartphones, smart home devices, and fitness trackers; criminal records; travel patterns and location data; and the nationwide network of millions of cameras that watch Chinese citizens. Even your family members and other people you associate with can affect your score. The government has signed up more than 44 financial institutions and has issued at least 8 licenses to private companies such as Alibaba, Tencent, and Baidu to submit data to the system. Algorithms are run over the entire dataset and generate a single credit score for each citizen.

This score will be publicly available on any number of platforms, including newspapers, online media, and even some people’s phones, so that when you call a person with a low score, you will hear a message telling you that the person you are calling has low social credit.

What does it mean for privacy and consent?

On May 1st, 2018, China announced the Personal Information Security Specification, a set of non-binding guidelines to govern the collection and use of personal data of Chinese citizens. The guidelines appear similar to the European GDPR, with some notable differences, namely a focus on national security. Under these rules, individuals have full rights to their data, including erasure, and must provide consent for any use of their personal data by the collecting company.

How do these guidelines square with the social credit system? The connection between the two policies has not been explicitly outlined by the Chinese government, but at first blush there appear to be some key conflicts between them. Do citizens have erasure power over their poor credit history or other details that negatively affect their score? Are companies required to ask for consent to send private information to the government if it is to be used in the social credit score? If the social credit score is public, how much control do individuals really have over the privacy of their data?

Other concerns have been raised about the algorithms themselves. How are individual actions weighted by the algorithm? Are some ‘crimes’ worse than others? Does recency matter? How can incorrect data be fixed? Is the government removing demographic information like age, gender, or ethnicity, or could those criteria unknowingly create bias?

Many citizens with high scores are happy with the system that gives them discounts and preferential treatment, but others fear the system will be used by the government to shape behavior and punish actions deemed inappropriate by the government. Dissidents and minority groups fear the system will be biased against them.

There are still many details that are unclear about how the system will work on a nationwide scale; however, there are clear discrepancies between the data privacy policy China announced last year and the scope of the social credit system. How the government addresses these problems will likely lead to even more podcasts, news articles, and blog posts.

Sources

Sacks, Sam. “New China Data Privacy Standard Looks More Far-Reaching than GDPR”. Center for Strategic and International Studies. Jan 29, 2018. www.csis.org/analysis/new-china-data-privacy-standard-looks-more-far-reaching-gdpr

Denyer, Simon. “China’s plan to organize its society relies on ‘big data’ to rate everyone“. The Washington Post. Oct 22, 2016. www.washingtonpost.com/world/asia_pacific/chinas-plan-to-organize-its-whole-society-around-big-data-a-rating-for-everyone/2016/10/20/1cd0dd9c-9516-11e6-ae9d-0030ac1899cd_story.html?utm_term=.1e90e880676f

Doxing: An Increased (and Increasing) Privacy Risk
By Mary Boardman | February 24, 2019

Doxing (or doxxing) is a form of online abuse in which one party releases another’s sensitive and/or personally identifiable information. While it isn’t the only risk associated with a privacy breach, it is one that can put people physically in harm’s way. The released data can include information such as name, address, and telephone number, exposing doxing victims to threats, harassment, and even violence.

People dox others for many reasons, all with the intention of harm. Because more data is more available to more people than ever, we can and should assume the risk of being doxed is also increasing. For those of us working with this data, we need to remember that there are actual humans behind the data we use. As data stewards, it is our obligation to understand the risks to these people and do what we can to protect them and their privacy interests. We need to be deserving of their trust.

Types of Data Used
To address a problem, we must first understand it. Doxing happens when direct identifiers are released, but these aren’t the only data that can lead to doxing. Some data, such as indirect identifiers, can also be used to dox people. Below are various levels of identifiability and examples of each:

  • Direct Identifier: Name, Address, SSN
  • Indirect Identifier: Date of Birth, Zip Code, License Plate, Medical Record Number, IP Address, Geolocation
  • Data Linking to Multiple Individuals: Movie Preferences, Retail Preferences
  • Data Not Linking to Any Individual: Aggregated Census Data, Survey Results
  • Data Unrelated to Individuals: Weather

Anonymization and De-anonymization of Data
Anonymization is a common response to privacy concerns and can be seen as an attempt to protect people’s privacy by removing identifiers from a dataset. However, because such data can often be de-anonymized, anonymization is not a guarantee of privacy. In fact, we should never assume that anonymization provides more than a level of inconvenience for a doxer. (And, as data professionals, we should not assume anonymization alone is enough protection.)

Generally speaking, there are four types of anonymization (a toy sketch applying each one follows the list):
1. Remove identifiers entirely.
2. Replace identifiers with codes or pseudonyms.
3. Add statistical noise.
4. Aggregate the data.
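
To make these four approaches concrete, here is a minimal Python sketch that applies each one to a made-up record set. The field names and values are invented purely for illustration and are not drawn from any real dataset or tool.

```python
# Toy illustration of the four anonymization approaches on invented data.
import random
import statistics

records = [
    {"name": "Alice", "zip": "94110", "age": 34, "purchases": 12},
    {"name": "Bob",   "zip": "94110", "age": 36, "purchases": 7},
    {"name": "Carol", "zip": "10027", "age": 29, "purchases": 20},
]

# 1. Remove identifiers entirely.
removed = [{k: v for k, v in r.items() if k != "name"} for r in records]

# 2. Replace identifiers with codes or pseudonyms.
pseudonyms = {r["name"]: f"user_{i:03d}" for i, r in enumerate(records)}
pseudonymized = [{**r, "name": pseudonyms[r["name"]]} for r in records]

# 3. Add statistical noise to a numeric attribute.
noisy = [{**r, "purchases": r["purchases"] + random.gauss(0, 2)} for r in records]

# 4. Aggregate the data (report only a per-zip-code average).
aggregated = {
    z: statistics.mean(r["purchases"] for r in records if r["zip"] == z)
    for z in {r["zip"] for r in records}
}
print(aggregated)
```

Even the strongest of these, aggregation, can leak information when groups are small, which is part of why anonymization alone is not a privacy guarantee.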

De-anonymization (or re-identification) occurs when data that had been anonymized are accurately matched with their original owner or subject. This is often done by combining two or more datasets containing different information about the same or overlapping groups of people. For instance, anonymized data could be combined with social media accounts to identify individuals. Often this risk is highest when anonymized data is sold to third parties who then re-identify people.
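
As a hedged sketch of how that linkage works, the snippet below joins an “anonymized” dataset with a public one on shared quasi-identifiers. Every record, field name, and value here is fabricated solely to show the mechanics of the join.

```python
# Hypothetical re-identification by linking two datasets on quasi-identifiers.

# An "anonymized" dataset: direct identifiers removed, quasi-identifiers kept.
health = [
    {"zip": "94110", "birthdate": "1985-02-11", "diagnosis": "asthma"},
    {"zip": "10027", "birthdate": "1990-07-30", "diagnosis": "diabetes"},
]

# A public dataset (e.g., a voter roll) that still carries names.
public = [
    {"name": "Alice Smith", "zip": "94110", "birthdate": "1985-02-11"},
    {"name": "Carol Jones", "zip": "10027", "birthdate": "1990-07-30"},
]

# Join on (zip, birthdate) to re-attach names to the "anonymous" records.
index = {(p["zip"], p["birthdate"]): p["name"] for p in public}
reidentified = [
    {"name": index[(h["zip"], h["birthdate"])], **h}
    for h in health
    if (h["zip"], h["birthdate"]) in index
]
print(reidentified)
```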


Image source: technodocbox.com/Internet_Technology/75952421-De-anonymizing-social-networks-and-inferring-private-attributes-using-knowledge-graphs.html

One example of this is Sweeney’s 2002 paper, in which she showed that 87% of the US population could be uniquely identified by just zip code, birthdate, and sex. Another example is work by Acquisti and Gross from 2009, in which they were able to predict Social Security numbers from birthdate and geographic location. Other examples include a 2018 study by Kondor, et al., who were able to identify people based on mobility and spatial data. While their study had only a 16.8% success rate after a week, this jumped to 55% after four weeks.


Image source: portswigger.net/daily-swig/block-function-exploited-to-deanonymize-social-media-accounts

Actions Moving Forward
There are many options data professionals can take, ranging from being negligent stewards who do as little as possible to applying more sophisticated approaches such as differential privacy. El Emam presented a protocol back in 2016 that does an elegant job of balancing feasibility with effectiveness when anonymizing data (a minimal sketch of the k-anonymity step follows the list below). He proposed the following steps:

1. Classify variables according to direct, indirect, and non-identifiers
2. Remove or replace direct identifiers with a pseudonym
3. Use a k-anonymity method to de-identify the indirect identifiers
4. Conduct a motivated intruder test
5. Update the anonymization with findings from the test
6. Repeat as necessary
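
As a rough illustration of step 3, here is a minimal k-anonymity check written in Python. The column names, the value of k, and the generalization choices are all hypothetical; this is a sketch of the idea, not El Emam’s own tooling.

```python
# Minimal k-anonymity check over indirect (quasi-)identifiers.
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs >= k times."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())

# Hypothetical rows after steps 1-2: direct identifiers replaced with pseudonyms,
# birthdate generalized to birth year, zip code truncated to three digits.
rows = [
    {"pseudonym": "user_001", "birth_year": 1985, "zip3": "941"},
    {"pseudonym": "user_002", "birth_year": 1985, "zip3": "941"},
    {"pseudonym": "user_003", "birth_year": 1990, "zip3": "100"},
    {"pseudonym": "user_004", "birth_year": 1990, "zip3": "100"},
]

print(is_k_anonymous(rows, ["birth_year", "zip3"], k=2))  # True for this toy data
```

If the check fails, the usual response is to generalize the quasi-identifiers further (coarser dates, shorter zip prefixes) or suppress the offending rows, and then run the motivated intruder test again.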

We are unlikely to ever truly know the risk of doxing (and, with it, de-anonymization of PII). However, we need to assume de-anonymization is always possible. Because our users trust us with their data and their assumed privacy, we need to make sure their trust is well placed and be vigilant stewards of their data and privacy interests. What we do and the steps we take as data professionals can and do have an impact on the lives of the people behind the data.

Works Cited:
Acquisti, A., & Gross, R. (2009). Predicting Social Security numbers from public data. Proceedings of the National Academy of Sciences, 106(27), 10975–10980. doi.org/10.1073/pnas.0904891106
Electronic Privacy Information Center. (2019). EPIC – Re-identification. Retrieved February 3, 2019, from epic.org/privacy/reidentification/
El Emam, Khaled. (2016). A de-identification protocol for open data. In Privacy Tech. International Association of Privacy Professionals. Retrieved from iapp.org/news/a/a-de-identification-protocol-for-open-data/
Federal Bureau of Investigation. (2011, December 18). (U//FOUO) FBI Threat to Law Enforcement From “Doxing” | Public Intelligence [FBI Bulletin]. Retrieved February 3, 2019, from publicintelligence.net/ufouo-fbi-threat-to-law-enforcement-from-doxing/
Lubarsky, Boris. (2017). Re-Identification of “Anonymized” Data. Georgetown Law Technology Review. Retrieved from georgetownlawtechreview.org/re-identification-of-anonymized-data/GLTR-04-2017/
Narayanan, A., Huey, J., & Felten, E. W. (2016). A Precautionary Approach to Big Data Privacy. In S. Gutwirth, R. Leenes, & P. De Hert (Eds.), Data Protection on the Move (Vol. 24, pp. 357–385). Dordrecht: Springer Netherlands. doi.org/10.1007/978-94-017-7376-8_13
Narayanan, A., & Shmatikov, V. (2010). Myths and fallacies of “personally identifiable information.” Communications of the ACM, 53(6), 24. doi.org/10.1145/1743546.1743558
Snyder, P., Doerfler, P., Kanich, C., & McCoy, D. (2017). Fifteen minutes of unwanted fame: detecting and characterizing doxing. In Proceedings of the 2017 Internet Measurement Conference on – IMC ’17 (pp. 432–444). London, United Kingdom: ACM Press. doi.org/10.1145/3131365.3131385
Sweeney, L. (2002). k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(05), 557–570. doi.org/10.1142/S0218488502001648

Over 17k Android Apps in the Hot Seat for Violating Privacy Rules
A new ICSI study shows that Google’s user-resettable advertising IDs aren’t working
by Kathryn Hamilton (www.linkedin.com/in/hamiltonkathryn/)
February 24, 2019

What’s going on?
On February 14th 2019, researchers from the International Computer Science Institute (ICSI) published an article claiming that thousands of Android apps are breaking Google’s privacy rules. ICSI claims that while Google provides users with advertising privacy controls, these controls aren’t working. ICSI is concerned for users’ privacy and is looking for Google to address the problem.

But what exactly are the apps doing wrong? Since 2013, Google has required that apps record only the user’s “Ad ID” as an individual identifier. This is a unique code associated with each device that advertisers use to profile users over time. To ensure control remains in the hands of each user, Google allows users to reset their Ad ID at any time. This effectively resets everything that advertisers know about a person, so that their ads are once again anonymous.

Unfortunately, ICSI found that some apps are recording other identifiers too, many of which the user cannot reset. These extra identifiers are typically hardware-related, such as the IMEI, MAC address, SIM card ID, or device serial number.


Android’s Ad ID Settings

How does this violate privacy?

Let’s say you’ve downloaded one of the apps that ICSI has identified as being in violation. The list includes everything from Audible and Angry Birds to Flipboard News and antivirus software.

The app sends data about your interests to its advertisers. Included are your resettable advertising ID and your device’s IMEI, a non-resettable code that should not be there. Over time, the ad company begins to build an advertising profile about you, and the ads you see become increasingly personalized.

Eventually, you decide to reset your Ad ID to anonymize yourself. The next time you use the app, it will again send data to its advertisers about your interests, plus your new advertising ID and the same old IMEI.

To a compliant advertiser, you would appear to be a new person—this is how the Ad ID system is supposed to work. For the noncompliant app, however, advertisers simply match your IMEI to the old record they had about you and associate your two Ad IDs together.

Just like that, all your ads go back to being fully personalized, with all the same data that existed before you reset your Ad ID.
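
To see why the hardware identifier defeats the reset, here is a small Python simulation of the matching logic described above. All identifiers are fake, and this is not code from ICSI’s study or from any actual ad network; it simply shows how a non-resettable ID lets two Ad IDs be stitched together.

```python
# Hypothetical simulation: a non-resettable hardware ID (fake "IMEI") lets an
# ad network link a freshly reset Ad ID back to the old profile.

profiles = {}  # advertising profiles keyed by Ad ID

def ingest(ad_id, imei, interests):
    """Record incoming app data, merging profiles that share a hardware ID."""
    merged = set(interests)
    for old_ad_id, profile in profiles.items():
        if profile["imei"] == imei and old_ad_id != ad_id:
            # Same device seen under a different Ad ID: carry the history over.
            merged |= profile["interests"]
    profiles[ad_id] = {"imei": imei, "interests": merged}

ingest("ad-1111", "imei-42", ["hiking", "audiobooks"])
ingest("ad-9999", "imei-42", ["news"])  # after the user resets their Ad ID

# The "new" profile already contains the old interests.
print(profiles["ad-9999"]["interests"])
```

A compliant data flow would drop the IMEI entirely, so resetting the Ad ID would genuinely sever the link.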

But they’re just ads. Can this really harm me?

I’m sure you have experienced the annoyance of being followed by ads after visiting a product’s page once, perhaps even by accident. Or maybe you’ve tried to purchase something secretly for a loved one and had the surprise ruined by a side banner ad. The tangible harm to a given consumer might not be life-altering, but it does exist.

Regardless, the larger controversy here is not the direct harm to a consumer but rather the blatant lack of care or conscience exhibited by the advertisers. This is an example of the ever-present trend of companies being overly aggressive in the name of profit, and not respecting the mental and physical autonomy that should be fundamentally human.

This problem will only grow as personal data becomes more plentiful and more easily accessible. If we’re having this much difficulty anonymizing ads, what kind of trouble will we face when it comes to bigger issues or more sensitive information?

What is going to happen about it?

At this point, you might be thinking that your phone’s app list is due for some attention. Take a look through your apps and delete those you don’t need or use—it’s good practice to clear the clutter regardless of whether an app is leaking data. If you have questions about specific apps, search ICSI’s Android app analytics database, which has privacy reports for over 75,000 Android apps.

In the bigger picture, it’s not immediately clear that Google, app developers, or advertisers have violated any privacy law or warrant government investigation. More likely, it seems that Google is in the public hot seat to provide a fix for the Ad ID system and to crack down on app developers.

Sadly, ICSI reported its findings to Google over five months ago but has yet to hear back. The study has spurred many media articles over the past few days, which means Google should feel increasing pressure and negative publicity over this in the coming weeks.

Interestingly, this case is very similar to a 2017 data scandal involving Uber’s iOS app, which used hardware-based IDs to tag iPhones even after the Uber app had been deleted. This was in direct violation of Apple’s privacy guidelines, caused a large amount of public outrage, and resulted in a threat from Apple CEO Tim Cook to remove Uber from the iOS App Store. Uber quickly updated its app.

It will be interesting to see how public reaction and Google’s response measure up to the loud public outcry and swift action taken by Apple in the case of Uber.

Fall 2017 Test

Hi there, everyone! This is a “test” post to ensure that the process is working as intended and that everyone has access to create posts of their own!