How much of our secrets can machine learning models leak?
By Anonymous | May 28, 2021

The field of Machine Learning (ML) has made remarkable progress over the last decade, driven in part by the availability of low-cost computing and storage from cloud providers. Its popularity stems largely from successful real-world applications and from the promise of broad social benefits.

ML models learn patterns from the data used during training. This 'learning' is stored as parameters whose representation depends on the model's architecture and training algorithm.

Once trained, the models can answer a variety of queries. It is also common for models to be published as a service (via an API), giving an adversary black-box access: they can submit queries and observe outputs without any knowledge of the model's internal parameters.
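
To make that setting concrete, here is a minimal sketch of what black-box access looks like: the adversary can only submit records and read back confidence scores. The scikit-learn classifier, toy dataset, and query_model wrapper are illustrative assumptions, not any particular deployed service.

```python
# A minimal sketch of black-box access: the 'adversary' only sees the
# confidence scores returned by query_model, never the model's parameters.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

def query_model(record):
    """Simulates the API endpoint: returns only confidence scores for one record."""
    return model.predict_proba([record])[0]

print(query_model(X_test[0]))  # e.g. something like [0.02, 0.98]
```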

Given that process, an important question arises: how much sensitive information existing in training data can be leaked through this type of access?

Flaws in model design and training can cause 'overfitting', which happens when the model corresponds too closely to a particular set of data, in this case the training data. The more overfitted the model, the easier it is for an adversary to perform attacks that disclose sensitive information contained in that data.
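
One simple, commonly used signal of overfitting is the gap between training and held-out accuracy. The sketch below, using an illustrative unconstrained decision tree, shows one way to measure that gap; it is not a complete diagnosis.

```python
# Sketch of one overfitting signal: the gap between training and held-out
# accuracy. A large gap suggests the model memorized its training data
# rather than learning generalizable patterns.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree tends to fit the training set almost perfectly.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
print(f"generalization gap: {train_acc - test_acc:.2f}")
```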

Existing attacks include training data extraction, membership inference, and attribute inference attacks2. These attacks have received limited scientific investigation so far, and their impact on individuals' privacy is not yet fully understood.

Training Data Extraction1

Language Models (LMs) are trained on massive datasets – think a terabyte of English text – to generate fluent, human-like responses. In the process, these models 'memorize' portions of their training data, which may contain sensitive information, creating the opportunity for some of that information to surface in the model's output. An attacker can exploit this by sending hundreds, or even millions, of prompts to the model and examining the outputs for memorized content.
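
The sketch below illustrates the general idea only and is not a reproduction of any published attack: sample many continuations from a public language model and scan them for strings that look like personal data (here, an email-address pattern). The prompt, model choice, and regex are assumptions for illustration.

```python
# Illustrative sketch: sample many continuations from a language model and
# scan them for strings that look like personal data (email addresses here).
import re
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Contact information: "
inputs = tokenizer(prompt, return_tensors="pt")

# Draw a handful of samples; a real attack would generate far more.
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_k=40,
    max_new_tokens=50,
    num_return_sequences=20,
    pad_token_id=tokenizer.eos_token_id,
)

email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
for seq in outputs:
    text = tokenizer.decode(seq, skip_special_tokens=True)
    for match in email_pattern.findall(text):
        print("possible memorized string:", match)
```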

Membership Inference7,8

This threat arises when, given a data record, an attacker can infer whether that record was part of the training dataset. One common approach exploits the model's outputs (confidence scores), and the attack becomes more effective against overfitted models, which tend to be more confident on examples they were trained on. Membership inference can have profound privacy implications, such as revealing that a person belongs to a group with a particular disease or has been admitted to a hospital.
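
Here is a minimal sketch of the simplest variant, a confidence-threshold attack: records that the model classifies with unusually high confidence are guessed to be training-set members. The deliberately overfitted tree and the fixed threshold are illustrative assumptions; published attacks typically calibrate the decision rule, for example with shadow models.

```python
# Sketch of a confidence-threshold membership inference attack: guess that a
# record was in the training set if the model's top confidence score is high.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A deliberately overfitted model makes the attack easier to see.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

THRESHOLD = 0.99  # illustrative; real attacks calibrate this value

def guess_member(record):
    """Return True if the model's confidence suggests the record was trained on."""
    return model.predict_proba([record])[0].max() >= THRESHOLD

members = np.mean([guess_member(r) for r in X_train])
non_members = np.mean([guess_member(r) for r in X_test])
print(f"flagged as members: {members:.2f} of training records, "
      f"{non_members:.2f} of unseen records")
```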

Attribute Inference

In this attack, the adversary tries to infer missing attributes of a partially known record used in the training dataset. Zhao et al.4 experimentally concluded that “it is infeasible for an attacker to correctly infer missing attributes of a target individual whose data is used to train an ML model”. That does not mean, however, that the same model is not vulnerable to membership inference attacks.
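
For intuition, here is a rough sketch of how such an attack is commonly framed: the adversary tries each candidate value for the missing attribute and keeps the one for which the black-box model is most confident about the record's known outcome. The dataset, the choice of 'missing' feature, and the candidate grid are all illustrative assumptions.

```python
# Sketch of attribute inference: fill in each candidate value for a missing
# attribute and keep the one that makes the model most confident about the
# record's known label.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000).fit(X, y)

target = X[0].copy()   # partially known training record
known_label = y[0]     # the adversary knows the outcome...
missing_idx = 3        # ...but not this attribute (illustrative choice)

candidates = np.linspace(X[:, missing_idx].min(), X[:, missing_idx].max(), 50)

def confidence(value):
    guess = target.copy()
    guess[missing_idx] = value
    return model.predict_proba([guess])[0][known_label]

best = max(candidates, key=confidence)
print(f"inferred value: {best:.2f}, true value: {X[0, missing_idx]:.2f}")
```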

Data Scientists are responsible for applying privacy protection techniques, such as Differential Privacy6, to the training data and training process to limit what adversaries can later disclose. Additionally, they must adopt industry best practices to prevent models from overfitting their training data, since overfitting significantly increases vulnerability to these attacks, especially when the models are exposed via APIs.
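
As a very rough illustration of how Differential Privacy can be folded into training, in the spirit of DP-SGD rather than as a certified implementation, the sketch below clips each example's gradient and adds Gaussian noise before each update of a toy logistic-regression model. The clipping norm, noise scale, and learning rate are illustrative and carry no formal privacy accounting.

```python
# Sketch of DP-SGD-style training for logistic regression: clip each
# per-example gradient, add Gaussian noise, then update the weights.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(float)  # toy labels

w = np.zeros(d)
clip_norm, noise_scale, lr = 1.0, 1.0, 0.1  # illustrative hyperparameters

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(20):
    grads = []
    for xi, yi in zip(X, y):
        g = (sigmoid(xi @ w) - yi) * xi                          # per-example gradient
        g *= min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))   # clip its norm
        grads.append(g)
    noise = rng.normal(scale=noise_scale * clip_norm, size=d)    # Gaussian noise
    w -= lr * (np.sum(grads, axis=0) + noise) / n                # noisy average update

acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"training accuracy of the noisily trained model: {acc:.2f}")
```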

 

Just as crucial as exploring the technical details of the attacks that leak information from models, and how to avoid them, is investigating the privacy concerns raised by this threat. Those concerns add further complexity to Data Scientists' responsibilities, including adherence to fair information practices, such as those summarized in the Belmont Report3, when collecting, handling, using, and sharing data.

 

The analysis of ML data extraction and inference attacks reveals a direct application of all three Belmont Report principles:

  •      Respect for Persons – although Data Scientists are usually distant from the collection of the data, and more importantly from the individuals who contributed it, they must protect those individuals' autonomy and act according to their consent. Beyond doing their best to avoid common vulnerabilities to attacks, they must ensure users are aware of, and consent to, the risks of being included in the dataset.
  •      Beneficence – even when protected from attacks, machine learning models can learn bias, racism, and other harmful social patterns if not correctly designed and trained, including through the exclusion or overrepresentation of particular groups. It is therefore critical that Data Scientists take on the obligation to protect individuals from harm while maximizing benefits.
  •      Justice – finally, the benefits yielded by ML models should be inclusive and fairly distributed.

Ultimately, the fact that ML models can disclose sensitive information raises both privacy protection and ethics concerns. More importantly, it interplays with data protection law: a model that leaks personal information risks being classified as personal data under the General Data Protection Regulation (GDPR).

There are important questions to be answered as machine learning becomes ubiquitous and new threats of sensitive information disclosure emerge:

What is the comprehensive list of obligations Data Scientists must comply with to adhere to the Belmont Report principles and to regulations such as the GDPR? And how can we assess whether those obligations have been met?