Making AI useful for enterprises hinges largely on the ability to extract value from data. At the same time, responsible AI needs to comply with existing data protection regulations, like the EU’s GDPR and California’s CCPA, which restrict the collection and processing of personal data. Extracting value from AI and data while following best practices for trustworthy AI creates a well-known tension for businesses.
In recent years, we’ve witnessed the emergence of attacks designed to infer sensitive information from trained machine learning (ML) models. These include membership inference attacks, model inversion attacks, and attribute inference attacks. The fact that such sensitive information can be extracted from trained ML models led the community to conclude that models trained on personal data should, in some cases, be considered personal information themselves. This realization in turn spurred the development of a multitude of solutions aimed at creating privacy-preserving models.
IBM’s AI Privacy Toolkit is a set of open-source tools designed to help organizations build more trustworthy AI solutions. It bundles together several practical tools and techniques related to the privacy and compliance of AI models. The toolkit is designed to be used by model developers (like data scientists) as part of their existing ML pipelines. It is implemented as a Python library that can be used with different ML frameworks such as scikit-learn, PyTorch, and Keras.
In our recent paper published in the journal SoftwareX, we showcased the main functionalities of the toolkit: ML model anonymization and data minimization. Each of these modules is a first-of-a-kind implementation of a new approach to AI privacy.
Most approaches to protecting the privacy of ML training data, including training models with differential privacy (DP), typically require making changes in the learning algorithms themselves. This leads to a plurality of solutions that are difficult to adopt in organizations that employ many different ML models. Moreover, these solutions are not suitable for scenarios in which the learning process is carried out by a third party, and not the organization that owns (and wants to anonymize) the private data.
Our AI privacy toolkit implements a practical solution for anonymizing ML models that is completely agnostic to the type of model trained. It’s based on applying what’s known as k-anonymity to the training data and then training the model on the anonymized dataset to yield an anonymized model. The idea behind k-anonymity is to group together at least k samples and generalize them in a way that makes them indistinguishable from one another.
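To make the idea of k-anonymity concrete, here is a minimal sketch in plain Python. It is not the toolkit's implementation: the function name `k_anonymize_1d`, the single numeric attribute, and the choice to generalize each group to a (low, high) range are all assumptions made for illustration.

```python
def k_anonymize_1d(values, k):
    """Generalize numeric values into ranges shared by at least k records.

    Illustrative helper (hypothetical name), not part of any library.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need at least k records")
    # Sort record indices by value so that similar records are grouped together.
    order = sorted(range(len(values)), key=lambda i: values[i])
    # Cut the sorted records into consecutive groups of k.
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    # A short final group is merged into the previous one to keep every group >= k.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    generalized = [None] * len(values)
    for group in groups:
        low, high = values[group[0]], values[group[-1]]
        for i in group:
            # Every record in the group gets the same generalized range.
            generalized[i] = (low, high)
    return generalized


ages = [23, 25, 31, 34, 35, 41, 47, 52]
anonymized_ages = k_anonymize_1d(ages, k=3)
```

After generalization, every range in `anonymized_ages` is shared by at least three records, so no individual can be singled out by age alone.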
Past attempts at training ML models on anonymized data have resulted in very poor accuracy. Our anonymization method, however, is guided by the specific ML model that will be trained on the data. We use the knowledge encoded within the model to produce an anonymization that is highly tailored to the model. We call this method model-guided, or accuracy-guided, anonymization. This approach outperforms non-tailored anonymization techniques in terms of the achieved utility, as demonstrated in our 2022 paper.
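One way to picture model-guided anonymization is sketched below with scikit-learn. The sketch assumes that a decision tree fitted to the target model's predictions defines the groups, and that each group is replaced by its per-feature median. The dataset, the model types, the value of `K`, and all variable names are illustrative assumptions; this is not the toolkit's API, and all features are treated as quasi-identifiers for simplicity.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

K = 10  # minimum group size (the "k" in k-anonymity), chosen arbitrarily

X, y = load_iris(return_X_y=True)

# 1. Train the target model on the original data and record its predictions.
target_model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
predictions = target_model.predict(X)

# 2. Fit a decision tree to those predictions. min_samples_leaf=K forces
#    every leaf to contain at least K training records, and the splits
#    follow the decision boundaries that matter to the target model.
guide_tree = DecisionTreeClassifier(min_samples_leaf=K, random_state=0)
guide_tree.fit(X, predictions)
leaf_ids = guide_tree.apply(X)  # leaf index for each record

# 3. Generalize: replace every record with the per-feature median of its leaf.
X_anonymized = X.copy()
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    X_anonymized[mask] = np.median(X[mask], axis=0)

# 4. Retrain the same kind of model on the anonymized data.
anonymized_model = RandomForestClassifier(n_estimators=50, random_state=0)
anonymized_model.fit(X_anonymized, y)
```

Because the groups follow the target model's own decision regions, records that the model treats alike are the ones merged together, which is why this style of generalization tends to cost less accuracy than grouping that ignores the model.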
A recent trend in protecting the privacy of ML models involves using synthetic datasets for training, instead of the original dataset containing sensitive information. The datasets used for this purpose typically share some desired characteristics with the original data. This is achieved through a variety of approaches, ranging from completely rule-based systems, through statistical-query based methods, to generative machine learning models. However, simply generating a synthetic dataset does not guarantee privacy, as the new dataset may still leak sensitive information about the original data.
To address that issue, researchers have come up with methods for generating differentially private synthetic data, a solution that provides strong formal privacy assurances. At the same time, this produces a synthetic data set that “looks like” the real data from the perspective of an analyst. This approach resides completely outside the training phase and may therefore be easier to use in practice than methods that require replacing the training algorithm. Potentially, the same synthetic dataset could even be used for several downstream tasks or models, making this an even more appealing direction.
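As a rough illustration of the idea, the sketch below generates differentially private synthetic values for a single categorical attribute by adding Laplace noise to a histogram and sampling from it. This is a textbook construction chosen for brevity, not one of the methods the toolkit or the cited research relies on. The function names, the example categories, and the fixed seed are assumptions; the category list is assumed to be public and fixed in advance rather than derived from the data.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_synthetic_categories(records, categories, epsilon, n_samples, seed=0):
    """Sample synthetic values from a Laplace-noised histogram (hypothetical helper)."""
    # A fixed seed is used only to make the demo reproducible;
    # a real deployment would need a secure source of randomness.
    rng = random.Random(seed)
    counts = {c: 0 for c in categories}
    for record in records:
        counts[record] += 1
    # Adding or removing one record changes exactly one count by 1,
    # so Laplace noise with scale 1/epsilon is applied to each count.
    noisy = [max(counts[c] + laplace_noise(1.0 / epsilon, rng), 0.0)
             for c in categories]
    if sum(noisy) == 0:
        noisy = [1.0] * len(categories)  # fall back to uniform sampling
    # Sampling from the noisy histogram is post-processing of the noisy counts.
    return rng.choices(categories, weights=noisy, k=n_samples)


categories = ["A", "B", "C"]
original = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
synthetic = dp_synthetic_categories(original, categories, epsilon=1.0, n_samples=20)
```

Smaller values of `epsilon` add more noise, which strengthens the privacy guarantee but makes the synthetic sample resemble the original distribution less closely.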
When using synthetic data for training, it may be beneficial to be able to assess the privacy risk posed by the synthetic dataset. This is true regardless of whether differential privacy was applied to the data generation process, and may also aid in finding the best privacy-utility tradeoff.
The latest release of the ai-privacy-toolkit includes a new module for dataset risk assessment.
Dataset risk assessment methods typically fall into one of two categories: The first includes methods that assess the generated dataset itself, regardless of how it was created. This assessment is based on looking at similarities between the generated dataset and the original dataset, often comparing that to the similarities between the generated dataset and another, holdout dataset (taken from the same distribution). The more similar the generated dataset is to the original dataset versus the holdout dataset, the higher the privacy risk.
The second category of methods assesses the data generation model or algorithm itself, trying to determine the probability of it generating samples that are "too similar" to the original dataset. This type of method can be used to compare different data generation techniques, or different privacy parameters applied to the same technique (for example, different values of ε when DP is applied).
Our new module includes two assessment methods from the first category (assessing the dataset directly). These are based on adaptations of a few recently published research papers.
The first method is based on the paper "GAN-Leaks: A Taxonomy of Membership Inference Attacks against Generative Models". It employs a black-box membership inference attack (MIA) against generative adversarial networks (GANs) using reconstruction distances. It measures the distance between each member (from the training set) and its nearest neighbor in the synthetic dataset, and does the same for non-members (from the holdout set). The probability of a synthetic record being closer to each cohort is estimated, and the area under the receiver operating characteristic curve (AUC ROC) gives the privacy risk score.
Another method is based on the papers "Data Synthesis based on Generative Adversarial Networks" and "Holdout-Based Fidelity and Privacy Assessment of Mixed-Type Synthetic Data". It measures the distance between each synthetic data record and its nearest neighbors in the training and holdout datasets. The privacy risk score is the share of synthetic records that are closer to the training set.
Both assessment methods assume access to synthetic records, to the original dataset from which the synthetic data was generated, and to a holdout dataset that was derived from the same distribution but not used for generating the synthetic data. Without this holdout dataset, according to van Breugel et al., it is not possible to determine whether a local density peak corresponds to an overfitted example or to a genuine peak in the real distribution.
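The distance-based membership score can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the toolkit's module: it uses Euclidean distance on numeric tuples, scores each record by its distance to the nearest synthetic record, and computes the AUC as the fraction of (member, non-member) pairs in which the member is closer, with ties counted as one half. All names and the toy data are hypothetical.

```python
import math


def nearest_distance(record, dataset):
    # Euclidean distance from `record` to its closest neighbor in `dataset`.
    return min(math.dist(record, other) for other in dataset)


def reconstruction_auc(members, non_members, synthetic):
    """AUC of an attack that guesses 'member' when a record lies close to synthetic data."""
    member_d = [nearest_distance(r, synthetic) for r in members]
    non_member_d = [nearest_distance(r, synthetic) for r in non_members]
    wins = 0.0
    for dm in member_d:
        for dn in non_member_d:
            if dm < dn:
                wins += 1.0   # attack ranks the member above the non-member
            elif dm == dn:
                wins += 0.5   # ties count as half
    return wins / (len(member_d) * len(non_member_d))


train = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
holdout = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]

# Synthetic records that sit almost on top of the training records.
leaky_synthetic = [(0.0, 0.1), (1.0, 0.9), (2.1, 2.0)]
# Synthetic records that are equally far from training and holdout records.
balanced_synthetic = [(0.25, 0.25), (1.25, 1.25), (2.25, 2.25)]

leaky_risk = reconstruction_auc(train, holdout, leaky_synthetic)
balanced_risk = reconstruction_auc(train, holdout, balanced_synthetic)
```

On this toy data the leaky set scores 1.0 and the balanced set scores 0.5. A score near 0.5 means the attack cannot tell members from non-members, while a score approaching 1.0 signals high privacy risk.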
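The share-based score described above can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: it assumes numeric records compared with Euclidean distance, training and holdout sets of equal size, and ties counted as one half. The function names and the toy data are hypothetical.

```python
import math


def nearest_distance(record, dataset):
    # Euclidean distance from `record` to its closest neighbor in `dataset`.
    return min(math.dist(record, other) for other in dataset)


def share_closer_to_training(synthetic, training, holdout):
    """Fraction of synthetic records whose nearest neighbor is a training record."""
    closer = 0.0
    for record in synthetic:
        d_train = nearest_distance(record, training)
        d_holdout = nearest_distance(record, holdout)
        if d_train < d_holdout:
            closer += 1.0
        elif d_train == d_holdout:
            closer += 0.5  # assumption: ties are split evenly
    return closer / len(synthetic)


training = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
holdout = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]
synthetic = [(0.0, 0.1), (1.0, 0.9), (1.6, 1.5), (2.4, 2.5)]

risk = share_closer_to_training(synthetic, training, holdout)
```

With equally sized training and holdout sets, a share close to 0.5 suggests the synthetic data is no closer to the training records than to unseen records from the same distribution. A share well above 0.5 suggests the generator has memorized parts of the training set.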
These assessment methods may also be used to assess the privacy leakage of anonymized datasets, as long as the new data shares the same domain and distribution as the original data, as is the case for example when employing the anonymization module of the ai-privacy-toolkit.
Other related technologies may be added to the toolkit in the future, such as support for the “right to erasure” or “right to be forgotten” for ML models — also known as “machine unlearning.” We are also currently looking into extending this work to foundation models, specifically large language models.
The open-source ai-privacy-toolkit is designed to help organizations build more trustworthy AI solutions, with tools that protect privacy and help ensure the compliance of AI models. As AI regulations mature and case law on their violation accumulates, we foresee that many more organizations will implement and embed such tools and processes in their AI infrastructure.
The toolkit has been and continues to be developed in collaboration with and supported by several projects funded by the European Union’s Horizon 2020 research and innovation program, namely iToBoS, CyberKit4SME and NEMECYS.