Master Thesis: Pulling Sensitive Data from Trained Models
AI Sweden is now looking for master thesis student(s) to further strengthen the LeakPro team.
Introduction
As machine learning models become integral to various industries, from healthcare and finance to social media and autonomous systems, the importance of data privacy and security has never been more critical. One of the most pressing concerns in this domain is the risk posed by model inversion attacks [1]. These attacks represent a significant threat to the privacy of individuals whose data may have been used in training machine learning models. By leveraging the outputs or internal states of a trained model, adversaries aim to reverse-engineer and reconstruct specific data points, revealing sensitive information about individuals or class representatives.
The danger of model inversion attacks lies in their ability to recreate underlying training data. For instance, an attacker may infer private attributes such as medical conditions, financial transactions, or personal behaviours based solely on the model’s predictions or intermediate outputs. This risk is particularly acute for models deployed in privacy-sensitive domains, where exposure of private data can have severe ethical, legal, and regulatory consequences.
Project Background
AI Sweden is currently leading a project on adversarial information extraction from trained machine learning models. The project is called LeakPro and is a collaboration including RISE, Sahlgrenska, Region Halland, AstraZeneca, Syndata, and Scaleout. The main goal of LeakPro is to create an open-source tool to stress-test trained machine learning models in order to understand the risk of leaking sensitive information from the training data.
Model inversion attacks represent a significant privacy threat by attempting to recover data points from a machine learning model’s training dataset. In these attacks, an adversary seeks to reconstruct specific data points by exploiting the model’s outputs or internal states. Typically, the adversary is granted query access to the model, denoted Q(θ), and has knowledge of the training algorithm T. Additionally, the adversary can sample from Π, a distribution similar to the one from which the training data was originally drawn [6, Game 7]. Formally, the model inversion attack can be described by an adversary strategy A whose goal is to reconstruct a dataset S̃ such that S̃ ← A(T, Π, Q(θ)), where S̃ represents the reconstructed version of the training data.
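As a rough illustration of what A might look like in the simplest white-box case, the sketch below optimizes an input from noise so that the classifier assigns it to a chosen class, recovering a class representative rather than an exact training point. This is a minimal sketch only, assuming PyTorch and a hypothetical `target_model`; it is not LeakPro code or the method of any cited paper.

```python
# Minimal sketch of a gradient-based model inversion attack (white-box access).
# All names are illustrative: `target_model` stands for the trained classifier Q(theta)
# and `target_class` for the class whose representative we try to recover.
import torch
import torch.nn.functional as F

def invert_class(target_model, target_class, input_shape=(1, 3, 64, 64),
                 steps=2000, lr=0.05, tv_weight=1e-4):
    """Reconstruct a representative input for `target_class` by maximizing the
    model's confidence, with a total-variation prior to keep the image smooth."""
    target_model.eval()
    x = torch.rand(input_shape, requires_grad=True)   # start from noise
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = target_model(x)
        # Classification loss: push the model towards the target class.
        cls_loss = F.cross_entropy(logits, torch.tensor([target_class]))
        # Total-variation regularizer: discourages high-frequency noise artifacts.
        tv = (x[..., 1:, :] - x[..., :-1, :]).abs().mean() + \
             (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
        loss = cls_loss + tv_weight * tv
        loss.backward()
        opt.step()
        x.data.clamp_(0.0, 1.0)                       # keep pixels in a valid range
    return x.detach()
```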
Recent research has demonstrated several effective adversarial strategies for conducting successful model inversion attacks. These works show how adversaries can exploit various model vulnerabilities, including those in label-only settings [5, 3, 4].
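In the label-only setting referenced above, the adversary only observes predicted class labels. One way such attacks can proceed (e.g., the knowledge-transfer idea of [4]) is to first distill the target model into a surrogate using hard-label queries on auxiliary data drawn from a similar distribution, and then run white-box inversion against the surrogate. The sketch below, again with placeholder names (`target_model`, `surrogate`, `aux_loader`), only illustrates that first distillation step and is not taken from any of the cited works.

```python
# Rough sketch of the knowledge-transfer step used in label-only inversion attacks:
# query the target for hard labels on auxiliary data, train a surrogate on those
# labels, and then invert the surrogate with white-box techniques (e.g. invert_class).
import torch
import torch.nn.functional as F

def hard_label_oracle(target_model, x):
    """Label-only access: the adversary sees only the predicted class index."""
    with torch.no_grad():
        return target_model(x).argmax(dim=1)

def train_surrogate(target_model, surrogate, aux_loader, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in aux_loader:                      # auxiliary data; true labels unused
            pseudo_labels = hard_label_oracle(target_model, x)
            loss = F.cross_entropy(surrogate(x), pseudo_labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return surrogate                                 # invert this surrogate with gradients
```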
Outline
The goal of this project is to investigate the potential of model inversion attacks under realistic settings for both vision and tabular data. The objectives of this project are as follows:
- Literature Review: Conduct a thorough review of existing research on reconstruction attacks [6, 1, 8, 7], focusing on their application to classifiers. Analyze current methodologies, effectiveness, and implications of these attacks.
- Experimental Validation: Implement and evaluate reconstruction attacks on relevant benchmarks. Measure the extent of training data that can be reconstructed and evaluate the success rate of these attacks in practical scenarios (a minimal evaluation sketch follows this list).
- Impact Assessment: Analyze the implications of reconstruction attacks for data privacy and security. Evaluate the results to understand the potential risks and broader implications of these attacks.
- Enhancing Model Inversion Attacks: Building on the literature review and the benchmark suite, attempt to improve the current state of the art (more than seven experts are actively working on this within LeakPro).
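As a sketch of how the experimental validation could be scored, the snippet below computes two common proxies, assuming PyTorch tensors and placeholder names (`train_tensor`, `eval_model`): the distance from a reconstruction to its nearest training example, and the fraction of reconstructions that an independently trained classifier assigns to the intended class. Note that [7] questions how faithful such pixel-level metrics are to human perception, which ties directly into the impact-assessment objective.

```python
# Minimal sketch of scoring reconstruction attacks, assuming the evaluator holds
# the real training set. Two common proxies: (i) distance from each reconstruction
# to its nearest training example, and (ii) "attack accuracy": whether an
# independently trained evaluation classifier assigns the intended class.
import torch

def nearest_train_distance(reconstruction, train_tensor):
    """Mean-squared distance to the closest training example (lower = more leakage)."""
    diffs = (train_tensor - reconstruction).flatten(start_dim=1).pow(2).mean(dim=1)
    return diffs.min().item()

def attack_accuracy(reconstructions, target_classes, eval_model):
    """Fraction of reconstructions an independent classifier maps to the target class."""
    with torch.no_grad():
        preds = eval_model(reconstructions).argmax(dim=1)
    return (preds == torch.tensor(target_classes)).float().mean().item()
```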
If time permits and the student is interested, there is also an opportunity to contribute to the open-source platform LeakPro that is currently under development [2]. Several contributions would be of interest, e.g., a taxonomy for reconstruction attacks and/or the implementation of the benchmarks or novel attacks in LeakPro.
Contact
Johan Östman: johan.ostman@ai.se
Fazeleh Hoseini: fazeleh.hoseini@ai.se
References
[1] Sayanton V. Dibbo. SoK: Model inversion attack landscape: Taxonomy, challenges, and future roadmap. In 2023 IEEE 36th Computer Security Foundations Symposium (CSF), pages 439–456. IEEE, 2023.
[2] AI Sweden et al. LeakPro: Leakage profiling and risk oversight of machine learning models. LeakPro, 2024. Accessed: 2024-09-17.
[3] Rongke Liu, Dong Wang, Yizhi Ren, Zhen Wang, Kaitian Guo, Qianqian Qin, and Xiaolei Liu. Unstoppable attack: Label-only model inversion via conditional diffusion model. IEEE Transactions on Information Forensics and Security, 19, 2024.
[4] Bao-Ngoc Nguyen, Keshigeyan Chandrasegaran, Milad Abdollahzadeh, and Ngai-Man Cheung. Label-only model inversion attacks via knowledge transfer. Advances in Neural Information Processing Systems, 36, 2024.
[5] Ngoc-Bao Nguyen, Keshigeyan Chandrasegaran, Milad Abdollahzadeh, and Ngai-Man Cheung. Rethinking model inversion attacks against deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[6] Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, and Santiago Zanella-Béguelin. SoK: Let the privacy games begin! A unified treatment of data inference privacy in machine learning. In IEEE Symposium on Security and Privacy (SP), 2023.
[7] Xiaoxiao Sun, Nidham Gazagnadou, Vivek Sharma, Lingjuan Lyu, Hongdong Li, and Liang Zheng. Privacy assessment on reconstructed images: Are existing evaluation metrics faithful to human perception? Advances in Neural Information Processing Systems, 36, 2024.
[8] Yuheng Zhang, Ruoxi Jia, Hengzhi Pei, Wenxiao Wang, Bo Li, and Dawn Song. The secret revealer: Generative model-inversion attacks against deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 253–261, 2020.
Applications close November 10th. You can apply for this thesis alone or as a pair of students. The LeakPro team is mostly located in Gothenburg, but remote work is possible.
AI Sweden does not accept unsolicited support and kindly asks not to be contacted by advertisement agents, recruitment agencies, or staffing companies.
Locations: Flexible location, Sweden
About AI Sweden
AI Sweden is the national center for applied artificial intelligence, jointly funded by the Swedish government and our partners, both public and private. Our mission is to accelerate the use of AI for the benefit of our society, our competitiveness, and for everyone living in Sweden.
Listen to Johanna, Vinutha, and Martin to hear what they say about working at AI Sweden in this podcast episode on Spotify!