Master thesis: Gossiping Models – Understanding Unintended Data Disclosure in LLMs
Artificial intelligence is transforming society. AI Sweden is the national center for applied artificial intelligence and our mission is to accelerate the use of AI for the benefit of our society, our competitiveness, and for everyone living in Sweden. We drive impactful initiatives in areas such as healthcare, energy, and the public sector while pushing the boundaries of AI research and innovation in fields such as natural language processing and edge learning. Join us in harnessing the untapped value of AI to drive innovation and create sustainable value for Sweden.
We are now looking for a master thesis student to strengthen our team.
Introduction
When an LLM reproduces training text verbatim or near-verbatim, it can expose personal data, confidential documents, or proprietary code. For individuals, that means privacy harm and potential misuse; for organizations, it creates regulatory exposure, loss of trade secrets, and reputational damage. Empirically, this risk is real: black-box sampling with filtering has recovered hundreds of word-for-word training snippets from GPT-2, including sensitive strings.
Regulators warn that AI models can leak training data and call for empirical stress-testing with privacy attacks and privacy-enhancing technologies. Measuring this risk is critical: without hard evidence, models are deployed blindly, exposing people to privacy harm and organizations to legal and IP liability. This thesis rigorously measures and compares data-extraction attacks under realistic black-box conditions, producing reproducible, actionable results that inform safer training, tuning, and release.
Project Background and Problem Statement
AI Sweden is leading a project to develop an open-source privacy auditing tool called LeakPro, designed to assess information leakage risks in machine learning models. This initiative, undertaken in collaboration with RISE, Sahlgrenska, Region Halland, AstraZeneca, Syndata, and Scaleout, aims to evaluate the risk of sensitive information disclosure when models trained on confidential data are made publicly available.
Recent advances in automatic prompt optimization demonstrate that query phrasing can be systematically optimized, but these methods target task performance rather than privacy auditing. There is currently no standard, reproducible method that uses automated prompt optimization to measure LLM data-extraction risk using only black-box access and a fixed query budget. The problem this thesis addresses is building such a reproducible audit.
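As an illustration of the black-box, fixed-query-budget setting described above, the sketch below shows the bare shape of such an audit loop. This is a hypothetical minimal example, not part of the LeakPro tool: `query_model` is a stand-in stub with canned responses where a real audit would call the model under test.

```python
def query_model(prompt: str) -> str:
    # Hypothetical stand-in for a black-box LLM API call;
    # a real audit would send the prompt to the model under test.
    canned = {
        "My address is": "My address is 12 Example Road",
        "The password is": "The password is hunter2",
    }
    return canned.get(prompt, prompt + " ...")


def run_audit(prompts: list[str], budget: int) -> list[tuple[str, str]]:
    """Query the black box under a fixed budget and collect candidate outputs."""
    candidates = []
    for prompt in prompts[:budget]:  # never exceed the fixed query budget
        candidates.append((prompt, query_model(prompt)))
    return candidates


outputs = run_audit(["My address is", "The password is", "Weather today"], budget=2)
```

The collected candidates would then be checked against known training text to decide which queries actually extracted data; an automated prompt optimizer would sit around this loop, rephrasing prompts between rounds while charging every call against the same budget.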
Outline
The objectives of this thesis project are outlined below.
1. Literature study of harm-based privacy risk models: Summarize (i) how “data extraction” is defined (exact vs. near-verbatim recovery of training text), (ii) practical ways to verify matches and set similarity thresholds, (iii) representative attack settings and query styles reported in the literature, and (iv) datasets and evaluation practices suitable for reproducible audits.
2. Design of a preliminary evaluation approach: From the study, specify a threat model, dataset(s), and models. Evaluate data extraction in a meaningful, calibrated way so that results are comparable across settings.
3. Prototype and evaluation: Based on insights from the literature study and the preliminary evaluation, explore directions to enhance data-extraction attacks.
Contact
Fazeleh Hoseini: fazeleh.hoseini@ai.se
In order to comply with all applicable immigration and export control regulations, we are limiting this opportunity to students with permanent residency in the EU, Norway, Switzerland, Iceland, India, the UK, Canada, the USA, Mexico, Japan, or South Korea.
AI Sweden does not accept unsolicited support and kindly asks not to be contacted by advertisement agents, recruitment agencies, or staffing companies.
- Organization
- AI Labs
- Role
- Engineering
- Locations
- Göteborg