Publication
NeurIPS 2024
Workshop paper

Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents

Abstract

Conversational agents are increasingly woven into individuals' personal lives, yet users often underestimate the privacy risks involved. In this paper, we formalize the notion of contextual privacy for user interactions with LLMs, building on the principles of contextual integrity. We apply this notion to a real-world dataset (ShareGPT) to demonstrate that users share unnecessary sensitive information with these models. We also conduct a user study, showing that even "privacy-conscious" participants inadvertently reveal sensitive information through indirect disclosures. We propose a framework that operates between users and conversational agents, reformulating prompts so that only contextually relevant and necessary information is shared. We use examples from ShareGPT to illustrate common privacy violations and demonstrate how our system addresses these violations.
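
To make the intermediary idea concrete, below is a minimal illustrative sketch in Python of a layer that sits between the user and the agent and redacts sensitive attributes that are not necessary for the task context. This is not the paper's implementation: the regex patterns, the context-to-essentials mapping, and the `reformulate` helper are all hypothetical stand-ins for the contextual-integrity analysis the abstract describes.

```python
import re

# Hypothetical detectors for a few sensitive attribute types.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical mapping from a task context to the attribute types that are
# contextually necessary for that task; everything else gets redacted.
CONTEXT_ESSENTIALS = {
    "email_drafting": {"email"},
    "coding_help": set(),
    "travel_planning": {"phone"},
}

def reformulate(prompt: str, context: str) -> str:
    """Redact sensitive attributes not needed for the given task context."""
    allowed = CONTEXT_ESSENTIALS.get(context, set())
    for attr, pattern in SENSITIVE_PATTERNS.items():
        if attr not in allowed:
            prompt = pattern.sub(f"[{attr.upper()} REDACTED]", prompt)
    return prompt

if __name__ == "__main__":
    user_prompt = ("Fix this bug. You can reach me at jane@example.com "
                   "or 555-123-4567 if the patch fails.")
    # Only the reformulated prompt would be forwarded to the agent.
    print(reformulate(user_prompt, context="coding_help"))
```

In practice, the paper's framework would replace the hard-coded patterns and context table with model-based inference of what is contextually relevant; the sketch only shows where such a layer sits in the interaction pipeline.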