Model Extraction Attacks: How Hackers Steal AI Models

Training a state-of-the-art machine learning model is expensive. Large language models like GPT-3 required thousands of petaflop/s-days of compute and millions of dollars. Yet once deployed behind an API, they are vulnerable to a surprisingly subtle attack: an adversary who never sees the weights, never reads the training data, and never touches the server can still steal the model by asking it questions. This is a model extraction attack, and it is one of the more underappreciated threats in production ML security. Related adversarial work, such as Goodfellow et al.'s FGSM, focuses on perturbing inputs to fool a model. Model extraction goes further: the attacker wants a copy of the model itself. ...
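The core idea can be shown in a toy sketch. This is not the post's method, just a minimal illustration under strong assumptions: the "victim" is a hidden linear classifier (`W_secret`, `victim_api` are hypothetical names) that returns raw class scores, so the attacker can recover a surrogate by querying with random probes and solving a least-squares problem on the (input, output) pairs.

```python
import numpy as np

# Hypothetical victim: a black-box linear classifier we can only query.
rng = np.random.default_rng(0)
W_secret = rng.normal(size=(10, 3))  # hidden weights, never exposed

def victim_api(x):
    """Returns class scores only; the attacker never sees W_secret."""
    return x @ W_secret

# Extraction: query with random probes, then fit a surrogate weight
# matrix by least squares on the observed (query, answer) pairs.
queries = rng.normal(size=(200, 10))
answers = victim_api(queries)
W_stolen, *_ = np.linalg.lstsq(queries, answers, rcond=None)

# Measure how often the surrogate matches the victim on fresh inputs.
test = rng.normal(size=(50, 10))
agreement = np.mean(
    np.argmax(test @ W_stolen, axis=1) == np.argmax(victim_api(test), axis=1)
)
```

Real APIs return much less information (top-1 labels, truncated probabilities) and real models are nonlinear, which is exactly what makes practical extraction harder and more interesting; the surrogate is then typically trained by gradient descent rather than solved in closed form.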

September 15, 2024 · 6 min · Akshat Gupta

Speaker Anonymization: Protecting Voice Identity in the AI Era

Every time you speak to a voice assistant, attend a recorded meeting, or submit audio to a diagnostic tool, your voice reveals something deeply personal: your identity. Unlike a password, you cannot change your voice. This makes speaker anonymization — the task of modifying speech so a speaker cannot be identified, while keeping the content intact — one of the more important problems in applied AI privacy. Speaker diarization tells us who spoke and when. Speaker anonymization does the inverse — it ensures that even if someone has the audio, they cannot determine who it was. ...
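As a toy illustration of the trade-offs involved, the sketch below (names like `naive_pitch_shift` are hypothetical, not from the post) anonymizes a synthetic "voice" by resample-based pitch shifting. It is deliberately crude: pitch and duration change together, and real anonymization systems also warp formants and preserve timing and content.

```python
import numpy as np

SR = 16000  # sample rate in Hz

def naive_pitch_shift(signal, factor):
    """Resample-based pitch shift: factor > 1 raises pitch.

    Reads through the signal `factor` samples at a time with linear
    interpolation, so the output is shorter and higher-pitched. A real
    anonymizer would decouple pitch from duration (e.g. via vocoding).
    """
    n = len(signal)
    idx = np.arange(0, n, factor)
    return np.interp(idx, np.arange(n), signal)

# Synthetic stand-in for a speaker: a 120 Hz tone as the fundamental.
t = np.arange(SR) / SR
voice = np.sin(2 * np.pi * 120 * t)
shifted = naive_pitch_shift(voice, 1.5)  # fundamental moves to ~180 Hz

def dominant_hz(x):
    """Frequency of the strongest FFT bin, as a crude pitch estimate."""
    spec = np.abs(np.fft.rfft(x))
    return np.fft.rfftfreq(len(x), 1 / SR)[np.argmax(spec)]
```

Shifting the fundamental is nowhere near enough to defeat a modern speaker-verification model, which keys on many spectral cues at once; that gap between naive signal edits and true anonymization is the heart of the problem.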

October 15, 2024 · 7 min · Akshat Gupta