Privacy in a Mobile-Social World

CompSci 590.03, Fall 2012

Instructor: Ashwin Machanavajjhala
When: 1:25 PM - 2:40 PM Wednesdays and Fridays
Where
D106, LSRC
Office Hours: By appointment (send email to firstname@cs.duke.edu)

Announcements:

Overview: Terabytes of data about individuals are collected and analyzed daily through the use of mobile and social applications. Disseminating the data or results of analysis over this data to third parties (e.g., researchers or advertisers) has tremendous scientific and commercial value. However, such data dissemination can lead to potential identification of individuals, and disclosure of their sensitive information.

This project-oriented course focuses on the foundations of statistical privacy -- the problem of disclosing aggregate statistics about data collected from individuals while ensuring that individual level sensitive properties are not disclosed. The course will cover fundamentals of defining privacy and developing efficient algorithms for enforcing privacy, as well as the challenges in developing privacy preserving algorithms in real-world applications -- publishing social science data (like in the US Census), continuous user activity monitoring (like in search logs, location traces, energy monitoring), social networks, recommendation engines and targeted advertising.

Prerequisites: The course is open to interested graduate and undergraduate students with sufficient mathematical maturity. Basic knowledge in algorithms, proof techniques, and probability will be assumed. 

Administrivia:

  Syllabus:

#
Date
Topic
Readings

Wed, Aug 29
NO CLASS

1
Fri, Aug 31
Introduction and Attacks on privacy (slides)
"Privacy", J. DeCew, The Stanford Encyclopedia of Philosophy
   (Fall 2012 Edition), Edward N. Zalta (ed.)
"A Face is exposed for AOL Searcher No. 4417749",
   New York Times, Aug 2006
"How Companies Learn your Secrets",
   New York Times, Feb 2012
2
Wed, Sep 5
De-anonymization (slides)
"Robust deanonymization of Sparse Datasets (Netflix Prize Data)",
   A. Narayanan, V. Shmatikov, IEEE S&P 2008
Optional: "Wherefore Art Thou R3579X?",
   L. Backstrom, C. Dwork, J. Kleinberg, WWW 2007


ANONYMITY

3
Fri, Sep 7
k-Anonymity + algorithms (slides)
"Incognito: Efficient full domain k-anonymity",
   K. Lefevre, D. DeWitt, R. Ramakrishnan, SIGMOD  2005
Optional: "k-anonymity: a model for protecting privacy",
   L. Sweeney, IJUFKS 2002
4
Wed, Sep 12
Anonymity in Social Networks (slides)
"Resisting Structural Re-identification in Anonymized Social Networks"
M. Hay, G. Miklau, D. Jensen, D. Towsley, C. Li VLDBJ 2010


BEYOND ANONYMITY

5
Fri, Sep 17
Adversaries and Background Knowledge (slides)
"L-diversity: privacy beyond k-anonymity"
A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam, ICDE 2006
"T-closeness: privacy beyond k-anonymity and l-diversity"
N. Li, T. Li, S. Venkatasubramanian
6
Wed, Sep 19
Simulatability (or "the adversary knows the privacy mechanism") (slides)
"Simulatable Auditing", K. Kenthapadi, N. Mishra, K. Nissim PODS 2005
"Minimality Attack in Privacy Preserving Data Publishing",
R. C. Wong, A. Fu, K. Wang, J. Pei, VLDB 2007


DIFFERENTIAL PRIVACY

7
Fri, Sep 21
Differential privacy, Laplace Mechanism and Composability (slides)
"Calibrating Noise to Sensitivity in Private Date Analysis"
C. Dwork, F. McSherry, K. Nissim, A.Smith TCC 2006
"Composition Attacks and Auxiliary Information in Data Privacy"
S. Ganta, S. Kasiviswanathan, A. Smith KDD 2008
8
Wed, Sep 26
Exponential and median mechanisms (slides)

"Mechanism design via differential privacy"
F. McSherry, K. Talwar FOCS 2008
Optional: "Interactive Privacy via Median Mechanism", A. Roth, T. Roughgarden, STOC 2010
9
Fri, Sep 28
Smooth Sensitivity and Sample Aggregate Framework (slides)
(Deadline for choosing project topics)
"Smooth Sensitivity and Sampling in Private Data Analysis", K. Nissim, S. Raskhodnikova, A. Smith STOC 2007
Optional: "Privacy-preserving statistical estimation with optimal convergence rates", A. Smith STOC 2011
10
Wed, Oct 3
Optimization utility for differential privacy - constrained inference (slides)
"Boosting the Accuracy of Differentially Private Histograms
Through Consistency
", M. Hay, V. Rastogi, G. Miklau, D. Suciu, PVLDB 2010
11
Fri, Oct 5
Optimization utility for differential privacy - Query strategies (Wavelet and Matrix Mechanisms) (slides) "Differential Privacy via Wavelet Transforms",
X. Xiao, G. Wang, J. Gehrke ICDE 2009
Optional: "Optimizing Linear Counting Queries Under Differential Privacy"
C. Li, M. Hay, V. Rastogi, G. Miklau, A. McGregor PODS 2010
12
Wed, Oct 10
Sparse Vector Technique (slides)
"The Sparse Vector Technique" Lecture Notes, Aaron Roth
13
Fri, Oct 12
Multiplicative Weights Algorithms (slides)(proofs)
(Deadline for submitting project proposal)
"A simple and practical algorithm for differentially private data release", F. McSherry, K. Ligett, M. Hardt, NIPS 2012
Optional: "A Multiplicative Weights Mechanism for Privacy Preserving Data Analysis" M. Hardt, G. Rothblum, FOCS 2010
14
Wed, Oct 17
Implementing Differential Privacy Privately (slides)

"Airavat: Security and Privacy for MapReduce", I. Roy, S. Setty, A. Kilzer, V. Shmatikov, E. Witchel
"Differential Privacy Under Fire". Haeberlen, Pierce, and Narayan, 2011
"Gupt: Privacy Preserving Data Analysis Made Easy", P. Mohan, A. Thakurtha, E. Shi, D. Song, D. Culler, SIGMOD 2012


BEYOND DIFFERENTIAL PRIVACY

15
Fri, Oct 19
No Free Lunch Theorem (slides)
"No Free Lunch in Data Privacy", D. Kifer, A. Machanavajjhala SIGMOD 2011
Optional: "The Case for Differential Privacy" C. Dwork, M, Naor,
Journal of Privacy and Confidentiality 2010
16
Wed, Oct 24
Customizing privacy definitions for realistic applications (slides)
"A rigorous and customizable framework for privacy", D. Kifer, A. Machanavajjhala PODS 2012
Optional: "Crowd-Blending Privacy", J. Gehrke, M. Hay, E. Liu, R. Pass, CRYPTO 2012
Optional: "Data Publishing against Realistic Adversaries", A. Machanavajjhala, J. Gehrke, M. Gotz, PVLDB 2009


APPLICATIONS

17
Fri, Oct 26
Differential Privacy in the US Census (slides)
"Privacy: Theory meets practice on the map", A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, L. Vilhuber, ICDE 2008

Wed Oct 31
NO CLASS
18
Fri, Nov 2
Guest Lecture: Jerry Reiter, Confidentiality in the US Census, and synthetic data generation
"Estimating Risks of Identification Disclosure in Partially Synthetic Data", Reiter & Mitra, JPC '09
19
Wed, Nov 7
Publishing Search Logs  (slides)
"User 4xxxxx9: Anonymizing Query Logs", E. Adar, WWW 2007
"Publishing Search Logs", M. Gotz, A. Machanavajjhala, G. Wang, X. Xiao, J, Gehrke, TKDE 2012
20
Fri, Nov 9
Cloaking/Mixes for Location Privacy (slides) "Location Privacy in Pervasive Computing", A. Beresford, F. Stajano, 2003
"Preserving privacy in gps traces via uncertainty-aware path cloaking", B. Hoh, M. Gruteser, H. Xiong , A. Alrabby, ACM CCS 2007
21
Wed, Nov 14
Recommendation Engines (slides)
"Differentially Private Recommender Systems", F. McSherry, I. Mironov KDD 2009
"Personalized Social Recommendations - Accurate or Private?", A. Machanavajjhala, A. Korolova, A. Das Sarma PVLDB 2011
22
Fri, Nov 16
Targeted Advertising
(Deadline for midterm report)
"Adnostic: Privacy Preserving Targeted Advertising", V. Toubiana, A. Narayanan, D. Boneh, H. Nissenbaum, S. Barocas NDSS 2010
"Privacy Violations Using Microtargeted Ads", A. Korolova JPC 2011

Wed, Nov 21
NO CLASS - thanksgiving break


Fri, Nov 23
NO CLASS - thanksgiving break
23
Wed, Nov 28
Privacy in Machine Learning
Functional Mechanism: Regression Analysis under Differential Privacy
J.  Zhang, Z. Zhang, X. Xiao, Y. Yang, M. Winslett, PVLDB 2012
Optional: "Differentially Private Empirical Risk Minimization", K. Chaudhuri, C. Monteleoni, A. Sarwate JMLR 2011
24
Fri, Nov 31
Privacy and Economics
"A Theory of Pricing Private Data" C. Li, D.  Li, G. Miklau, D. Suciu Corr 2012
Optional: "Privacy Consensus in Anonymization Systems via Game Theory", R. Karimi Adl, M. Askari, K. Barker, R. Safavi-Naini DBSec 2012
25
Wed, Dec 5
Project Presentations
(Deadline for final report)

26
Fri, Dec 7
Project Presentations