The Price of Privacy and the Limits of LP Decoding
- Cynthia Dwork ,
- Frank McSherry ,
- Kunal Talwar
Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC) |
Published by Association for Computing Machinery, Inc.
This work is at the intersection of two lines of research. One line, initiated by Dinur and Nissim, investigates the price, in accuracy, of protecting privacy in a statistical database. The second, growing from an extensive literature on compressed sensing (see in particular the work of Donoho and collaborators [14, 7, 13, 11]) and explicitly connected to error-correcting codes by Candès and Tao ([4]; see also [5, 3]), is in the use of linear programming for error correction.
Our principal result is the discovery of a sharp threshhold ρ∗ ≈ 0.239, so that if ρ < ρ∗ and A is a random m×n encoding matrix of independently chosen standard Gaussians, where m = O(n), then with overwhelming probability over choice of A, for all x ∈Rn, LP decoding corrects bρmc arbitrary errors in the encoding Ax, while decoding can be made to fail if the error rate exceeds ρ∗. Our bound resolves an open question of Candès, Rudelson, Tao, and Vershyin [3] and (oddly, but explicably) refutes empirical conclusions of Donoho [11] and Candès et al [3]. By scaling and rounding we can easily transform these results to obtain polynomialtime decodable random linear codes with polynomial-sized alphabets tolerating any ρ < ρ∗ ≈ 0.239 fraction of arbitrary errors.
In the context of privacy-preserving datamining our results say that any privacy mechanism, interactive or noninteractive, providing reasonably accurate answers to a 0.761 fraction of randomly generated weighted subset sum queries, and arbitrary answers on the remaining 0.239 fraction, is blatantly non-private.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.