Team for Research in
Ubiquitous Secure Technology

Privacy: Theory meets Practice on the Map.
Johannes Gehrke, Daniel Kifer, Ashwin Machanavajjhala, John Abowd, Lars Vilhuber

Citation
Johannes Gehrke, Daniel Kifer, Ashwin Machanavajjhala, John Abowd, Lars Vilhuber. "Privacy: Theory meets Practice on the Map." International Conference on Data Engineering, Cornell University Computer Science Department, Cornell, USA, 10, April, 2008.

Abstract
In this paper, we propose the first formal privacy analysis of a data anonymization process known as synthetic data generation, a technique becoming popular in the statistics community. The target application for this work is a mapping program that shows the commuting patterns of the population of the United States. The source data for this application were collected by the U.S. Census Bureau, but due to privacy constraints, they cannot be used directly by the mapping program. Instead, we generate synthetic data that statistically mimic the original data while providing privacy guarantees. We use these synthetic data as a surrogate for the original data. We find that while some existing definitions of privacy are inapplicable to our target application, others are too conservative and render the synthetic data useless since they guard against privacy breaches that are very unlikely. Moreover, the data in our target application are sparse, and none of the existing solutions are tailored to anonymize sparse data. In this paper, we propose solutions to address the above issues.

Citation formats  
  • HTML
    Johannes Gehrke, Daniel Kifer, Ashwin Machanavajjhala, John
    Abowd, Lars Vilhuber. <a
    href="http://www.truststc.org/pubs/463.html"
    >Privacy: Theory meets Practice on the Map.</a>,
    International Conference on Data Engineering, Cornell
    University Computer Science Department, Cornell, USA, 10,
    April, 2008.
  • Plain text
    Johannes Gehrke, Daniel Kifer, Ashwin Machanavajjhala, John
    Abowd, Lars Vilhuber. "Privacy: Theory meets Practice
    on the Map." International Conference on Data
    Engineering, Cornell University Computer Science
    Department, Cornell, USA, 10, April, 2008.
  • BibTeX
    @inproceedings{GehrkeKiferMachanavajjhalaAbowdVilhuber08_PrivacyTheoryMeetsPracticeOnMap,
        author = {Johannes Gehrke and Daniel Kifer and Ashwin
                  Machanavajjhala and John Abowd and Lars Vilhuber},
        title = {Privacy: Theory meets Practice on the Map.},
        booktitle = {International Conference on Data Engineering},
        organization = {Cornell University Computer Science Department,
                  Cornell, USA},
        pages = {10},
        month = {April},
        year = {2008},
        abstract = {In this paper, we propose the first formal privacy
                  analysis of a data anonymization process known as
                  synthetic data generation, a technique becoming
                  popular in the statistics community. The target
                  application for this work is a mapping program
                  that shows the commuting patterns of the
                  population of the United States. The source data
                  for this application were collected by the U.S.
                  Census Bureau, but due to privacy constraints,
                  they cannot be used directly by the mapping
                  program. Instead, we generate synthetic data that
                  statistically mimic the original data while
                  providing privacy guarantees. We use these
                  synthetic data as a surrogate for the original
                  data. We find that while some existing definitions
                  of privacy are inapplicable to our target
                  application, others are too conservative and
                  render the synthetic data useless since they guard
                  against privacy breaches that are very unlikely.
                  Moreover, the data in our target application are
                  sparse, and none of the existing solutions are
                  tailored to anonymize sparse data. In this paper,
                  we propose solutions to address the above issues.},
        URL = {http://www.truststc.org/pubs/463.html}
    }
    

Posted by Johannes Gehrke on 26 Aug 2008.

Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.