Team for Research in
Ubiquitous Secure Technology

Approximation Algorithms for K-Anonymity
Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Rajeev Motwani, Rina Panigrahy, Dilys Thomas, An Zhu

Citation
Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Rajeev Motwani, Rina Panigrahy, Dilys Thomas, An Zhu. "Approximation Algorithms for K-Anonymity". Journal of Privacy Technology, November 2005.

Abstract
We consider the problem of releasing a table containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is {\em $k$-anonymization}. A release is considered {\em $k$-anonymous} if the information corresponding to any individual in the release cannot be distinguished from that of at least $k-1$ other individuals whose information also appears in the release. In order to achieve {\em $k$-anonymization}, some of the entries of the table are either suppressed or {\em generalized} (e.g. an Age value of 23 could be changed to the Age range $20$-$25$). The goal is lose as little information as possible while ensuring that the release is {\em $k$-anonymous}. This optimization problem is referred to as the {\em $k$-{\sc Anonymity} problem}. We show that the $k$-{\sc Anonymity} problem is NP-hard even when the attribute values are ternary and we are allowed only to suppress entries. On the positive side, we provide an $O(k)$-approximation algorithm for the problem. This improves upon the previous best-known $O(k\log k)$-approximation. We also give improved positive results for the interesting cases with specific values of $k$ --- in particular, we give a 1.5-approximation algorithm for the special case of 2-{\sc Anonymity}, and a 2-approximation algorithm for 3-{\sc Anonymity}.

Electronic downloads

  • k-anonymity-jopt.pdf · application/pdf · 185 kbytes. (A preliminary version appeared as Anonymizing Tables, ICDT 2005.)
Citation formats  
  • HTML
    Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Rajeev
    Motwani, Rina Panigrahy, Dilys Thomas, An Zhu. <a
    href="http://www.truststc.org/pubs/100.html"
    >Approximation Algorithms for K-Anonymity</a>,
    <i>Journal of Privacy Technology</i>, November
    2005.
  • Plain text
    Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Rajeev
    Motwani, Rina Panigrahy, Dilys Thomas, An Zhu.
    "Approximation Algorithms for K-Anonymity".
    <i>Journal of Privacy Technology</i>, November
    2005.
  • BibTeX
    @article{AggarwalFederKenthapadiMotwaniPanigrahyThomasZhu05_ApproximationAlgorithmsForKAnonymity,
        author = {Gagan Aggarwal, Tomas Feder, Krishnaram
                  Kenthapadi, Rajeev Motwani, Rina Panigrahy, Dilys
                  Thomas, An Zhu},
        title = {Approximation Algorithms for K-Anonymity},
        journal = {Journal of Privacy Technology},
        month = {November},
        year = {2005},
        abstract = {We consider the problem of releasing a table
                  containing personal records, while ensuring
                  individual privacy and maintaining data integrity
                  to the extent possible. One of the techniques
                  proposed in the literature is {\em
                  $k$-anonymization}. A release is considered {\em
                  $k$-anonymous} if the information corresponding to
                  any individual in the release cannot be
                  distinguished from that of at least $k-1$ other
                  individuals whose information also appears in the
                  release. In order to achieve {\em
                  $k$-anonymization}, some of the entries of the
                  table are either suppressed or {\em generalized}
                  (e.g. an Age value of 23 could be changed to the
                  Age range $20$-$25$). The goal is lose as little
                  information as possible while ensuring that the
                  release is {\em $k$-anonymous}. This optimization
                  problem is referred to as the {\em $k$-{\sc
                  Anonymity} problem}. We show that the $k$-{\sc
                  Anonymity} problem is NP-hard even when the
                  attribute values are ternary and we are allowed
                  only to suppress entries. On the positive side, we
                  provide an $O(k)$-approximation algorithm for the
                  problem. This improves upon the previous
                  best-known $O(k\log k)$-approximation. We also
                  give improved positive results for the interesting
                  cases with specific values of $k$ --- in
                  particular, we give a 1.5-approximation algorithm
                  for the special case of 2-{\sc Anonymity}, and a
                  2-approximation algorithm for 3-{\sc Anonymity}. },
        URL = {http://www.truststc.org/pubs/100.html}
    }
    

Posted by Dilys Thomas on 30 May 2006.
For additional information, see the Publications FAQ or contact webmaster at www truststc org.

Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.