Computer scientists at the School of Engineering and Applied Science have designed a “reconstruction attack” that proves U.S. Census data is vulnerable to exposure and theft.
Aaron Roth, Henry Salvatori Professor of Computer & Cognitive Science in Computer and Information Science (CIS), and Michael Kearns, National Center Professor of Management & Technology in CIS, led a PNAS study demonstrating that statistics released by the U.S. Census Bureau can be reverse engineered to reveal protected information about individual respondents. Using computing power no greater than that of a commercial laptop and algorithms drawn from machine learning fundamentals, the research team demonstrated concrete risks to the privacy of the U.S. population.
The study is the first of its kind to establish a baseline for unacceptable susceptibility to exposure. It also proves that an attacker can ascertain the likelihood that a reconstructed record corresponds to the data of a real person, which makes it more probable that this kind of attack could leave respondents vulnerable to identity theft or discrimination.
The findings raise the stakes of one of the digital era’s most significant debates in public policy.
“Over the last two decades it has become clear that practices in widespread use for data privacy—anonymizing or masking records, coarsening granular responses or aggregating individual data into large-scale statistics—do not work,” says Kearns. “In response, computer scientists have created techniques to provably guarantee privacy.”
“The private sector,” adds Roth, “has been applying these techniques for years. But the Census’ long-running statistical programs and policies have additional complications attached.”
This story is by Devorah Fischler. Read more at Penn Engineering Today.