DATE: Tuesday, Feb. 28, 2006
TIME: 2:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: DNA Re-identification and Privacy in Distributed Environments
PRESENTER: Brad Malin
Carnegie Mellon University
ABSTRACT:

The incorporation of genomic data into personal electronic medical records poses many challenges to patient privacy. Proposed technologies for preserving patient privacy in shared genomic data lack formal proofs and, as a consequence, automated methods can link genomic data, devoid of explicit identifiers, to named individuals without "hacking" into private computer systems. In this talk, I focus on a specific re-identification threat that manifests in distributed health care environments. First, I introduce several algorithms that link genomic data to named individuals by leveraging unique features in patient-location visit patterns, or trails. Experimental evidence with real-world populations confirms that vulnerability to trail re-identification is neither trivial nor the result of isolated occurrences. Second, I present a secure computational protocol to disclose genomic data, such that the data is provably protected from trail re-identification. I conclude this talk with a discussion of additional challenges to genomic data privacy, as well as several directions for continuing research in formal computational methods for genomic data privacy.