8 May 2007
WWW2007 Workshop: Query Log Analysis: Social and Technological Challenges
Daniel J. Weitzner
Decentralized Information Group
MIT Computer Science and Artificial Intelligence Laboratory
1. The advancing privacy challenge
2. Help from the history of the evolution of privacy and technology
3. Possible responses to privacy in query logs: two unsatisfying approaches and one new possibility
Overall transparency is a major factor in query-log-related privacy risks.
Saltzer and Schroeder (The Protection of Information in Computer Systems):
“The term “privacy” denotes a socially defined ability of an individual (or organization) to determine whether, when, and to whom personal (or organizational) information is to be released.”
Historical foundations - the home
"The house of everyone is to him as his castle and fortress, as well for his defence against injury and violence, as for his repose...."
Semayne's Case, All ER Rep 62 (Michaelmas Term 1604)
"Ways may some day be developed by which the Government, without removing papers from secret drawers, can reproduce them in court, and by which it will be enabled to expose to a jury the most intimate occurrences of the home.... Can it be that the Constitution affords no protection against such invasions of individual security?"
Olmstead v. United States, 277 U.S. 438, 467 (1928) (Brandeis, J., dissenting)
"The Fourth Amendment protects people, not places. What a person knowingly exposes to the public, even in his own home or office, is not a subject of Fourth Amendment protection.... But what he seeks to preserve as private, even in an area accessible to the public, may be constitutionally protected."
Katz v. United States, 389 U.S. 347 (1967)
"It would be foolish to contend that the degree of privacy secured to citizens by the Fourth Amendment has been entirely unaffected by the advance of technology...."
Kyllo v. United States, 533 U.S. 27 (2001) (Scalia, J.)
Goal: construct a database protocol that limits information access according to a formal definition of privacy
Privacy Definition: indistinguishability of the individual from the community
Method: measure the epsilon-indistinguishability of a database query transcript
Differential Privacy, Cynthia Dwork, 33rd International Colloquium on Automata, Languages and Programming, ICALP 2006, Part II, pp. 1–12, 2006.
see also Sweeney's k-anonymity work
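One common way to obtain this kind of epsilon-indistinguishability for a counting query is to add Laplace noise scaled to 1/epsilon. The following is a minimal sketch, not the protocol from the cited paper: the function names, the record layout (one row per user) and the example data are all assumptions made for illustration. The one-row-per-user assumption matters, because it is what keeps the sensitivity of the count at 1.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, predicate, epsilon, rng=None):
    """Answer "how many users match predicate?" with Laplace noise.

    Assumption: `rows` holds exactly one row per user, so adding or
    removing one user changes the true count by at most 1
    (sensitivity 1). Noise with scale 1/epsilon then makes the
    answers on two neighbouring databases epsilon-indistinguishable.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical query log, already grouped to one row per user.
    log = [
        {"user": "u1", "queries": ["flu symptoms", "pharmacy hours"]},
        {"user": "u2", "queries": ["weather boston"]},
        {"user": "u3", "queries": ["flu shot"]},
    ]

    def searched_flu(row):
        return any("flu" in q for q in row["queries"])

    # Smaller epsilon means more noise and a stronger guarantee.
    print(noisy_count(log, searched_flu, epsilon=0.5))
```

Each answer released this way consumes part of a privacy budget, so a real deployment would also have to track the total epsilon spent across queries, which this sketch leaves out.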
A privacy-safe zone: Privacy-sensitive data mining establishes a boundary that, if respected, assures no privacy risk to the individual.
IRBs and the basic privacy notice-and-consent model. Can today's privacy model (EU or US) be sufficient going forward?
Key will be purpose limitation, but we have a dilemma...
Dilemma: limited individual and regulatory capacity to control escalating data collection.
Current result of consent dilemma + increased inference power: strict about what's collected but loose about usage
Better result: loose about what is collected and strict about usage
General view (amongst the 'digerati'): law has to catch up with new technology.
General question: how will laws catch up?
My question: How will the Web finally catch up with the 'real world'? In everyday life, the vast majority of 'policy' problems get worked out without recourse to the legal system.
Design goal: instrument the Web to provide seamless social interactions that allow us to avoid the legal system the way we do in the rest of life
Global perspective: In the shift from centralized to decentralized information systems we see a general trend:
ex ante policy enforcement barriers -> policy description with late binding of rules for accountability
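The shift from ex ante barriers to late-bound accountability rules can be illustrated with a small sketch. It is only one possible reading of the idea, and every name in it (UsageEvent, PolicyAwareLog, the purposes and identifiers) is hypothetical. Uses of data are never blocked up front; they are recorded, and the policy description is consulted only when the log is audited.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UsageEvent:
    actor: str
    data_id: str
    purpose: str


@dataclass
class PolicyAwareLog:
    # data_id -> set of purposes the data may be used for
    policies: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def describe(self, data_id, allowed_purposes):
        """Attach (or replace) the policy description for a data item."""
        self.policies[data_id] = frozenset(allowed_purposes)

    def record_use(self, actor, data_id, purpose):
        """Record a use. Nothing is blocked here: no ex ante barrier."""
        self.events.append(UsageEvent(actor, data_id, purpose))

    def audit(self):
        """Late binding: rules are looked up at audit time, not use time.

        Returns every recorded use whose purpose is not permitted by
        the policy currently attached to the data item.
        """
        return [
            e for e in self.events
            if e.purpose not in self.policies.get(e.data_id, frozenset())
        ]


if __name__ == "__main__":
    log = PolicyAwareLog()
    log.describe("querylog-2007-05", {"ranking-research"})
    log.record_use("alice", "querylog-2007-05", "ranking-research")
    log.record_use("bob", "querylog-2007-05", "ad-targeting")
    for violation in log.audit():
        print(violation)
```

The design choice to note is that collection and access stay loose while usage is held to account after the fact, which mirrors the "loose about what is collected and strict about usage" outcome described earlier.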
For more information see:
Work described here is supported by the US National Science Foundation Cybertrust Program (05-518) and ITR Program (04-012).
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.