Privacy, practical obscurity and the power of the Semantic Web

While attending a US Department of Homeland Security workshop on privacy (see also complete notes on the workshop), I had a chance to think a bit about one aspect of US privacy law that has an important bearing on Semantic Web applications handling personal information: the doctrine of ‘practical obscurity’. Practical obscurity is the legal doctrine that one may have a privacy interest in a compilation of information (aka a dossier) even though each piece of information composing the dossier is itself publicly available.
The doctrine of practical obscurity was first articulated in a US Supreme Court case, U.S. Department of Justice v. Reporters Committee for Freedom of the Press, 489 U.S. 749 (1989). The case concerned a reporter’s request for FOIA access to “rap sheets” (compilations of an individual’s arrest records). Even though the individual arrest records were found to be publicly available in all of the relevant courthouses, the compilation of those records was found to impinge on the privacy rights of the subject by burdening his “practical obscurity.” The Court explained:

[B]oth the common law and the literal understandings of privacy encompass the individual’s control of information concerning his or her person. In an organized society, there are few facts that are not at one time or another divulged to another. Thus the extent of the protection accorded a privacy right at common law rested in part on the degree of dissemination of the allegedly private fact and the extent to which the passage of time rendered it private. According to Webster’s initial definition, information may be classified as “private” if it is “intended for or restricted to the use of a particular person or group or class of persons: not freely available to the public.” Recognition of this attribute of a privacy interest supports the distinction, in terms of personal privacy, between scattered disclosure of the bits of information contained in a rap sheet and revelation of the rap sheet as a whole. The very fact that federal funds have been spent to prepare, index, and maintain these criminal-history files demonstrates that the individual items of information in the summaries would not otherwise be “freely available” either to the officials who have access to the underlying files or to the general public. Indeed, if the summaries were “freely available,” there would be no reason to invoke the FOIA to obtain access to the information they contain. Granted, in many contexts the fact that information is not freely available is no reason to exempt that information from a statute generally requiring its dissemination. But the issue here is whether the compilation of otherwise hard-to-obtain information alters the privacy interest implicated by disclosure of that information. Plainly there is a vast difference between the public records that might be found after a diligent search of courthouse files, county archives, and local police stations throughout the country and a computerized summary located in a single clearinghouse of information.

Much of the discussion at the workshop, whose participants included many distinguished data protection officials from Europe and Asia, reluctantly accepted the view that practical obscurity is fast becoming ineffective at protecting privacy, given the increasing availability of previously separate information sources through a common search interface (the Web). One discussant pointed out that practical obscurity never really protected privacy in the first place; it just made private information harder and more expensive to obtain.

With the success of the Semantic Web, which makes it easier to link and combine data from separate sources, we’re likely to see an even further decline in the ‘practical obscurity’ we may have come to rely upon.