MIT CSAIL

Policy Compliance of Queries for Private Information Retrieval


Overview

Private Information Retrieval (PIR) techniques enable a client to retrieve items from a co-operating database without revealing either the query or the items being retrieved. However, as both the query and the results are hidden from the database owner, it is in principle possible for the client to access information that she is not authorized to access. To prevent this, it must be possible to prove that the queries being posed comply with a set of privacy policies previously agreed upon by the client and server. Policy assurance deals with proving that queries made by the client conform to mandated policies and that leakage of sensitive information is not possible.

We propose to extend our AIR (Accountability in RDF) policy language to capture the semantics of query-based privacy policies. The AIR policy language is aimed at meeting the policy compliance requirements of open, decentralized information infrastructures such as the World Wide Web and large enterprise systems. It uses dependency tracking to provide detailed explanations for policy compliance and non-compliance. The explanation feature will be particularly useful in this program because it will allow database owners to check the correctness of their policy and allow users to trust that their policy is being enforced correctly. We will also modify the AIR reasoner to understand the properties of these queries and incorporate compliance checking for both individual queries and combinations of queries.

In stage 1, we will support policy compliance over SPARQL queries by extending SPASQL to express a subset of SPARQL as RDF graphs so that AIR policies can be written over different components of queries.

In stage 2, we will allow policies to be written at a higher level and to be less dependent on the query and database structure.

In stage 3, we will support SQL queries, either by converting SQL to RDF directly or via SPARQL.

Policy Assurance Tools

We have developed several tools to help develop SPARQL-based policies as well as view the results of policy inference.

Justification User Interface

As explanations are usually in the form of proof trees, which might be incomprehensible to end users, we have developed a graphical Justification User Interface in Tabulator, a Firefox extension for Semantic Web browsing. The interface allows users to view the explanation provided by the AIR reasoner in two ways: (i) in a simple Semantic Web based rule language, and (ii) in a graphical layout that highlights the result of the reasoning, shows both its natural-language explanation and its specific premises (or dependencies), and allows these explanations to be explored.

Download the Tabulator Firefox extension to view the demos below.

SPARQL to N3 Translator

As our tools are based on Semantic Web technologies, we require the queries to be in a compatible format as well. SPARQL, unfortunately, is not expressed in RDF, so SPARQL queries must be translated into RDF. Our first attempt at SPARQL translation led to a detailed ontology in RDF that captured most of the semantics of SPARQL. Though this was useful research, it led to lengthy and complex policies. We realized that we could not continue with this translation, so we developed a simplified ontology. It flattens the earlier ontology, which loses most of the semantics of SPARQL, but it is the smallest ontology we could come up with that maintains the components of the query that we require for reasoning, and it greatly reduces the size and complexity of our policies. The SPARQL to N3 service accepts SPARQL queries and returns the translated query in our simplified ontology.

Policy Generator

We support automated policy generation using policy templates for restriction, inclusion, exclusion, chaining, and default deny. The requirements for each policy are different, so please visit the policy generator page for further details. The policy generator outputs an AIR policy in N3.

Policy Execution Service

The Policy Execution Page accepts the URI of a policy and the URI of a SPARQL query in N3 as input. It passes these along to the AIR reasoner, and displays the reasoning output in a Web browser. If you have Tabulator installed, the results will appear automatically in the Justification UI.

SPARQL Endpoint

We loaded the test database into a SPARQL endpoint and gave it an easy-to-use front end.

Use Cases

Use Case 0: Understanding Structure of SPARQL policies

The database contains personal information including SSNs, OpenID URIs, names, contact details, etc. This use case is based on the initial SPARQL translation. The policy states that if the SSN is referred to in the query, either as the requested value or just to filter the data, the query is non-compliant.
  • Example query 1 (Not compliant. Refers to SSN in WHERE clause, and outputs SSN data.) Demo
  • Example query 2 (Not compliant. Refers to SSN in WHERE clause.) Demo
  • Example query 3 (Not compliant. Refers to SSN in WHERE clause, does a FILTER on the SSN, and outputs SSN data.) Demo
  • Example query 4 (Compliant. Does not contain any references to SSN.) Demo
  • Example query 5 (Not compliant. Refers to SSN in the OPTIONAL section of the WHERE clause, and outputs SSN data if it is available.) Demo
  • Example query 6 (Compliant. Does not contain any references to SSN.) Demo
  • Example query 7 (Not compliant. Refers to SSN in the OPTIONAL section of the WHERE clause, even though it does not output SSN data.) Demo
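
The rule behind these seven examples can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the AIR policy itself: the `Query` class, the predicate URIs under `http://example.org/`, and the function names are all assumptions introduced here, and a query is modelled as plain lists of triple patterns rather than as translated N3.

```python
from dataclasses import dataclass, field

# Hypothetical predicate URIs; the vocabulary of the demo database is not shown here.
SSN = "http://example.org/person#ssn"
NAME = "http://example.org/person#name"


@dataclass
class Query:
    select: list                                   # projected variables, e.g. ["?name"]
    where: list                                    # required (subject, predicate, object) patterns
    optional: list = field(default_factory=list)   # patterns inside OPTIONAL


def refers_to_ssn(query):
    """True if any triple pattern, required or optional, uses the SSN predicate."""
    return any(pred == SSN for _, pred, _ in query.where + query.optional)


def check(query):
    """A query is non-compliant as soon as it mentions SSN anywhere."""
    return "non-compliant" if refers_to_ssn(query) else "compliant"


# Shaped like example query 7: SSN appears only in OPTIONAL and is not output.
q7 = Query(select=["?name"],
           where=[("?p", NAME, "?name")],
           optional=[("?p", SSN, "?ssn")])

# Shaped like example query 4: no reference to SSN at all.
q4 = Query(select=["?name"], where=[("?p", NAME, "?name")])
```

Because the check looks at the patterns rather than at the projected variables, a query that only filters on SSN is rejected in the same way as one that outputs it.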

Use Case 1: Types of Policies

We move away from a particular kind of database and try to generalize the building blocks of queries that we might want to build. The generic policies include:

Use Case 2: Using External Semantic Web Information

  • Scenario: The user may not retrieve attribute X when filtering based on condition Y.
  • Scenario: The user may ONLY retrieve attributes X, Y, and Z from a database. This is a kind of "default deny" policy.
    • Example: The user may only access the first and last name of users in a particular database.
    • Implementation: This is equivalent to blocking, as defined above, using the log:notIncludes operator instead of log:includes.
    • See the sample policy, and two examples: Example 1 (compliant), and Example 2 (non-compliant).
  • Scenario: The user may not retrieve attribute X or any of its subclasses.
    • Example: The user may not retrieve addresses of people living in New England.
    • Implementation: The policy first checks whether the 'where' clause contains predicates relating to state, zipcode, city, or address. If it does, then it looks to see if the object is a subclass of New England.
    • See the sample policy, and two examples: Example 1 (compliant), and Example 2 (non-compliant).
    • Here is this policy in its generic form. Note that this one uses the new translation of SPARQL into N3. Example 1 should be non-compliant, and Example 2 should be compliant.
  • Scenario: The user may only retrieve boolean values from a table.
    • (Silly) Example: The user can only query whether there exist people over eighteen, but not retrieve names or ages.
    • Here is this policy in its generic form. Note that this one uses the new translation of SPARQL into N3. Example 1 should be non-compliant, and Example 2 should be compliant.
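
Two of the scenarios above, default deny and the subclass restriction, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the variable names, the short predicate names, and the `PARENTS` hierarchy are hypothetical, and plain set and graph operations stand in for the log:includes / log:notIncludes matching that the AIR policies use.

```python
# Hypothetical data throughout: variable names, predicate names, and the
# class hierarchy are illustrative and not taken from the demo database.
ALLOWED_VARIABLES = {"?firstName", "?lastName"}
LOCATION_PREDICATES = {"state", "zipcode", "city", "address"}
PARENTS = {
    "Boston": ["Massachusetts"],
    "Massachusetts": ["NewEngland"],
    "California": ["WestCoast"],
}


def default_deny(selected, allowed=ALLOWED_VARIABLES):
    """Compliant only if every selected variable is on the allow-list."""
    return "compliant" if set(selected) <= set(allowed) else "non-compliant"


def is_subclass(cls, ancestor, parents):
    """Follow subclass links upward from cls, looking for ancestor."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False


def superclass_policy(where, parents, forbidden="NewEngland"):
    """Non-compliant if a location predicate points at the forbidden class
    or at anything below it in the hierarchy."""
    for _, pred, obj in where:
        if pred in LOCATION_PREDICATES and is_subclass(obj, forbidden, parents):
            return "non-compliant"
    return "compliant"
```

For instance, `superclass_policy([("?p", "city", "Boston")], PARENTS)` walks Boston, Massachusetts, NewEngland and so rejects the query, while the same pattern with "California" never reaches NewEngland.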

Use Case 3: Query History

Use Case 3 introduces support for query history. If a policy supports history, the policy makes its compliance decision based not only on the current query that a user makes to the database, but also on the user's entire history of queries. The current reference implementation of a history-aware policy demonstrates one possibility for the format of such a policy. We are still working on a good online demo for query history.
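
One way to picture a history-aware decision is the Python sketch below. It is not the reference implementation mentioned above; the class name, the idea of a "forbidden combination" of attributes, and the attribute names are assumptions chosen to show how earlier queries can change the verdict on the current one.

```python
class HistoryAwarePolicy:
    """Rejects a query when the attributes it requests, combined with those
    the same user has already retrieved, cover a forbidden combination."""

    def __init__(self, forbidden_combination):
        self.forbidden = frozenset(forbidden_combination)
        self.history = {}  # user -> set of attributes retrieved so far

    def check(self, user, attributes):
        seen = self.history.setdefault(user, set())
        combined = seen | set(attributes)
        if self.forbidden <= combined:
            # Assumption: a rejected query returns nothing, so it is not recorded.
            return "non-compliant"
        seen.update(attributes)
        return "compliant"


# Hypothetical rule: name, zipcode and birthdate must never all be retrieved
# by the same user, even across separate queries.
policy = HistoryAwarePolicy({"name", "zipcode", "birthdate"})
```

With this rule, a user who has already asked for names and zipcodes is refused birthdates, while a different user asking only for birthdates is not.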

Use Case 4: Controlling Source/Graphs

We should also consider policies that control which tables a user has access to. The following generic policies do just that. Note that these all use the new translation of SPARQL.

  • From Prohibited Policy
    • A policy that prohibits the user from using the FROM or FROM NAMED statement with a generic set of URIs "http://URIgeneric1.html", "http://URIgeneric2.html", "http://URIgeneric3.html".
    • Here is the policy. Query 1 should be non-compliant, whereas Query 2 should be compliant.
  • From Required Policy
    • Requires the user to specify the location of the database at "http://URIgeneric1.html", "http://URIgeneric2.html", or "http://URIgeneric3.html".
    • Here is the policy. Query 1 should be compliant, whereas Query 2 should be non-compliant.
  • Default Deny From Policy
    • Allows the user to retrieve information only from a set of predetermined graphs.
    • Here is the policy. Query 1 should be compliant, whereas Query 2 should be non-compliant.
  • Two Sources Policy
    • A generic policy that specifies the following: if a user searches in a table "http://URIgeneric1.html", then the user may not search in "http://URIgeneric2.html".
    • Here is the policy. Query 1 should be non-compliant, whereas Query 2 should be compliant.
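
The four source policies above reduce to simple set tests over the URIs named in a query's FROM and FROM NAMED clauses. The Python sketch below illustrates that logic only; the function names are assumptions, the URIs are the generic placeholders used above, and treating a query with no FROM clause as compliant under default deny is a choice made here, not something the policies specify.

```python
G1 = "http://URIgeneric1.html"
G2 = "http://URIgeneric2.html"
G3 = "http://URIgeneric3.html"
GENERIC = frozenset({G1, G2, G3})


def from_prohibited(sources, prohibited=GENERIC):
    """Non-compliant if any named source is on the prohibited list."""
    return "non-compliant" if set(sources) & prohibited else "compliant"


def from_required(sources, required=GENERIC):
    """Compliant only if at least one of the required sources is named."""
    return "compliant" if set(sources) & required else "non-compliant"


def default_deny_from(sources, allowed=GENERIC):
    """Compliant only if every named source is on the allow-list.
    Assumption: an empty source list passes."""
    return "compliant" if set(sources) <= allowed else "non-compliant"


def two_sources(sources, first=G1, second=G2):
    """Non-compliant if both mutually exclusive sources are named together."""
    named = set(sources)
    return "non-compliant" if first in named and second in named else "compliant"
```

Here `sources` is the list of graph URIs taken from the query, for example `[G1, "http://other.example/data"]`, which passes `from_required` but fails `default_deny_from`.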

Use Case 5: Meta-Policies and More Control

The AIR policy language also supports more sophisticated policy control. For example, we can place several policies in one document. This allows us to build base policies that are used frequently and combine them as we see fit: one policy can check which variables are retrieved while another concurrently checks where they are retrieved from.

Another useful possibility is to nest policies, so that some act only depending on the outcome of another policy. In the following example, the RtPolicy, which is simply a Superclass policy in disguise (see above), is checked first. If it is non-compliant, then the second policy, Then_Policy, will take effect. Here, Then_Policy is rather simple and just returns that the query is compliant. However, we could check for more conditions by adding rules, making it a non-trivial policy.

The policy in its raw form, and an example query.
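
Combining and nesting can be pictured with two small higher-order functions in Python. This sketch follows the description above literally (the second policy runs only when the first reports non-compliance) and makes no claim about how AIR evaluates nested rules; `all_of`, `nested`, and the stand-in policies `rt_policy`, `then_policy`, and `source_policy` are hypothetical names, and queries are plain dictionaries.

```python
def all_of(*policies):
    """Several policies in one document: compliant only if every one agrees."""
    def combined(query):
        ok = all(policy(query) == "compliant" for policy in policies)
        return "compliant" if ok else "non-compliant"
    return combined


def nested(first, then):
    """Run `then` only when `first` reports non-compliance; otherwise
    keep the verdict of `first`."""
    def combined(query):
        verdict = first(query)
        return then(query) if verdict == "non-compliant" else verdict
    return combined


# Stand-in policies over a dictionary-shaped query (illustrative only).
def rt_policy(query):
    return "non-compliant" if "NewEngland" in query["regions"] else "compliant"


def then_policy(query):
    return "compliant"  # trivial, like Then_Policy in the example


def source_policy(query):
    return "compliant" if query["source"] == "http://URIgeneric1.html" else "non-compliant"


nested_policy = nested(rt_policy, then_policy)
both = all_of(rt_policy, source_policy)
```

Replacing `then_policy` with a function that inspects the query is the counterpart of adding rules to make the nested policy non-trivial.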



maintained by Lalana Kagal
$Revision: 31138 $
$Date: 2011-08-23 16:59:02 -0400 (Tue, 23 Aug 2011) $