Overview
Our goal is to leverage existing Semantic Web technologies to provide policy assurance.
SPARQL Queries
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?s ?id ?n WHERE {
?s foaf:ssn ?n.
?s foaf:age ?a.
?s foaf:openid ?id.
FILTER (?a > 18)
}
What does this do?
- This is a SPARQL query. The syntax looks like SQL - by design.
- It returns a table of the entity, OpenID, and SSN for every entity over age 18.
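To make the semantics concrete, here is a minimal Python sketch that evaluates the same pattern over a hand-made list of triples. The sample data and the helper names `values` and `select_adults` are hypothetical; a real deployment would hand the query to a SPARQL engine instead.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", FOAF + "ssn", "000-00-0001"),
    ("alice", FOAF + "age", 34),
    ("alice", FOAF + "openid", "https://example.org/alice"),
    ("bob", FOAF + "ssn", "000-00-0002"),
    ("bob", FOAF + "age", 17),
    ("bob", FOAF + "openid", "https://example.org/bob"),
]


def values(triples, subject, predicate):
    """Return every object stored for (subject, predicate)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def select_adults(triples):
    """Mimic SELECT ?s ?id ?n ... FILTER (?a > 18) with nested loops."""
    rows = []
    for s in sorted({s for s, _, _ in triples}):
        for n in values(triples, s, FOAF + "ssn"):
            for a in values(triples, s, FOAF + "age"):
                for openid in values(triples, s, FOAF + "openid"):
                    if a > 18:  # the FILTER clause
                        rows.append((s, openid, n))
    return rows


if __name__ == "__main__":
    for row in select_adults(TRIPLES):
        print(row)
```

Each nested loop corresponds to one triple pattern in the WHERE clause; the `if` plays the role of FILTER.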
SPARQL Queries
The AIR reasoner cannot understand SPARQL, but it can understand N3,
a human-readable representation of RDF. We provide an automated
tool, sparql2n3, to convert from SPARQL to N3.
$ ./sparql2n3 query1.rq
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix s: <http://dig.csail.mit.edu/2009/IARPA-PIR/sparql#> .
@prefix : <http://dig.csail.mit.edu/2009/IARPA-PIR/query1#> .
:Query a s:Select;
s:cardinality :ALL;
s:POSList [
s:variable :s;
s:variable :id;
s:variable :n;
];
s:WhereClause :WHERE.
:WHERE a s:DefaultGraphPattern;
s:TriplePattern { :s <http://xmlns.com/foaf/0.1/ssn> :n };
s:TriplePattern { :s <http://xmlns.com/foaf/0.1/age> :a };
s:TriplePattern { :s <http://xmlns.com/foaf/0.1/openid> :id };
s:Filter [
a s:ComparatorExpression;
s:TriplePattern { :a s:BooleanGT "18"^^xsd:integer };
];
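The mapping from SPARQL triple patterns to the `s:TriplePattern` lines above can be pictured with a short Python sketch. This is not the sparql2n3 tool itself; `term_to_n3` and `patterns_to_n3` are hypothetical helpers that assume variables are written `?name` and every other term is a full IRI.

```python
def term_to_n3(term):
    """Render a SPARQL variable as :name and a full IRI as <iri>."""
    if term.startswith("?"):
        return ":" + term[1:]
    return "<" + term + ">"


def patterns_to_n3(patterns):
    """Emit one s:TriplePattern line per (subject, predicate, object) pattern."""
    lines = []
    for pattern in patterns:
        body = " ".join(term_to_n3(t) for t in pattern)
        lines.append("s:TriplePattern { " + body + " };")
    return lines


if __name__ == "__main__":
    where = [
        ("?s", "http://xmlns.com/foaf/0.1/ssn", "?n"),
        ("?s", "http://xmlns.com/foaf/0.1/age", "?a"),
    ]
    print("\n".join(patterns_to_n3(where)))
```

The sketch covers only the WHERE-clause triples; prefixes, the variable list, and FILTER expressions would need their own handling.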
Policies in AIR
Now that we have converted a query into something that our reasoner
can investigate, we need to consider the policies themselves.
What form will our policies take? What kinds of policies can we
create?
- We can test for particular kinds of variables in parts of clauses.
- We can make compliance decisions based on logical constructs.
- We can reason over a user's query history.
- There is enough expressivity to encode interesting policies.
Example: SSN Restriction Policy
An example of a simple policy: "if an SSN is referred
to in the query, either as the requested value or just to
filter the data, the query is non-compliant."
- Define SSN: <http://xmlns.com/foaf/0.1/ssn>.
- Check different parts of a query for SSN.
- Return as much compliance information as possible.
Creating the SSN policy in AIR
- What parts of a query can contain references to the SSN?
- WHERE clause in SELECT
- OPTIONAL part of a WHERE clause
- FILTER clause
- A variable that is bound to SSN elsewhere.
- ...the list continues. These are properties of SPARQL structure.
- How do we create policies that catch these references?
- AIR rules implement pattern matching
- Rules can be chained to form policies
- Independent policies return information
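The checklist above can be sketched in Python over a simplified query representation. The dictionary layout (`where`, `optional`, `filter`) and the function `find_ssn_references` are assumptions made for illustration; the real checks are AIR rules over the N3 form of the query.

```python
SSN = "http://xmlns.com/foaf/0.1/ssn"


def find_ssn_references(query):
    """Return the query parts that mention SSN, directly or through a variable."""
    findings = []
    ssn_vars = set()
    # Direct mentions: SSN used as a predicate in WHERE or OPTIONAL.
    for part in ("where", "optional"):
        for _subject, predicate, obj in query.get(part, []):
            if predicate == SSN:
                findings.append(part)
                ssn_vars.add(obj)  # remember the variable bound to the SSN
    # Indirect mentions: a FILTER that tests a variable bound to the SSN.
    for variable, _operator, _value in query.get("filter", []):
        if variable in ssn_vars:
            findings.append("filter")
    return findings


if __name__ == "__main__":
    query = {
        "where": [("?s", SSN, "?n")],
        "filter": [("?n", "=", "000-00-0001")],
    }
    print(find_ssn_references(query))
```

Returning every finding, rather than stopping at the first, matches the goal of returning as much compliance information as possible.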
SSN policy in AIR: the WHERE clause rule
First, a sanity check to see that this is a SPARQL query.
:SSN_RULE1 a air:BeliefRule;
air:label "SSN policy rule 1";
air:description (:Q " is a SPARQL query with a WHERE clause.");
air:pattern {
:Q a s:Select;
s:POSList :P;
s:WhereClause :W.
};
Now, let's check to see if SSN is mentioned directly in the WHERE clause.
:SSN_RULE4 a air:BeliefRule;
air:label "SSN policy rule 4";
air:description ("The query, " :Q ", includes a reference to the SSN in the WHERE clause");
air:pattern {
:P s:variable :V.
:W s:TriplePattern :T.
:T log:includes { :X <http://xmlns.com/foaf/0.1/ssn> :Y }
};
air:assert { :Q air:non-compliant-with :SSNPolicy };
SSN policy in AIR: OPTIONAL and FILTER
We can continue to find mentions of SSN in the OPTIONAL part of a WHERE clause...
:SSN_OP02 a air:BeliefRule;
air:label "SSN optional clause rule 02";
air:pattern {
:W s:OptionalGraphPattern :O.
:O s:TriplePattern :T.
:T log:includes { [] <http://xmlns.com/foaf/0.1/ssn> [] }
};
air:description ("The query, " :Q ", includes a reference to the SSN in the OPTIONAL part of the WHERE clause");
air:assert { :Q air:non-compliant-with :SSNPolicy_OptionalClause }.
...or as a FILTER.
:SSN_FR02 a air:BeliefRule;
air:label "SSN filter rule 02";
air:pattern {
:P s:variable :V.
:W s:TriplePattern :T.
:T log:includes { :X <http://xmlns.com/foaf/0.1/ssn> :V }.
:W s:Filter :F.
:F s:TriplePattern :S.
:S log:includes { :V [] [] }.
};
air:description ("The query, " :Q ", filters on SSN variables");
air:assert { :Q air:non-compliant-with :SSNPolicy_FilterRule }.
Rules and Policies in AIR
In summary,
- A rule (previous slides) performs a check and returns true or false.
- Rules can make assertions and call other rules.
- A policy may contain multiple rules, but only makes one assertion.
- A policy file can contain multiple policies, and make up to one
assertion per policy.
- We check queries (or "log files" in AIR lingo) against policies (or
"rules files").
The SSN policy file shows how all of these interact.
Passing or failing various tests gives us information about the
query, leading up to a decision regarding the compliance of the
query.
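One way to picture rule chaining is the Python sketch below. It is a loose analogy rather than AIR's actual semantics: the `Rule` class, its `then` and `alt` links, and the `evaluate` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SSN = "http://xmlns.com/foaf/0.1/ssn"


@dataclass
class Rule:
    label: str
    pattern: Callable[[dict], bool]   # the check: returns True or False
    assertion: Optional[str] = None   # what the rule asserts on a match
    then: Optional["Rule"] = None     # rule to try after a match
    alt: Optional["Rule"] = None      # rule to try after a failed match


def evaluate(rule, query):
    """Walk the chain; return (assertion, labels of the rules that matched)."""
    trace = []
    assertion = None
    while rule is not None:
        if rule.pattern(query):
            trace.append(rule.label)
            if rule.assertion is not None:
                assertion = rule.assertion  # one assertion ends the policy
                break
            rule = rule.then
        else:
            rule = rule.alt
    return assertion, trace


# A tiny policy: sanity check first, then look for SSN, else compliant.
no_ssn = Rule("no SSN", lambda q: True, assertion="compliant")
mentions_ssn = Rule(
    "mentions SSN",
    lambda q: any(p == SSN for _, p, _ in q["where"]),
    assertion="non-compliant",
    alt=no_ssn,
)
is_select = Rule("is a SELECT query", lambda q: q["form"] == "SELECT", then=mentions_ssn)

if __name__ == "__main__":
    print(evaluate(is_select, {"form": "SELECT", "where": [("?s", SSN, "?n")]}))
```

The trace of matched rules is the "information about the query" gathered on the way to the single compliance decision.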
Finding Structure: Logical Constructs
Many policies build upon basic constructs.
- ~A: Restriction (not). May not view something of type:A, at all.
- e.g. the running example: "cannot query a user's SSN."
- A (+) B: Exclusion (xor). May view type:A or type:B, but never both.
- e.g. "cannot query a user's bank account number and SSN."
- A <-> B: Inclusion (and). May only view type:A if type:B is also present.
- e.g. "can only get the photo of users over 18."
- A -> ~B: Blocking (ordered xor). Viewing type:A prevents viewing type:B.
- e.g. "cannot query for driver's license number after having queried SSN."
- Exclusion and blocking generalize to max(M,N).
- max(M,N) is defined as "may view up to M fields of N for M ≤ N."
- Easy to understand, difficult to program directly.
- e.g. "may know up to 3 of: last name, DOB, SSN, driver's license number, bank account number."
We can generalize to create policy "templates".
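As a rough illustration of max(M,N), the Python sketch below counts how many restricted fields a query touches. The field names and the functions `within_max` and `exclusion_ok` are hypothetical, and the sketch ignores ordering, so it models exclusion but not blocking.

```python
# Hypothetical set of N = 5 restricted fields.
RESTRICTED = {"last_name", "dob", "ssn", "license", "bank_account"}


def within_max(viewed, restricted, m):
    """True if at most m of the restricted fields appear among the viewed fields."""
    return len(set(viewed) & set(restricted)) <= m


def exclusion_ok(viewed, a, b):
    """Exclusion (A xor B) is the special case max(1, 2)."""
    return within_max(viewed, {a, b}, 1)


if __name__ == "__main__":
    print(within_max({"last_name", "dob", "ssn"}, RESTRICTED, 3))
    print(exclusion_ok({"ssn", "bank_account"}, "ssn", "bank_account"))
```

Under the same definition, restriction could be read as max(0,1): none of the one restricted field may be viewed.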
Example: Exclusion and Blocking
We can construct a sample exclusion policy in AIR.
:EX_RULE02 a air:BeliefRule;
# Matches WHERE { ?x type:B ?y . ?x type:C ?z }
air:label "Exclusion WHERE clause rule.";
air:pattern {
:P s:variable :V.
:W s:TriplePattern :T.
:T log:includes { :X type:B :V } .
:W s:TriplePattern :U.
:U log:includes { :X type:C :C } .
};
air:description ("long_winded_explanation" );
air:assert { :Q air:non-compliant-with :Exclusion };
air:alt [ air:rule :EX_RULE03 ].
:EX_RULE03 a air:BeliefRule;
air:label "Exclusion compliance rule 1.";
air:pattern {
:W s:TriplePattern :T.
:T log:notIncludes { [] type:B [] }
};
air:description ("The query, " :Q ", includes a reference to something of type:C, but not of type:B.");
air:assert { :Q air:compliant-with :Exclusion };
Other policies follow a similar structure.
Finding Structure: Linguistic Constructs
Orthogonally, many policies check the same parts of SPARQL queries.
There are a finite number of places to check.
- WHERE clause in SELECT. Reveals data.
- WHERE clause in CONSTRUCT. Reveals data in a different format.
- WHERE clause in DESCRIBE. Reveals metadata.
- OPTIONAL part of any WHERE clause. Reveals data if it is available.
- FILTER. Indicates potential misuse of a field.
- ORDER BY. Similar to FILTER.
- ASK. Reveals only a yes/no answer, but enough queries can still return interesting results.
- Example: 746,347,500 queries would exhaust the SSN space per my calculations.
- FROM clause. Allows the user to name a graph directly, a potential security hole.
- Variable bindings.
- UNION. This chains queries together.
We saw these earlier. These are structured by the language, and would
change if we used a different query language.
Policy Generation
We can find patterns in the logical aspect of policies.
- A few logical constructs allow us to create interesting policies.
- We can chain rules, and many are repetitive.
We know the limitations of our query language.
- There are only so many patterns we can check.
- Almost all of our policies involve checking the same list of patterns.
We also know something about ourselves...
- We're lazy! Writing policies is time consuming.
Can we automate this process?
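One way to picture such automation is plain template filling, sketched below in Python. The template text is modeled loosely on the WHERE-clause rule shown earlier; the rule naming scheme and the function `restriction_rule` are assumptions, not the output of an existing tool.

```python
# Hypothetical template for a restriction rule on one predicate.
# Doubled braces are literal braces in str.format templates.
WHERE_RULE_TEMPLATE = """:{name}_WHERE a air:BeliefRule;
    air:label "{name} WHERE clause rule";
    air:pattern {{
        :W s:TriplePattern :T.
        :T log:includes {{ :X <{predicate}> :Y }}
    }};
    air:assert {{ :Q air:non-compliant-with :{name}Policy }}."""


def restriction_rule(name, predicate):
    """Fill the template for one restricted predicate."""
    return WHERE_RULE_TEMPLATE.format(name=name, predicate=predicate)


if __name__ == "__main__":
    print(restriction_rule("SSN", "http://xmlns.com/foaf/0.1/ssn"))
```

A fuller generator would emit one such rule per query part (OPTIONAL, FILTER, and so on) and per logical construct.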
Automatic Policy Generation
We propose a graphical interface for automatically composing policies.
(This is a work in progress.)
Reasoning over Query History
- Perhaps the most important application is reasoning over a history of queries.
- We extend policy assurance from single queries to a history of queries.
- State must be maintained outside of AIR as the reasoner cannot store state.
- Internal approach: let the reasoner deal with histories.
- Pass multiple queries to the reasoner.
- Preserves order, but may require annotation.
- External approach: create a resultant query.
- Join previous queries with UNIONs before reasoning.
- Loses some order, but a simpler approach.
- We are only beginning to implement this functionality.
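The trade-off between the two approaches can be sketched in Python. Here each query is reduced to the set of fields it mentions; `violates_blocking` and `merge_history` are hypothetical helpers, and the choice that a single query mentioning both fields does not count as "after" is an assumption.

```python
def violates_blocking(history, a, b):
    """Internal approach: True if a query mentions b after an earlier one mentioned a."""
    seen_a = False
    for fields in history:
        if seen_a and b in fields:
            return True
        if a in fields:
            seen_a = True
    return False


def merge_history(history):
    """External approach: fold all queries into one set, discarding order."""
    merged = set()
    for fields in history:
        merged |= set(fields)
    return merged


if __name__ == "__main__":
    ssn_first = [{"ssn"}, {"license"}]
    license_first = [{"license"}, {"ssn"}]
    print(violates_blocking(ssn_first, "ssn", "license"))
    print(violates_blocking(license_first, "ssn", "license"))
    print(merge_history(ssn_first) == merge_history(license_first))
```

Both histories merge to the same set, so after merging a blocking policy can no longer be told apart from an exclusion policy.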
Policy Assurance in SPARQL: Summary
- We have demonstrated a system capable of making relatively
simple policy checks against SPARQL queries.
- We believe this system is expressive enough to encode more
substantial policies.
- We have found enough structure in our policies to propose an
interface to ease policy generation.