Modelling Data Usage Policies
05 March 2009
Data-Purpose Algebra: Modelling Data Usage Policies
Chris Hanson, Tim Berners-Lee, Lalana Kagal, Gerald Jay Sussman, Daniel Weitzner
Decentralized Information Group, CSAIL, MIT
- Goals: Enable the expression of data-purpose and usage restrictions
- - build systems in which specific uses of personal data are transparent to authorized observers
- - provide effective accountability assessments by those who seek policy compliance
Data Sources
- User behaviour on web sites, purchases on credit cards
- Government and commercial databases
- Environmental sensors
Inferences drawn from such data, collected from a variety of sources, may lead to adverse consequences
Restrictions on data
- Contract: imposed by the sender or the receiver
- Statute
- Custom or common decency
Formulating Restrictions as Algebraic Expressions
- Computed from the history of 'transmission'
- Unary Process
- Terminology:
- Data item, i = I(q, a, k, p)
- Content, q = Q_D(i)
- Agent (producer of the data), a = A_D(i)
- Category (set of data items containing the data item), k = K_D(i)
- Purpose, p = P_D(i)
- Binary Process
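The terminology above can be sketched in Python to make the structure concrete. This is a minimal illustration, not the paper's implementation: the accessors Q_D, A_D, K_D and P_D stand for the content, agent, category and purpose functions in the notes, the purpose is assumed to be a set of labels, and the two process functions (with their keep-allowed and intersection rules) are hypothetical examples of how a unary or binary process might compute the purposes of its output.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DataItem:
    """A data item i = I(q, a, k, p), following the terminology above."""
    content: str               # q
    agent: str                 # a, the producer of the data
    category: str              # k
    purposes: FrozenSet[str]   # p (assumed here to be a set of purpose labels)


def I(q, a, k, p):
    """Constructor mirroring i = I(q, a, k, p)."""
    return DataItem(q, a, k, frozenset(p))


# Accessors mirroring the content, agent, category and purpose functions.
def Q_D(i): return i.content
def A_D(i): return i.agent
def K_D(i): return i.category
def P_D(i): return i.purposes


def unary_process(i, agent, allowed):
    """Hypothetical unary process: one input item, one output item.

    Assumption: the output keeps only the purposes the process allows.
    """
    return I(Q_D(i), agent, K_D(i), P_D(i) & frozenset(allowed))


def binary_process(i1, i2, agent, category):
    """Hypothetical binary process: combines two items into one.

    Assumption: the result may be used only for purposes common to both inputs.
    """
    return I(Q_D(i1) + "|" + Q_D(i2), agent, category, P_D(i1) & P_D(i2))


# Illustrative data only.
purchase = I("card purchase, $40", "bank", "financial", {"billing", "fraud-detection"})
visit = I("viewed /shoes", "shop", "web-behaviour", {"billing", "marketing"})
joined = binary_process(purchase, visit, "analyst", "profile")
print(sorted(P_D(joined)))  # ['billing']
```

Because each output item records its own agent and purposes, chaining such processes gives the purposes of a derived item as a function of its transmission history.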
An example formalization: Privacy Act
- Goal: Formalize rules for data passed between Systems of Records (SORs)
- Let r be an SOR repository. It has an associated SORN (System of Records Notice) n, which specifies
the following input conditions:
- Allowed Sources, O_S(n)
- Data Categories, K_S(n)
- Purposes, P_S(n)
- Routine Uses, U(n)
- Each routine use u ∈ U(n) specifies O_R(u), K_R(u) and P_R(u).
- Transfer of a data item i from SOR s to SOR r depends on:
- Purposes that came with the data
- Input conditions on r
- How it's supposed to work:
- Log transaction data in XML (or other structured format)
- XML -> RDF
- Annotate the transactional data with 'agent', 'category' and 'purpose'
- Encode the derivation steps in PML (Proof Markup Language)
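The annotation and transfer-check steps can be sketched as follows. The notes do not state the exact admission rule, so the rule used here is an assumption: a transfer is allowed when the sending SOR is an allowed source, the item's category is an accepted data category, and at least one purpose that came with the data is among the purposes the receiving SORN accepts. The class name `SORN`, the helper names, the record fields and the sample values are all hypothetical, and the list of text steps is only a stand-in for a PML derivation.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SORN:
    """Input conditions that a notice n places on its SOR."""
    allowed_sources: FrozenSet[str]   # O_S(n)
    data_categories: FrozenSet[str]   # K_S(n)
    purposes: FrozenSet[str]          # P_S(n)


def annotate(record, agent, category, purposes):
    """Annotate a logged transaction with 'agent', 'category' and 'purpose'."""
    return {**record, "agent": agent, "category": category,
            "purpose": frozenset(purposes)}


def check_transfer(item, sender, n):
    """Check a transfer of `item` from SOR `sender` into the SOR governed by n.

    Returns (allowed, steps). `steps` is a plain-text stand-in for the
    derivation that the notes propose encoding in PML.
    Assumed rule: every input condition must hold, and at least one purpose
    that came with the data must be among the purposes n accepts.
    """
    shared = item["purpose"] & n.purposes
    checks = [
        ("source allowed", sender in n.allowed_sources),
        ("category allowed", item["category"] in n.data_categories),
        ("shared purpose exists", bool(shared)),
    ]
    steps = [f"{name}: {'ok' if passed else 'FAILED'}" for name, passed in checks]
    return all(passed for _, passed in checks), steps


# Hypothetical log record and notice, for illustration only.
record = {"id": "tx-1", "content": "passenger name record"}
item = annotate(record, "airline", "travel", {"border-screening", "billing"})
notice = SORN(frozenset({"sor-s"}), frozenset({"travel"}),
              frozenset({"border-screening"}))

allowed, steps = check_transfer(item, "sor-s", notice)
print(allowed)   # True
print(steps)
```

Returning the individual checks alongside the verdict keeps the decision auditable, which is the role the derivation steps play in the outline above.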
Bridging the Gap Between Theory and Practice
- Annotation: the 'purpose' may not be clear at the outset, and complex interactions may obscure the original purpose
- Scalability: log aggregation from different nodes
- Can the same principles be applied in a much narrower domain, such as content reuse?
A LOGIC FOR AUDITING ACCOUNTABILITY IN DECENTRALIZED SYSTEMS*
R. Corin¹, S. Etalle¹,², J. den Hartog¹, G. Lenzini¹ and I. Staicu¹
¹ Department of Computer Science, University of Twente, The Netherlands
² CWI, Center for Mathematics and Computer Science, Amsterdam, The Netherlands