Transparent Accountable Data Mining Initiative (TAMI)

The TAMI Project is creating technical, legal, and policy foundations for transparency and accountability in large-scale aggregation and inferencing across heterogeneous information systems. We are outlining an information architecture for the Web that can provide transparent access to the reasoning steps taken in the course of data mining, and accountability for the use of personal information as measured by compliance with rules governing data usage.

February 2008

TAMI/e2esa face-to-face [agenda][minutes]

January 2008

Meeting with iARPA. Presentations: Danny's, Lalana's, and Jim's

Submitted paper to IEEE Policy 2008 (pdf)

December 2007

Tabulator extension release includes justification UI

Download and install extension

Example justification

Began work on Reciprocal Privacy for Social Networks

November 2007

Scenario 9: MA Disability Discrimination

Detailed workthrough of scenario 9

October 2007

Scenario 0: MIT Prox Card violation

September 2007

6.898 Fall course on Accountability architectures for WWW started

Draft specification of AIR (Accountability in RDF); AIR ontology

August 2007

Decided to move to a more AMORD-like language with dependency tracking

July 2007

First draft of Rei+ ontology

Started work on privacy policy language (Rei+)

TAMI Architecture (pdf)

March 2007

Developing Policy Aware Provenance design

February 2007

Beginning work on scenarios 8, 9, 10

January 2007

TAMI/e2eSA FTF meeting

Continuing work on Scenario 6

November 2006

Scenario 4 is starting to take shape, as is the Scheme code that implements one of the reasoning engines.

October 2006

We've begun work on a Data Purpose Algebra and are continuing to work on expressing Scenario 4.

September 2006

Work continues on cwm/n3 and TMS-based reasoners, as well as Scenario 4 and the user interface of the project.

June 2006

We participated in these events:

May 2006

Integrating Cwm with Inference Web

April 2006

We've decided to produce more complex scenarios in order to test/modify our design:

March 2006

We presented a paper at the AAAI Spring Symposium "The Semantic Web Meets E-Government."

We've started producing code:

February 2006

We've agreed that we have enough common understanding to work through a scenario from end to end.

Fall Term 2005 - Preliminary Work

One significant challenge was the range of backgrounds necessitated by the project. In order to ensure a common knowledge base, we provided each other with overviews of relevant topics:

Produced a hypothetical use case

We created a fictional scenario that addresses some of the common public concerns. It involves an airline passenger who is a potential match in the testing of the Transportation Security Administration's Secure Flight program (formerly known as CAPPS); his identity is passed to the FBI's Joint Terrorism Task Force and, ultimately, he is arrested on an outstanding warrant for unpaid child support.

The scenario will allow us to test our ability to build a system that can proofcheck the answers to two important data mining questions:

The scenario was built specifically to require application of rules with three increasing levels of complexity:

The pieces of our planned system

XML

Based upon current government efforts, we presume that the historical log of data collection, analysis, and transfer, as well as case activities, will exist in XML. Using our hypothetical, we created

Note 1: Where possible, we used the National Information Exchange Model, the joint Department of Justice and Department of Homeland Security XML for law enforcement.

Note 2: The two versions do not contain identical information. The "cleansed" view contains more of the required information.

XSLT

Next Steps:

RDF

We expect that the transactional data will be processed as RDF. A volunteer has produced an RDF version of the transaction data.

Chris Hanson has generated a skeletal RDF Schema for SORN documents and has used that vocabulary to create an example SORN. This uses an updated RDF/XML version of the above transaction data. K is drawing an RDF graph to demonstrate the SORN Schema.

N3

We are operating on the assumption that the rules should be expressed in N3. This required us to build quite a bit of common understanding about how to convert law to rules to N3. This appears to be an iterative process. So far, we have:

Note: These rules will not answer either of our two goal questions, but this was an important first step in determining that we could convert any law into N3.

Note: This will create an opportunity to test a proof for our first goal - was an agency allowed to collect/acquire the data?
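As a rough illustration of what such a proof might involve, the sketch below checks a single collection event against a toy rule and returns the facts that support the answer. It is written in Python rather than N3 and is not the project's rule set: the `ex:` vocabulary, the example facts, and the rule itself (an agency may collect a category of data if a SORN it issued covers that category) are all assumptions made for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject/predicate/object statement, in the spirit of an RDF triple."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts: one SORN issued by an agency, and two collection events.
FACTS = {
    Triple("ex:SORN-1", "ex:issuedBy", "ex:AgencyA"),
    Triple("ex:SORN-1", "ex:coversCategory", "ex:PassengerName"),
    Triple("ex:Event-1", "ex:collectedBy", "ex:AgencyA"),
    Triple("ex:Event-1", "ex:dataCategory", "ex:PassengerName"),
    Triple("ex:Event-2", "ex:collectedBy", "ex:AgencyA"),
    Triple("ex:Event-2", "ex:dataCategory", "ex:MedicalRecord"),
}


def objects(facts, subject, predicate):
    """All objects o such that (subject, predicate, o) is a known fact."""
    return sorted(t.obj for t in facts
                  if t.subject == subject and t.predicate == predicate)


def subjects(facts, predicate, obj):
    """All subjects s such that (s, predicate, obj) is a known fact."""
    return sorted(t.subject for t in facts
                  if t.predicate == predicate and t.obj == obj)


def check_collection(facts, event):
    """Apply the toy rule to one event.

    Returns (allowed, justification), where justification lists the
    facts that were used to conclude the collection was allowed.
    """
    for agency in objects(facts, event, "ex:collectedBy"):
        for category in objects(facts, event, "ex:dataCategory"):
            for sorn in subjects(facts, "ex:issuedBy", agency):
                covers = Triple(sorn, "ex:coversCategory", category)
                if covers in facts:
                    justification = [
                        Triple(event, "ex:collectedBy", agency),
                        Triple(event, "ex:dataCategory", category),
                        Triple(sorn, "ex:issuedBy", agency),
                        covers,
                    ]
                    return True, justification
    return False, []


if __name__ == "__main__":
    for event in ("ex:Event-1", "ex:Event-2"):
        allowed, why = check_collection(FACTS, event)
        print(event, "allowed" if allowed else "not shown to be allowed")
        for fact in why:
            print("   because", fact.subject, fact.predicate, fact.obj)
```

Returning the supporting facts alongside the yes/no answer is the point of the sketch: the answer can then be inspected and checked rather than taken on trust. A failed check here means only that no supporting SORN was found among the known facts, not that the collection was prohibited.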

Next Steps:

CWM

We have had quite a bit of discussion about which logic system is appropriate for this sort of project.

Because of our Semantic Web interests, we are focused on CWM. We have:

Next Steps:

Truth Maintenance Systems

We are contemplating using a TMS as the storage mechanism for our proofs and/or as a deductive reasoner. We have:

Next Steps:

Inference Web/Proof Mark-up Language

We intend to register the question-answering systems used in TAMI with Inference Web and have those systems generate PML. We will then use Inference Web to view and manipulate explanations.

Next Steps:
