TAMI Technical Challenges

Technical Challenges

How do we identify the relevant policies to reason over?
- We can provide a list of policies that we want to check for compliance
- We can allow the reasoner to find "relevant" policies from a policy registry. Relevant policies are identified because their patterns match the transactions under consideration
- It should be possible to override the expected policies with different ones at run-time (e.g., to risk model a possible court decision or proposed change in law or contract)
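The three options above (an explicit list, pattern-based discovery, and a run-time override) can be sketched as follows. This is only an illustration under stated assumptions: the `Policy` class, the `select_policies` function, and the idea that a policy "pattern" is a predicate over a transaction record are all hypothetical and are not taken from TAMI.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Policy:
    name: str
    # Hypothetical "pattern": a predicate that says whether the policy
    # is relevant to a given transaction record.
    matches: Callable[[dict], bool]


def select_policies(
    transactions: List[dict],
    registry: Dict[str, Policy],
    explicit: Optional[Iterable[str]] = None,
    overrides: Optional[Dict[str, Policy]] = None,
) -> List[Policy]:
    """Choose the policies to reason over for a set of transactions."""
    if explicit is not None:
        # Option 1: the caller names the policies to check.
        chosen = [registry[name] for name in explicit]
    else:
        # Option 2: keep every registered policy whose pattern matches
        # at least one transaction under consideration.
        chosen = [
            p for p in registry.values()
            if any(p.matches(t) for t in transactions)
        ]
    # Option 3: swap in replacement policies at run-time, e.g. to model
    # a possible court decision or a proposed change in law.
    overrides = overrides or {}
    return [overrides.get(p.name, p) for p in chosen]
```

Keeping the override as a separate mapping means the registry itself is never modified by a what-if run.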
How do we identify relevant entries from a transaction log?
- Assuming that we have a centralized log
- Assuming that we have a distributed log
How should we handle changes in the text of law/policy?
- How will we know when the change occurs?
- How will we know which version of the rule to run?
  - For audit, recognize the date of the transaction and the effective date of the rule?
  - For current accountability, always use the current effective version (watch out for enacted but not yet effective)?
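One way to read the two cases above is that both pick the latest version whose effective date is on or before a reference date; only the reference date differs (the transaction date for audit, today's date for current accountability). The sketch below assumes each rule version carries separate enacted and effective dates. The names `RuleVersion` and `version_in_force` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RuleVersion:
    rule_id: str
    enacted: date    # when the text was adopted
    effective: date  # when the text starts to apply
    text: str


def version_in_force(versions: List[RuleVersion], on: date) -> Optional[RuleVersion]:
    """Return the latest version whose effective date is on or before `on`.

    Versions that are enacted but not yet effective on that date are
    skipped. Returns None if no version was in force yet.
    """
    candidates = [v for v in versions if v.effective <= on]
    return max(candidates, key=lambda v: v.effective, default=None)


def version_for_audit(versions: List[RuleVersion], transaction_date: date) -> Optional[RuleVersion]:
    # Audit: judge the transaction by the rule in force when it happened.
    return version_in_force(versions, transaction_date)


def version_for_current(versions: List[RuleVersion], today: date) -> Optional[RuleVersion]:
    # Current accountability: use whatever is in force today.
    return version_in_force(versions, today)
```

This does not answer how the system learns that a change has occurred; it only covers selection once the versions and their dates are known.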
Can we come up with an example in which a combination of transactions causes a violation?
Relevant literature on modeling legal reasoning in logic
The user interface should allow the user to ask queries such as:
- How many violations are there in a given set of transaction entries?
- How many violations are due to policy X?
- Which policies does a set of transactions violate?
- What-if analysis: If I change rule A, how many more/fewer violations would have occurred in a given set of transactions?
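The queries listed above could all be answered from one list of (policy, transaction) violation pairs. The sketch below assumes, purely for illustration, that a rule is a function returning True when a transaction violates it. The real TAMI reasoner works over policies expressed in a rule language, so the `Rule` type and the function names here are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Set, Tuple

# Assumption: a rule returns True when the transaction violates it.
Rule = Callable[[dict], bool]


def violations(rules: Dict[str, Rule], transactions: List[dict]) -> List[Tuple[str, int]]:
    """List (policy name, transaction index) pairs for every violation."""
    return [
        (name, i)
        for i, tx in enumerate(transactions)
        for name, rule in rules.items()
        if rule(tx)
    ]


def count_by_policy(rules: Dict[str, Rule], transactions: List[dict]) -> Counter:
    """How many violations are due to each policy?"""
    return Counter(name for name, _ in violations(rules, transactions))


def policies_violated(rules: Dict[str, Rule], transactions: List[dict]) -> Set[str]:
    """Which policies does this set of transactions violate?"""
    return {name for name, _ in violations(rules, transactions)}


def what_if(rules: Dict[str, Rule], changed: Dict[str, Rule], transactions: List[dict]) -> int:
    """Change in total violations if the rules in `changed` replace the originals.

    Positive means more violations, negative means fewer.
    """
    before = len(violations(rules, transactions))
    after = len(violations({**rules, **changed}, transactions))
    return after - before
```

Because each rule here looks at one transaction at a time, this sketch cannot detect a violation caused by a combination of transactions.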
What problems do we uncover when we handle many transactions instead of one?
- Can we create a proof of concept with 50 rules and 1000 transactions, allow anyone to change a piece of a rule or a transaction, and show that TAMI consistently produces the correct compliant/non-compliant result?

Scenario 4

Can we determine if the inference a party draws from data (or data mining) is reasonable?
Can we use data purpose algebra to determine if the nth sequential recipient of data is using it within the allowed purposes, as narrowed by each successive transfer?
How can we reason about whether the nth sequential party to receive data is using it in a manner "compatible with" the reason for which it was collected? (Example: a law enforcement agency saved a letter alleging illegal drug activity of one of its employees and later used the letter for a criminal investigation of the letter's author. In our current system, both uses would be for criminal law enforcement, but in the real case the second use was determined to be impermissible.)
How can we store and reuse input that addresses a can't-decide-compliance result (e.g., that for TSA, the purpose "secure planes" includes "counterterrorism")?
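As a rough illustration of two of the questions in this scenario, the sketch below treats allowed purposes as a set that each transfer can only narrow, and stores user-supplied clarifications so that a later can't-decide result can be resolved without asking again. This is a simplification and not the data purpose algebra itself; the function and class names are hypothetical, and the purpose labels other than the TSA example are invented.

```python
from typing import Dict, Iterable, List, Set


def allowed_after_chain(initial: Iterable[str], hops: List[Iterable[str]]) -> Set[str]:
    """Narrow the allowed purposes across a sequence of transfers.

    Each hop lists the purposes that transfer permits; the result is the
    intersection, so purposes can be removed along the chain but never added.
    """
    allowed = set(initial)
    for hop_purposes in hops:
        allowed &= set(hop_purposes)
    return allowed


class PurposeClarifications:
    """Stored answers of the form 'purpose A includes purpose B'."""

    def __init__(self) -> None:
        self._includes: Dict[str, Set[str]] = {}

    def record(self, broader: str, narrower: str) -> None:
        self._includes.setdefault(broader, set()).add(narrower)

    def expand(self, purposes: Iterable[str]) -> Set[str]:
        """Add every purpose known to be included in the given ones."""
        result = set(purposes)
        frontier = list(result)
        while frontier:
            purpose = frontier.pop()
            for narrower in self._includes.get(purpose, ()):
                if narrower not in result:
                    result.add(narrower)
                    frontier.append(narrower)
        return result


def check_use(use: str, allowed: Iterable[str], clarifications: PurposeClarifications) -> str:
    """Return 'compliant' if the use is covered, else 'cannot-decide'."""
    return "compliant" if use in clarifications.expand(allowed) else "cannot-decide"
```

A set intersection cannot distinguish the two uses in the letter example, since both fall under the same purpose label; that "compatible with" question remains open.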

Scenario 5

Among rules created by case law, how do we decide which rule applies or has priority?
- Recognize the geo-location of the person using the data and match it to the proper courts (Example: the system recognizes that the person using the data is geo-located in California, so 9th Circuit rules apply.)
- Recognize the hierarchical structure of courts and apply lower court rules only where higher court rules don't exist. (Note: the ability to decide between branches of rules will likely come up in other contexts)
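The two steps above (map a location to its courts, then prefer the highest court that has a rule) might be sketched like this. The data layout is an assumption: `court_chain` lists courts from highest to lowest for each location, and `rules` is keyed by (court, issue). The issue names and rule identifiers are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple


def applicable_rule(
    location: str,
    issue: str,
    court_chain: Dict[str, List[str]],
    rules: Dict[Tuple[str, str], str],
) -> Optional[Tuple[str, str]]:
    """Return (court, rule) from the highest court that has a rule on the issue.

    Lower-court rules are used only when no higher court in the chain has
    a rule for that issue. Returns None if no court in the chain has one,
    or if the location is unknown.
    """
    for court in court_chain.get(location, []):
        rule = rules.get((court, issue))
        if rule is not None:
            return court, rule
    return None


# Hypothetical data: California falls under the 9th Circuit.
court_chain = {"CA": ["US Supreme Court", "9th Circuit"]}
rules = {
    ("9th Circuit", "email_search"): "rule-9c-email",
    ("US Supreme Court", "wiretap"): "rule-sc-wiretap",
    ("9th Circuit", "wiretap"): "rule-9c-wiretap",
}
```

The same first-match walk over an ordered chain could serve wherever the system has to choose between branches of rules.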
Do we want to extend into reasoning based upon weighting of different rules or variables?

Scenario 6

What special challenges are there when the party is data mining the internet?

Scenario 9

How can we address conflict-of-law, or seeming conflict-of-law, situations?
- Preemption: How will we handle rules which state that one rule should be used instead of another if a particular set of circumstances exists?
  - In this scenario, the required determination is to find which rule is most restrictive.
- Exceptions: Have we handled enough different types of exceptions to be confident that our method of handling is sufficient?
  - In this scenario, there are automatic exemptions to preemption and "manual" exemptions (ones that must be requested and approved before they can be triggered)
- Can we use data purpose algebra to produce the scope information needed to compare the restrictiveness of two rules?
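If the scope of a rule can be reduced to the set of purposes it permits, restrictiveness becomes a subset comparison. The sketch below makes exactly that assumption, which may be too coarse for real preemption clauses; `compare_restrictiveness` is a hypothetical name. It also shows that two rules can be incomparable, in which case "most restrictive" has no answer without further input.

```python
from typing import Set


def compare_restrictiveness(first: Set[str], second: Set[str]) -> str:
    """Compare two rules by the set of purposes each one permits.

    A rule is more restrictive when it permits a proper subset of what
    the other permits. Returns 'first', 'second', 'equal', or
    'incomparable' (neither set contains the other).
    """
    if first == second:
        return "equal"
    if first < second:   # proper subset: first permits strictly less
        return "first"
    if second < first:
        return "second"
    return "incomparable"
```

Exemptions to preemption, whether automatic or manually approved, would have to be applied before this comparison is consulted.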
How can we address situations in which the category of data changes over time?
- How can we recognize change in category caused by aggregation (e.g., anonymized data is aggregated with other data and the persons can now be identified ... which should trigger rules for handling Personally Identifiable Information)?
- How can we recognize change in category caused by an event external to the system (e.g., a foreign national becomes a legal permanent resident ... which should trigger rules with additional protections for that person's information)?
- How can we recognize change in category caused by the authorized purpose of the recipient (e.g., personal telecom records become health information when CDC collects them to investigate spread of TB or financial records become criminal case investigation records when collected by law enforcement during a case)?
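One possible approach to the three kinds of change above is to compute a record's categories at reasoning time from its stored base categories plus current context, instead of fixing the category once at collection. The sketch below is only that idea in miniature. Every label and parameter name is hypothetical, and the hard part (detecting that data has become re-identifiable, or learning of an external event) is assumed to happen elsewhere.

```python
from typing import Set


def effective_categories(
    base: Set[str],
    *,
    reidentifiable: bool = False,
    subject_status: str = "",
    recipient_purpose: str = "",
) -> Set[str]:
    """Derive the categories that should drive rule selection right now."""
    categories = set(base)
    # Aggregation: anonymized data that can now be tied to persons
    # should trigger the rules for personally identifiable information.
    if reidentifiable:
        categories.discard("anonymized")
        categories.add("pii")
    # External event: a change in the data subject's status.
    if subject_status == "legal_permanent_resident":
        categories.add("additional_protections")
    # Recipient purpose: the same records take on a new category
    # depending on why the recipient is authorized to collect them.
    if recipient_purpose == "public_health_investigation":
        categories.add("health_information")
    elif recipient_purpose == "criminal_investigation":
        categories.add("criminal_case_record")
    return categories
```

Because the input set is copied, the stored base categories are left unchanged and the derivation can be repeated whenever the context changes.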
How/when should we be able to reasonably infer leakage?
- How could we recognize that a person (or a person's organization) accessed information and then, afterwards, that person apparently used that same information elsewhere (without a transfer of the digital data)?
- How would we recognize screen-scraping?

Scenario 11

In a distributed environment, how should we obtain value(s) we need to reason over, if they are not in the same system or under the same control? (In this example, the FBI system can't decide whether to release/transfer information until it knows what the Florida system will do. It needs to be able to ask the Florida system what rules it is going to follow. This problem is not limited to being able to access other parties' rules. In other examples, it may be that in order to know which rule to fire, TAMI will need to know how another system categorizes the data.)
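The dependency described above can be made explicit by giving the local reasoner an interface for asking the other system which rules it will follow, and by returning a can't-decide result when no answer is available. In the sketch below, `RemoteSystem`, `StubRemote`, and `decide_release` are hypothetical names, and the in-memory stub stands in for whatever network protocol the two systems would actually use.

```python
from typing import Dict, Optional, Protocol, Set


class RemoteSystem(Protocol):
    def declared_rules(self, data_category: str) -> Optional[Set[str]]:
        """Rules the remote system says it will follow, or None if unknown."""
        ...


class StubRemote:
    """In-memory stand-in for the receiving system (e.g. the Florida system)."""

    def __init__(self, rules_by_category: Dict[str, Set[str]]) -> None:
        self._rules = rules_by_category

    def declared_rules(self, data_category: str) -> Optional[Set[str]]:
        return self._rules.get(data_category)


def decide_release(data_category: str, required_rules: Set[str], remote: RemoteSystem) -> str:
    """Decide whether to transfer data, based on the recipient's declared rules.

    Returns 'release' if the recipient declares every required rule,
    'withhold' if it declares some rules but not all required ones, and
    'cannot-decide' if the recipient gave no answer for this category.
    """
    declared = remote.declared_rules(data_category)
    if declared is None:
        return "cannot-decide"
    return "release" if required_rules <= declared else "withhold"
```

The same interface shape could carry other remote values, such as how the other system categorizes the data, by adding further methods.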

$Revision: 24872 $ of
$Date: 2008-08-18
$Author: lkagal & kkw$