\chapter{Performance}

In any practical system, performance is a concern. This project focused on providing a functional implementation of a policy assurance system, and achieved that goal. In terms of performance, the only optimization we performed was to ensure that the query history grows linearly with the number of queries, and that policies grow linearly with the number of items that they check.

It is relatively easy to write policies that grow faster than would be feasible to implement. For example, one of the primitives that we originally sought to implement was a $MAX(m,n)$ rule, meaning, ``given $n$ fields, the user may only see up to $m$ of them.'' This policy is trivial in the case of $MAX(n-1,n)$, in that the only way to trigger it is to access all $n$ fields. However, the current AIR language would require that for a general implementation of $MAX(m,n)$, we check every possible combination. This is due to the fact that AIR is a production rule system and does not maintain any state. Thus, a straightforward rule like $MAX(m,n)$ grows exponentially with $(m-n)$.

With the help of Yotam Aron, we ran a series of tests using the history policy and the AIR reasoner. The time values listed are the times given by the AIR reasoner itself; there may be more accurate ways of measuring run time. We have found that the way we write queries will influence the running time of our checks. The AIR reasoner abhors complexity, so any attempt to limit the patterns that it checks will help it to run faster. The reasoner runs faster with a smaller history file. We can tailor our policies to shrink the size of air:pattern fields at the expense of creating more rules. 

Yotam ran one test suite running Ubuntu on VMWare. Resources allocated: 512 MB memory, 
12GB hard disk space. The results are in table \ref{no-opt}.

\begin{table}
\begin{tabular}{|c|c|c|c|}
\hline
Items in Query History & Reasoning & After Loading & Actual Reasoning \\
\hline
0 & 1.43096 & 1.094741 & 1.0751231\\
1 & 1.52762 & 1.32497 & 1.307404\\
2 & 1.7949221 & 1.601938 & 1.584473133\\ 
3 & 2.3070598 & 2.0158172 & 1.99806\\ 
4 & 2.728109121 & 2.513138056 & 2.495877981\\ 
5 & 3.490919828 & 3.255658865 & 3.238258839\\ 
6 & 4.528033018 & 4.235877991 & 4.218565941\\ 
7 & 5.564540863 & 5.312869072 & 5.295244932\\ 
8 & 7.044108152 & 6.777283192 & 6.759871006\\ 
9 & 8.317480087 & 8.051841974 & 8.028425932\\ 
10 & 10.18744802 & 9.915023088 & 9.896337986\\ 
11 & 12.22676992 & 11.94390798 & 11.925632\\ 
12 & 15.13193297 & 14.83994699 & 14.82177997\\ 
13 & 17.82276082 & 17.52010894 & 17.502671\\ 
14 & 21.40320492 & 21.10541201 & 21.087852\\ 
15 & 24.56065702 & 24.24677181 & 24.22919488\\ 
16 & 29.64700389 & 29.2435019 & 29.17709184\\ 
17 & 33.9228971 & 33.52368808 & 33.4938252\\ 
18 & 44.79428411 & 44.41020393 & 44.35875702\\ 
19 & 50.51201296 & 50.00673199 & 49.98922586\\ 
\hline
\end{tabular}
\caption{Running time, in seconds, of unoptimized queries.}
\label{no-opt}
\end{table}

\begin{figure}
\centering
\includegraphics{no-opt}
\caption{Unoptimized policy run time from table \ref{no-opt}.}
\label{no-opt-plot}
\end{figure}

Yotam then ran one additional test suite running Ubuntu on VMWare. Resources allocated: 768 MB memory, 
12GB hard disk space. Despite the difference in memory allocation, this shows a good trend, with better scaling for a large history. The results are in table \ref{with-opt}.

\begin{table}
\begin{tabular}{|c|c|c|c|}
\hline
Items in Query History & Reasoning & After Loading & Actual Reasoning\\
\hline
0 & 1.548950911 & 1.343198061 & 1.324027061\\
1 & 1.335716963 & 1.097095966 & 1.078705072\\
2 & 1.426616192 & 1.18595314 & 1.159878016\\
3 & 1.465291023 & 1.198924065 & 1.178475857\\
4 & 1.604300976 & 1.294827938 & 1.275640965\\
5 & 1.621567011 & 1.348763943 & 1.330348969\\
10 & 2.188154936 & 1.863446951 & 1.843798876\\
15 & 2.896314144 & 2.549362183 & 2.529920101\\
20 & 3.677877903 & 3.288823843 & 3.269448996\\
25 & 4.998191118 & 4.573461056 & 4.555090189\\
30 & 6.7367239 & 6.265938997 & 6.247066021\\
40 & 12.97512007 & 12.29570103 & 12.26716304\\
50 & 21.24516296 & 20.6023891 & 20.57518101\\
60 & 31.52937508 & 30.80437517 & 30.78589511\\
70 & 47.03233194 & 46.23697996 & 46.211133\\
80 & 63.81845903 & 62.74732184 & 62.72839189\\
90 & 93.3954649 & 92.36639786 & 92.34837484\\
100 & 117.659286 & 116.265708 & 116.2480021\\
200 & 864.6837819 & 862.648648 & 862.6293769\\
\hline
\end{tabular}
\caption{Running time, in seconds, of optimized queries.}
\label{with-opt}
\end{table}

\begin{figure}
\centering
\includegraphics{with-opt}
\caption{Unoptimized policy run time from table \ref{with-opt}. Note log scale on the X axis.}
\label{with-opt-plot}
\end{figure}

We can clearly see exponential growth toward the right hand side of these figures. At present, this is the greatest impediment to scaling. Future changes or optimizations to the reasoner could help this run faster.

In practice, our finds show that it would be difficult for a policy to perform any matching against more than about 100 terms at a time. Any policy that does not support query history will likely begin to have difficulty with a perticularly complex query that accesses on the order of 100 different variables. A policy that supports a query history will have difficulty if there are on the order of 100 queries in said history. A more robust machine might bump these figures by an order of magnitude, but these seem to be the limitations of our system with hardware available today. Thus, it is likely that an enterprise environment looking to implement this software will hit its performance limitations relatively quickly, without additional, out-of-system means in place to mitigate its limitations.