Started proofs for BI --> TI reduction.

master
Aaron Huber 2020-09-23 17:20:36 -04:00
parent c70afe6b18
commit 05915da3ae
2 changed files with 79 additions and 5 deletions

@ -614,16 +614,74 @@ For any $\poly$ then, it is true that all coefficients in $\abs{\etree}(1,\ldots
\subsubsection{Known Reduction Result $\bi \mapsto \ti$}
It is well known that an arbitrary $\bi$ can be reduced to a query $\poly$ over a $\ti$. For completeness, let us describe the reduction.
The reduction consists of a query $\poly$ over a constructed $\ti$. To construct the $\ti$ given an arbitrary $\bi$ %with $|b|$ blocks ($b = \{b_1,\ldots, b_\ell\}$, the set of blocks in the $\bi$) and an upper bound of $r$ alternatives per block,
\begin{Theorem}\label{theorem:bi-red-ti}
For any $\bi$, there exists a query $\poly$ and $\ti$ such that $\poly(\ti)$ outputs the possible worlds of $\bi$ according to their respective probabilities.
\end{Theorem}
\begin{Definition}[BI-reduction]\label{def:bi-red-ti}
Given an arbitrary $\bi$, there exists a query $\poly$ and $\ti$ such that $\poly\left(\ti\right)$ produces exactly the worlds of the $\bi$ according to their probabilities.
\end{Definition}
\begin{Definition}[Query $\poly$]\label{def:bi-red-ti-q}
Given a total ordering across the tuple alternatives in block $b$ of $\bi$, $\poly$ is constructed to map all possible worlds of $\ti$ for which $a_i$ is the greatest according to the ordering, to the disjoint $\bi$ world that $a_i$ appears in.
\end{Definition}
The reduction consists of the construction of a query $\poly$ and $\ti$ such that $\poly$ is computed over $\ti$, as in $\poly(\ti)$. To construct the $\ti$ given an arbitrary $\bi$ %with $|b|$ blocks ($b = \{b_1,\ldots, b_\ell\}$, the set of blocks in the $\bi$) and an upper bound of $r$ alternatives per block,
a tuple alternative $a_i$ in given block $b$ is transcribed to a tuple in the $\ti$ with probability
\begin{equation}
x_{b, i} = \begin{cases}
\frac{P(a_i)}{\prod_{j = 1}^{i - 1}(1 - P(x_j))} &\textbf{if }i > 1\\
P(a_i) &\textbf{if } i = 1.
\end{cases}\label{eq:bi-red-ti-func}
\end{equation}
The above mapping is applied across all tuples in the input $\bi$.
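For illustration, consider a hypothetical block $b$ with three alternatives and assumed probabilities $P(a_1) = 0.2$, $P(a_2) = 0.3$, and $P(a_3) = 0.4$ (these values are chosen only for this example and do not come from any particular $\bi$). Applying~\cref{eq:bi-red-ti-func} gives
\begin{align*}
x_{b, 1} &= P(a_1) = 0.2,\\
x_{b, 2} &= \frac{0.3}{1 - 0.2} = 0.375,\\
x_{b, 3} &= \frac{0.4}{(1 - 0.2)(1 - 0.375)} = \frac{0.4}{0.5} = 0.8.
\end{align*}
Under these values, the probability that $x_{b, 2}$ is the first tuple present is $(1 - 0.2) \cdot 0.375 = 0.3 = P(a_2)$, and the probability that $x_{b, 3}$ is the first tuple present is $(1 - 0.2)(1 - 0.375) \cdot 0.8 = 0.4 = P(a_3)$. The probability that no tuple is present is $(1 - 0.2)(1 - 0.375)(1 - 0.8) = 0.1 = 1 - (0.2 + 0.3 + 0.4)$, which matches the probability of the empty world of the block.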
This method for computing the probabilities of the tuples in the constructed $\ti$ allows for the following. For a block $b$, the powerset of possible worlds in the $\ti$ is mapped in such a way that the first tuple appearing in a possible world of the $\ti$ has that world mapped to its corresponding $\bi$ world. The sum of the probabilities of all such $\ti$ worlds mapped to a given tuple $x_{b, i}$ equals the probability of the tuple in the original $\bi$. This leaves us with the task of constructing a query $\poly$ over $\ti$ such that each world of the $\ti$ is mapped to the disjoint $\bi$ world of the first tuple appearing in the $\ti$ world.
This method for computing the probabilities of the tuples in the constructed $\ti$ allows for the following. Assuming a total ordering across the $\bi$ tuples, for a block $b$, the powerset of possible worlds in the $\ti$ is mapped in such a way that the first ordered tuple appearing in a possible world of the $\ti$ has that world mapped to its corresponding $\bi$ world.
\begin{Lemma}\label{lem:bi-red-ti-prob}
The sum of the probabilities of all $\ti$ worlds mapped to a given tuple $x_{b, i}$ equals the probability of the tuple $a_i$ in the original $\bi$.
\end{Lemma}
\begin{proof}[Proof of Lemma ~\ref{lem:bi-red-ti-prob}]
The proof is by induction on the number of alternatives in the block. Given a tuple $a_i$ in block $b$ of $\bi$ with $1 \leq i \leq \abs{b}$ (where $\abs{b}$ denotes the number of alternative tuples in block $b$), by~\cref{eq:bi-red-ti-func} $P(x_i = 1) = \frac{P(a_i = 1)}{\prod_{j = 1}^{i - 1} (1 - P(x_j))}$, where $x_i$ denotes the transcription of $a_i$ into the $\ti$ and the empty product for $i = 1$ equals $1$.
For the base case, we have $i = 1$, which implies that $\abs{b} = 1$, the set $b = \{a_1\}$, and the powerset $2^b = \{\emptyset, \{x_1\}\} = \wSet_{\ti}$, i.e., the set of all possible worlds in $\ti$. Since $P(x_1) = P(a_1)$, the base case is satisfied.
As an aside, $P(\neg x_1) = 1 - P(x_1) = 1 - P(a_1)$ is the probability of the empty world $\emptyset$, so in this case there is a one-to-one correspondence between the possible worlds of $\ti$ and $\bi$ and their respective probabilities; this observation is not needed for the proof.
The inductive hypothesis is that~\cref{lem:bi-red-ti-prob} holds for $k \geq 1$ tuple alternatives.
For the inductive step, we prove that~\cref{lem:bi-red-ti-prob} holds for $k + 1$ alternatives. By the definition of the query $\poly$ (\cref{def:bi-red-ti-q}), only the set $\{x_{k + 1}\}$ in the powerset $2^b$ is mapped to the $\bi$ world $\{a_{k + 1}\}$. Then for the $\ti$ world $\wElem_{x_{k + 1}}$ it is the case that $P(\wElem_{x_{k + 1}} \in \wSet_{\ti} = 1) = \prod_{j = 1}^{k} (1 - P(x_j = 1)) \cdot x_{k + 1}$. Since by~\cref{eq:bi-red-ti-func} $P(x_{k + 1}) = \frac{P(a_{k + 1})}{\prod_{j = 1}^{k}(1 - P(x_j))}$, we get
\begin{align*}
P(\wElem_{x_{k + 1}} \in \wSet_{\ti} = 1) =& \prod_{j = 1}^{k} (1 - P(x_j = 1)) \cdot x_{k + 1}\\
=&\prod_{j = 1}^{k} (1 - P(x_j = 1)) \cdot \frac{P(a_{k + 1})}{\prod_{j = 1}^{k}(1 - P(x_j))}\\
=&P(a_{k + 1}).
\end{align*}
\end{proof}
\qed
This leaves us with the task of constructing a query $\poly$ over $\ti$ such that each world of the $\ti$ is mapped to the disjoint $\bi$ world of the first tuple appearing in the $\ti$ world. Setting $\poly$ to the following query yields the desired result.
\begin{lstlisting}
SELECT A FROM TI as a
WHERE A = 1
   OR (A = 2 AND NOT EXISTS(SELECT A FROM TI as b
         WHERE b.A = 1 AND a.blockID = b.blockID))
   $\vdots$
   OR (A = $|$b.blockID$|$ AND NOT EXISTS(SELECT A FROM TI as b
         WHERE (b.A = 1 OR b.A = 2 $\ldots$ OR b.A = $|$b.blockID$|-1$)
           AND a.blockID = b.blockID))
\end{lstlisting}
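As a sanity check of the intended behavior, consider a hypothetical block with three alternatives and assumed probabilities $P(a_1) = 0.2$, $P(a_2) = 0.3$, $P(a_3) = 0.4$, for which~\cref{eq:bi-red-ti-func} yields the $\ti$ probabilities $0.2$, $0.375$, and $0.8$ for $x_1$, $x_2$, and $x_3$ respectively (the numbers are illustrative only). The table lists every $\ti$ world of the block, its probability, and the world that $\poly$ is expected to output.
\begin{center}
\begin{tabular}{l|l|l}
$\ti$ world & probability & output of $\poly$\\ \hline
$\emptyset$ & $0.8 \cdot 0.625 \cdot 0.2 = 0.1$ & $\emptyset$\\
$\{x_3\}$ & $0.8 \cdot 0.625 \cdot 0.8 = 0.4$ & $\{a_3\}$\\
$\{x_2\}$ & $0.8 \cdot 0.375 \cdot 0.2 = 0.06$ & $\{a_2\}$\\
$\{x_2, x_3\}$ & $0.8 \cdot 0.375 \cdot 0.8 = 0.24$ & $\{a_2\}$\\
$\{x_1\}$ & $0.2 \cdot 0.625 \cdot 0.2 = 0.025$ & $\{a_1\}$\\
$\{x_1, x_3\}$ & $0.2 \cdot 0.625 \cdot 0.8 = 0.1$ & $\{a_1\}$\\
$\{x_1, x_2\}$ & $0.2 \cdot 0.375 \cdot 0.2 = 0.015$ & $\{a_1\}$\\
$\{x_1, x_2, x_3\}$ & $0.2 \cdot 0.375 \cdot 0.8 = 0.06$ & $\{a_1\}$
\end{tabular}
\end{center}
Summing the rows by output gives $0.025 + 0.1 + 0.015 + 0.06 = 0.2$ for $\{a_1\}$, $0.06 + 0.24 = 0.3$ for $\{a_2\}$, $0.4$ for $\{a_3\}$, and $0.1$ for the empty world, which agrees with the assumed $\bi$ probabilities.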
\begin{Lemma}\label{lem:bi-red-ti-q}
The query $\poly$ satisfies the requirements of ~\cref{def:bi-red-ti-q}.
\end{Lemma}
\begin{proof}[Proof of Lemma ~\ref{lem:bi-red-ti-q}]
For any possible world in $2^b$, the \texttt{WHERE} clause selects the tuple that is greatest in the ordering among the tuples present in that world. Every other tuple of the block fails its disjunct, since the \texttt{NOT EXISTS} condition attached to it is violated by the presence of a tuple ranked above it; the disjuncts are therefore mutually exclusive within a block. Thus, for any $\ti$ possible world, the tuple $x_i$ that is greatest in the ordering among those appearing in that world is alone in the output, and all possible worlds with $x_i$ as the greatest in the ordering output the same world, namely the $\bi$ world of the disjoint tuple $a_i$.
\end{proof}
\qed
\begin{proof}[Proof of Theorem ~\ref{theorem:bi-red-ti}]
The theorem follows from Lemmas~\ref{lem:bi-red-ti-prob} and~\ref{lem:bi-red-ti-q}.
\end{proof}

@ -25,6 +25,22 @@
\usepackage{todonotes}
\usepackage{graphicx}
\usepackage{listings}
%%%%%%%%%% SQL + provenance listing settings
\lstdefinestyle{psql}
{
tabsize=2,
basicstyle=\small\upshape\ttfamily,
language=SQL,
morekeywords={PROVENANCE,BASERELATION,INFLUENCE,COPY,ON,TRANSPROV,TRANSSQL,TRANSXML,CONTRIBUTION,COMPLETE,TRANSITIVE,NONTRANSITIVE,EXPLAIN,SQLTEXT,GRAPH,IS,ANNOT,THIS,XSLT,MAPPROV,cxpath,OF,TRANSACTION,SERIALIZABLE,COMMITTED,INSERT,INTO,WITH,SCN,UPDATED,LENS,SCHEMA_MATCHING,string,WINDOW,max,OVER,PARTITION,FIRST_VALUE,WITH},
extendedchars=false,
keywordstyle=\bfseries,
mathescape=true,
escapechar=@,
sensitive=true
}
\lstset{style=psql}
%%%%%%%%%%%%%%%%%%BORROWED FROM UADB paper^-----
\usepackage{fancyvrb}
\usepackage{caption}
\usepackage{subcaption}