From ebf7554f1121593f65af270ed3ebe62d88183926 Mon Sep 17 00:00:00 2001
From: Aaron Huber
Date: Thu, 21 Nov 2019 14:44:24 -0500
Subject: [PATCH] Problem Definition Rough Draft Added

---
 main.tex     |  2 +-
 prob_def.tex | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 prob_def.tex

diff --git a/main.tex b/main.tex
index 76d573c..26aa1be 100644
--- a/main.tex
+++ b/main.tex
@@ -123,7 +123,7 @@
 %\input{abstract}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
+\input{prob_def}
 \input{notation}
 \input{analysis}
 \input{est_bounds}
diff --git a/prob_def.tex b/prob_def.tex
new file mode 100644
index 0000000..ff50a26
--- /dev/null
+++ b/prob_def.tex
@@ -0,0 +1,15 @@
+% -*- root: main.tex -*-
+\section{Problem Definition}
+\label{sec:prob-def}
+Computing world existence for a given tuple $\tup$ exactly takes exponential time. Our work overcomes this cost by estimating the same quantity in polynomial time, within an $(\epsilon, \delta)$ range. We use sketching to obtain these estimates.
+
+The setting to which our work applies is as follows. We are given a database $\db$, and we limit ourselves to positive queries. A positive query $\query$ is composed from the following operators: selection ($\selection$), projection ($\projection$), join/cross-product ($\join$), and union ($\union$), abbreviated as SPJU. Given $\db$, a query $\query$ operates on all the rows of the tables it involves.
+
+Since our problem space involves estimating the $\kDom$ value for a given world of an incomplete/probabilistic database, we are particularly interested in the projection, union, and join operators. Because $\selection$ only removes tuples from a query output, rather than combining or merging tuples as the other operators do, we do not need to consider it.
+
+We can picture each tuple as carrying its own annotation, which records the $\kDom$ value of the tuple in each of the possible worlds. Assuming, for example, a tuple-independent database, this annotation could be a vector $\genV$ of size $\numWorlds$, where $\numWorlds$ is the number of worlds. Index $i$ of $\genV$ holds the $\kDom$ value for the $i^{th}$ world.
+
+In this setting, consider the query $\query = \projection(R\join S\join T\join U)$. The output of the 4-way join consists of the tuples that satisfy the join condition of every $\join$ operation. To calculate world membership for an output tuple, the annotation vectors of its constituent input tuples are multiplied element-wise. This is equivalent to taking the Hadamard product of four vectors for each tuple in the join output. The final $\projection$ operation then sums the vectors of those join-output tuples whose projected attributes share the same value(s).
+
+Such a query generalizes to a sum-of-products operation over the tuple vectors $\genV$.
+
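The sum-of-products computation described in the last two paragraphs of prob_def.tex can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the draft: the join is taken as already computed, each join-output tuple is represented as a pair of its projected key and the annotation vectors of its four input tuples, annotation entries are non-negative integers, and there are three possible worlds. The names `hadamard`, `vector_sum`, `sum_of_products`, and `join_output`, as well as all values, are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def hadamard(u, v):
    """Element-wise (Hadamard) product of two equal-length annotation vectors."""
    if len(u) != len(v):
        raise ValueError("annotation vectors must have the same length")
    return [a * b for a, b in zip(u, v)]


def vector_sum(u, v):
    """Element-wise sum of two equal-length annotation vectors."""
    if len(u) != len(v):
        raise ValueError("annotation vectors must have the same length")
    return [a + b for a, b in zip(u, v)]


def sum_of_products(join_output, num_worlds):
    """Annotate the projection of an already-computed join.

    join_output: iterable of (projected_key, [vec_R, vec_S, vec_T, vec_U]),
    one entry per tuple in the join output (hypothetical representation).
    Returns a dict mapping each projected key to its summed vector.
    """
    result = defaultdict(lambda: [0] * num_worlds)
    for key, vectors in join_output:
        # Join step: multiply the input tuples' vectors world by world.
        product = reduce(hadamard, vectors, [1] * num_worlds)
        # Projection step: add up the products of tuples sharing a key.
        result[key] = vector_sum(result[key], product)
    return dict(result)


# Illustrative data only: three worlds, three join-output tuples.
join_output = [
    ("a", [[1, 0, 1], [1, 1, 1], [2, 1, 0], [1, 1, 1]]),
    ("a", [[0, 1, 1], [1, 1, 0], [1, 2, 1], [1, 1, 1]]),
    ("b", [[1, 1, 1], [0, 1, 1], [1, 1, 1], [3, 1, 0]]),
]
result = sum_of_products(join_output, num_worlds=3)
print(result)  # {'a': [2, 2, 0], 'b': [0, 1, 0]}
```

The two join-output tuples with key `a` contribute the products `[2, 0, 0]` and `[0, 2, 0]`, which the projection adds to `[2, 2, 0]`. The sketch works on full vectors of length `num_worlds` and so does not address the exponential cost that the section sets out to avoid; it only shows the exact quantity that an estimate would approximate.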