Started Polynomial Equivalence subsection.

2022-01-20 17:23:59 -05:00 · 0b4e3076f0
parent 7c4ec19edd
commit 0b4e3076f0
1 changed file with 13 additions and 9 deletions
--- a/intro-rewrite-070921.tex
+++ b/intro-rewrite-070921.tex
@@ -12,7 +12,7 @@ $\pdb = \inparen{\inset{0,\ldots, c}^\numvar, \mathcal{P}}$ is a bag of tuples s
 %Since each tuple in $\pdb$ has a mutually exclusive probability distribution over its possible multiplicities, it is natural to reduce a \abbrCTIDB to traditional (set) block independent database (\abbrBIDB).  We refer to the reduced \abbrBIDB as a $1$-\abbrBIDB, as it is the case that each tuple can appear in a possible world at most $c = 1$ time.  \Cref{fig:ctidb-red} shows an example of this reduction.
 %}  
 \secrev{
-Allowing for $\leq c$ multiplicities across all tuples gives rise to having $\leq \inparen{c+1}^\numvar$ possible worlds instead of the usual $2^\numvar$ possible worlds of a (set) $1$-\abbrTIDB. 
+Allowing for $\leq c$ multiplicities across all tuples gives rise to having $\leq \inparen{c+1}^\numvar$ possible worlds instead of the usual $2^\numvar$ possible worlds of the traditional set \abbrTIDB. 
 In this work, it is natural to be specifically considering bag query semantics.

 We can formally state this problem as:
@@ -157,7 +157,10 @@ Further, we generalize the \abbrPDB data model considered by the approximation a
 }
 \secrev{
 \subsection{Polynomial Equivalence}
-A common encoding of probabilistic databases (e.g., in \cite{IL84a,Imielinski1989IncompleteII,Antova_fastand,DBLP:conf/vldb/AgrawalBSHNSW06} and many others) relies on annotating tuples with lineages, propositional formulas that describe the set of possible worlds that the tuple appears in.  The bag semantics analog is a provenance/lineage polynomial $\apolyqdt$~\cite{DBLP:conf/pods/GreenKT07} (see~\Cref{fig:nxDBSemantics} for a definition), a polynomial with non-zero integer coefficients and exponents, over integer variables $\vct{X}$ encoding input tuple multiplicities.
+A common encoding of probabilistic databases (e.g., in \cite{IL84a,Imielinski1989IncompleteII,Antova_fastand,DBLP:conf/vldb/AgrawalBSHNSW06} and many others) relies on annotating tuples with lineages, propositional formulas that describe the set of possible worlds that the tuple appears in.  The bag semantics analog is a provenance/lineage polynomial $\apolyqdt$~\cite{DBLP:conf/pods/GreenKT07}, a polynomial with non-zero integer coefficients and exponents, over integer variables $\vct{X}$ encoding input tuple multiplicities.
+
+Intuitively, a \abbrCTIDB lends itself to a useful reduction to a specific type of block independent database (\abbrBIDB) which we refer to as a $1$-\abbrBIDB.  A $1$-\abbrBIDB is a \abbrBIDB in the traditional sense of allowing no duplicate tuples, \emph{but} where we use bag query semantics instead of the usual set query semantics.
+(see~\Cref{fig:nxDBSemantics} for a definition)
 \begin{figure}
  \begin{align*}
 	  \polyqdt{\project_A(\query)}{\dbbase}{\tup} =& \sum_{\tup': \project_A(\tup') = \tup} \polyqdt{\query}{\dbbase}{\tup'} &
@@ -182,19 +185,20 @@ A common encoding of probabilistic databases (e.g., in \cite{IL84a,Imielinski198
 We drop $\query$, $\dbbase$, and $\tup$ from $\apolyqdt$ when they are clear from the context or irrelevant to the discussion. We now specify the problem of computing the expectation of tuple multiplicity in the language of lineage polynomials:
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{Problem}[Expected Multiplicity of Lineage Polynomials]\label{prob:bag-pdb-poly-expected}
-Given an $\raPlus$ query $\query$, 
-\AHchange{
-\abbrCTIDB $\pdb$
-}
-and result tuple $\tup$, compute the expected
-multiplicity of the polynomial $\apolyqdt$ (i.e., $\expct_{\vct{W}\sim \pdassign}\pbox{\apolyqdt(\vct{W})}$).,
-where $\pdassign$ is the distribution induced by $\pd$ on the relevant assignments $\vct{W}$ to variables of $\apolyqdt$.
+Given an $\raPlus$ query $\query$, \abbrCTIDB $\pdb$ and result tuple $\tup$, compute the expected
+multiplicity of the polynomial $\apolyqdt$ (i.e., $\expct_{\vct{W}\sim \pdassign}\pbox{\apolyqdt(\vct{W})}$).
+%,
+%where $\pdassign$ is the distribution induced by $\pd$ on the relevant assignments $\vct{W}$ to variables of $\apolyqdt$.
 \end{Problem}
 We note that computing \Cref{prob:expect-mult} 
 is equivalent to computing \Cref{prob:bag-pdb-poly-expected} (see \Cref{prop:expection-of-polynom}).
 In this work, we study the complexity of \Cref{prob:bag-pdb-poly-expected} for several models of probabilistic databases and various encodings of such polynomials.
 }

+\AHchange{
+\LARGE Old Stuff
+}
+
 A probabilistic database (PDB) $\pdb$ is a pair $\inparen{\idb, \pd}$, where $\idb$ is a set of deterministic database instances called possible worlds and $\pd$ is a probability distribution over $\idb$.
 \AHchange{
 A tuple independent database (\abbrTIDB) (to which we will refer to later) is a \abbrPDB such that each tuple is an independent random event.