Started Polynomial Equivalence subsection.
parent
7c4ec19edd
commit
0b4e3076f0
|
@ -12,7 +12,7 @@ $\pdb = \inparen{\inset{0,\ldots, c}^\numvar, \mathcal{P}}$ is a bag of tuples s
|
|||
%Since each tuple in $\pdb$ has a mutually exclusive probability distribution over its possible multiplicities, it is natural to reduce a \abbrCTIDB to traditional (set) block independent database (\abbrBIDB). We refer to the reduced \abbrBIDB as a $1$-\abbrBIDB, as it is the case that each tuple can appear in a possible world at most $c = 1$ time. \Cref{fig:ctidb-red} shows an example of this reduction.
|
||||
%}
|
||||
\secrev{
|
||||
Allowing for $\leq c$ multiplicities across all tuples gives rise to having $\leq \inparen{c+1}^\numvar$ possible worlds instead of the usual $2^\numvar$ possible worlds of a (set) $1$-\abbrTIDB.
|
||||
Allowing each tuple a multiplicity of at most $c$ gives rise to at most $\inparen{c+1}^\numvar$ possible worlds, instead of the usual $2^\numvar$ possible worlds of the traditional set \abbrTIDB.
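As a quick sanity check on this count (the values $c = 2$ and $\numvar = 3$ are assumed purely for illustration), each of the three tuples takes a multiplicity in $\inset{0, 1, 2}$, so there are at most
\begin{align*}
\inparen{c+1}^\numvar = 3^3 = 27
\end{align*}
possible worlds, compared with $2^3 = 8$ for a set \abbrTIDB over the same three tuples.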
|
||||
In this work, it is natural to specifically consider bag query semantics.
|
||||
|
||||
We can formally state this problem as:
|
||||
|
@ -157,7 +157,10 @@ Further, we generalize the \abbrPDB data model considered by the approximation a
|
|||
}
|
||||
\secrev{
|
||||
\subsection{Polynomial Equivalence}
|
||||
A common encoding of probabilistic databases (e.g., in \cite{IL84a,Imielinski1989IncompleteII,Antova_fastand,DBLP:conf/vldb/AgrawalBSHNSW06} and many others) relies on annotating tuples with lineages, propositional formulas that describe the set of possible worlds that the tuple appears in. The bag semantics analog is a provenance/lineage polynomial $\apolyqdt$~\cite{DBLP:conf/pods/GreenKT07} (see~\Cref{fig:nxDBSemantics} for a definition), a polynomial with non-zero integer coefficients and exponents, over integer variables $\vct{X}$ encoding input tuple multiplicities.
|
||||
A common encoding of probabilistic databases (e.g., in \cite{IL84a,Imielinski1989IncompleteII,Antova_fastand,DBLP:conf/vldb/AgrawalBSHNSW06} and many others) relies on annotating tuples with lineages, propositional formulas that describe the set of possible worlds that the tuple appears in. The bag semantics analog is a provenance/lineage polynomial $\apolyqdt$~\cite{DBLP:conf/pods/GreenKT07}, a polynomial with non-zero integer coefficients and exponents, over integer variables $\vct{X}$ encoding input tuple multiplicities.
|
||||
|
||||
Intuitively, a \abbrCTIDB lends itself to a useful reduction to a specific type of block independent database (\abbrBIDB) which we refer to as a $1$-\abbrBIDB. A $1$-\abbrBIDB is a \abbrBIDB in the traditional sense of allowing no duplicate tuples, \emph{but} where we use bag query semantics instead of the usual set query semantics.
|
||||
(see~\Cref{fig:nxDBSemantics} for a definition)
|
||||
\begin{figure}
|
||||
\begin{align*}
|
||||
\polyqdt{\project_A(\query)}{\dbbase}{\tup} =& \sum_{\tup': \project_A(\tup') = \tup} \polyqdt{\query}{\dbbase}{\tup'} &
|
||||
|
@ -182,19 +185,20 @@ A common encoding of probabilistic databases (e.g., in \cite{IL84a,Imielinski198
|
|||
We drop $\query$, $\dbbase$, and $\tup$ from $\apolyqdt$ when they are clear from the context or irrelevant to the discussion. We now specify the problem of computing the expectation of tuple multiplicity in the language of lineage polynomials:
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\begin{Problem}[Expected Multiplicity of Lineage Polynomials]\label{prob:bag-pdb-poly-expected}
|
||||
Given an $\raPlus$ query $\query$,
|
||||
\AHchange{
|
||||
\abbrCTIDB $\pdb$
|
||||
}
|
||||
and result tuple $\tup$, compute the expected
|
||||
multiplicity of the polynomial $\apolyqdt$ (i.e., $\expct_{\vct{W}\sim \pdassign}\pbox{\apolyqdt(\vct{W})}$),
|
||||
where $\pdassign$ is the distribution induced by $\pd$ on the relevant assignments $\vct{W}$ to variables of $\apolyqdt$.
|
||||
Given an $\raPlus$ query $\query$, \abbrCTIDB $\pdb$ and result tuple $\tup$, compute the expected
|
||||
multiplicity of the polynomial $\apolyqdt$ (i.e., $\expct_{\vct{W}\sim \pdassign}\pbox{\apolyqdt(\vct{W})}$).
|
||||
%,
|
||||
%where $\pdassign$ is the distribution induced by $\pd$ on the relevant assignments $\vct{W}$ to variables of $\apolyqdt$.
|
||||
\end{Problem}
|
||||
We note that solving \Cref{prob:expect-mult}
|
||||
is equivalent to solving \Cref{prob:bag-pdb-poly-expected} (see \Cref{prop:expection-of-polynom}).
|
||||
In this work, we study the complexity of \Cref{prob:bag-pdb-poly-expected} for several models of probabilistic databases and various encodings of such polynomials.
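As a minimal worked illustration of \Cref{prob:bag-pdb-poly-expected}, suppose $c = 2$ and $\apolyqdt = X_1 X_2 + X_1^2$. This polynomial and the probabilities below are hypothetical and chosen only for the example. We further assume that the multiplicities $W_1$ and $W_2$ of the two input tuples are independent, as we understand tuple independence in a \abbrCTIDB to imply. Let $\Pr[W_1 = 0] = 0.5$, $\Pr[W_1 = 1] = 0.3$, $\Pr[W_1 = 2] = 0.2$, and $\Pr[W_2 = 0] = 0.5$, $\Pr[W_2 = 1] = 0.25$, $\Pr[W_2 = 2] = 0.25$. Then, by linearity of expectation and independence,
\begin{align*}
\expct\pbox{W_1} &= 1 \cdot 0.3 + 2 \cdot 0.2 = 0.7, \\
\expct\pbox{W_2} &= 1 \cdot 0.25 + 2 \cdot 0.25 = 0.75, \\
\expct\pbox{W_1^2} &= 1^2 \cdot 0.3 + 2^2 \cdot 0.2 = 1.1, \\
\expct\pbox{W_1 W_2 + W_1^2} &= \expct\pbox{W_1} \cdot \expct\pbox{W_2} + \expct\pbox{W_1^2} = 0.525 + 1.1 = 1.625.
\end{align*}
Note that $\expct\pbox{W_1^2} = 1.1$ differs from $\inparen{\expct\pbox{W_1}}^2 = 0.49$, so in this sketch one cannot simply substitute the expected multiplicity of each tuple for its variable once a variable occurs with an exponent greater than one.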
|
||||
}
|
||||
|
||||
\AHchange{
|
||||
\LARGE Old Stuff
|
||||
}
|
||||
|
||||
A probabilistic database (PDB) $\pdb$ is a pair $\inparen{\idb, \pd}$, where $\idb$ is a set of deterministic database instances called possible worlds and $\pd$ is a probability distribution over $\idb$.
|
||||
\AHchange{
|
||||
A tuple independent database (\abbrTIDB) (to which we will refer later) is a \abbrPDB such that each tuple is an independent random event.
|
||||