%root: main.tex
%!TEX root=./main.tex
\begin{abstract}
In this work, we study the problem of computing, both exactly and approximately, a tuple's expected multiplicity over probabilistic databases with bag semantics (where each tuple is associated with a multiplicity).
We consider bag-\abbrTIDB\xplural where we have a bound $\bound$ on the maximum multiplicity of each tuple and tuples are independent probabilistic events (we refer to such databases \BGdel{bag-\abbrTIDB\xplural}{let's just use c-TIDBs from the get-go?} as \abbrCTIDB\xplural).
\BGdel{In this work we consider the case when $\bound$ is a constant (since that is what is used in practice).}{overlap with surrounding}
We are specifically
interested in the fine-grained complexity of computing expected multiplicities and how it compares to the complexity of deterministic query evaluation algorithms --- if these complexities are comparable, it opens the door to practical deployment of probabilistic databases.
Unfortunately, % we show the reverse;
our results imply that computing expected multiplicities for \abbrCTIDB\xplural based on the results\BG{That is confusing, it may not be clear to most readers why the result of deterministic QP are useful here} produced by such query evaluation algorithms introduces super-linear overhead\BG{Over what?} (under parameterized complexity hardness assumptions/conjectures).
We proceed to study approximation of expected result tuple multiplicities for positive relational algebra queries ($\raPlus$) over \abbrCTIDB\xplural and for a non-trivial subclass of block-independent databases (\abbrBIDB\xplural).
We develop a sampling algorithm that computes a $(1 \pm \epsilon)$-approximation of the expected multiplicity of an output tuple in time linear in the runtime of the corresponding\BG{that just sounds hand-wavy, can we say something more concrete than comparable?} deterministic query for any $\raPlus$ query.
% By removing Bag-PDB's reliance on the sum-of-products representation of polynomials, this result paves the way for future work on PDBs that are competitive with deterministic databases.
\end{abstract}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: