Updating ACM stylesheet and cleaning up the nastier consequences
parent a2b8867edb
commit 163d4007f4
@@ -1,7 +1,7 @@
 %root: main.tex
 %!TEX root=./main.tex
 \begin{abstract}
-The problem of computing the marginal probability of a tuple in the result of a query over set-probabilistic databases (PDBs) can be reduced to calculating the probability of the \emph{lineage formula} of the result, a Boolean formula over random variables representing the existence of tuples in each of the database's possible worlds.
+The problem of computing the marginal probability of a tuple in the result of a query over set-probabilistic databases (PDBs) can be reduced to calculating the probability of the \emph{lineage formula} of the result, a Boolean formula over random variables representing the existence of tuples in the database's possible worlds.
 The analog for bag semantics is a natural number-valued polynomial over random variables that evaluates to the multiplicity of the tuple in each world.
 In this work, we study the problem of calculating the expectation of such polynomials (a tuple's expected multiplicity) exactly and approximately.
 For tuple-independent databases (TIDBs), the expected multiplicity of a query result tuple can trivially be computed in linear time in the size of the tuple's lineage, if this polynomial is encoded as a sum of products.
744
acmart.cls
File diff suppressed because it is too large
@@ -22,8 +22,8 @@ Analogously, this problem can be reduced to computing the expectation of the lin
 This problem has received much less attention, perhaps because the problem is trivially tractable.
 In fact it is linear time when the lineage polynomial is encoded in the typical sum of products (SOP) representation.
 However, there exist compressed representations of polynomials, e.g., factorizations~\cite{factorized-db}, that can be polynomially more concise than the SOP representation of a polynomial.
-These compression schemes are close analogs of typical database optimizations like projection push-down~\cite{DBLP:conf/pods/KhamisNR16}, hinting that perhaps even Bag-PDBs inherently have higher query processing complexity than deterministic databases.
-In this paper, we confirm this intuition, first proving (by reduction from counting $k$-matchings) that computing the expected count of a query result tuple is super-linear (\sharpwonehard) in the size of a compressed (factorized~\cite{factorized-db}) lineage representation, and then relating the size of the compressed lineage to the cost of answering a deterministic query.
+These compression schemes are analogous to typical database optimizations like projection push-down~\cite{DBLP:conf/pods/KhamisNR16}, hinting that perhaps even Bag-PDBs have higher query processing complexity than deterministic databases.
+In this paper, we confirm this intuition, first proving (by reduction from counting $k$-matchings) that computing the expected count of a query result tuple is super-linear (\sharpwonehard) in the size of a compressed lineage representation, and then relating the size of the compressed lineage to the cost of answering a deterministic query.

 In spite of this negative result, not everything is lost.
 We develop an approximation algorithm for expected counts of SPJU query results over Bag-PDBs that is, to our knowledge, the first linear time (in the size of the factorized lineage) $(1-\epsilon)$-approximation.
2
main.tex
@@ -95,6 +95,7 @@ sensitive=true
 % \orcid{1234-5678-9012}
 \affiliation{%
 \institution{Illinois Institute of Technology}
+\country{USA}
 }
 \email{sfeng14@hawk.iit.edu,bglavic@iit.edu}

@@ -102,6 +103,7 @@ sensitive=true
 % \orcid{1234-5678-9012}
 \affiliation{%
 \institution{University at Buffalo}
+\country{USA}
 }
 \email{ahuber,okennedy,atri@buffalo.edu}
