complexity macros

master
Boris Glavic 2020-12-11 19:29:15 -06:00
parent 9e02e98638
commit 7bc9328ee5
2 changed files with 5 additions and 4 deletions

@ -2,7 +2,7 @@
\section{Introduction}
Modern production databases like Postgres and Oracle use bag semantics. In contrast, most implementations of probabilistic databases (PDBs) are built in the setting of set semantics, where computing the probability of an output tuple is analogous to weighted model counting (a known \sharpPhard problem).
Modern production databases like Postgres and Oracle use bag semantics. In contrast, most implementations of probabilistic databases (PDBs) are built in the setting of set semantics, where computing the probability of an output tuple is analogous to weighted model counting (a known \sharpphard problem).
%the annotation of the tuple is a lineage formula ~\cite{DBLP:series/synthesis/2011Suciu}, which can essentially be thought of as a boolean formula. It is known that computing the probability of a lineage formula is \#-P hard in general
In PDBs, a boolean formula~\cite{DBLP:series/synthesis/2011Suciu}, also called a lineage formula, encodes the conditions under which each output tuple appears in the result.
%The marginal probability of this formula being true is the tuple's probability to appear in a possible world.
@ -99,7 +99,7 @@ Assume the following $\mathbb{B}/\mathbb{N}$ variable assignments: $W_a\mapsto T
\end{align*}
In the set/lineage setting, we find that the boolean query is satisfied, while in the bag evaluation we see how many combinations of the input satisfy the query.
\end{Example}
Note that computing the probability of the query of ~\cref{ex:intro} in set semantics is indeed $\sharpP$ hard, since it is a query that is non-hierarchical
Note that computing the probability of the query of ~\cref{ex:intro} in set semantics is indeed \sharpphard, since it is a query that is non-hierarchical
%, i.e., for $Vars(\poly)$ denoting the set of variables occurring across all atoms of $\poly$, a function $sg(x)$ whose output is the set of all atoms that contain variable $x$, we have that $sg(A) \cap sg(B) \neq \emptyset$ and $sg(A)\not\subseteq sg(B)$ and $sg(B)\not\subseteq sg(A)$,
~\cite{10.1145/1265530.1265571}. %Thus, computing $\expct\pbox{\poly(W_a, W_b, W_c)}$, i.e. the probability of the output with annotation $\poly(W_a, W_b, W_c)$, ($\prob(q)$ in Dalvi, Suciu) is hard in set semantics.
To see why this computation is hard for query $\poly$ over set semantics, from the query input we compute an output lineage formula of $\poly(W_a, W_b, W_c) = W_aW_b \vee W_bW_c \vee W_cW_a$. Note that the conjunctive clauses are not independent of one another and the computation of the probability is not linear in the size of $\poly(W_a, W_b, W_c)$:

@ -169,8 +169,9 @@ sensitive=true
\input{single_p}
\input{lin_sys}
\input{approx_alg}
%\input{bi_cancellation}
% \input{bi_cancellation}
\input{related-work}
\input{conclusions}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%