Updated P1.4

Atri Rudra 2021-09-12 09:47:09 -04:00
parent df23c8a5c0
commit a24373bc59


@@ -213,19 +213,21 @@ For queries on the hard side of the dichotomy, and the best known algorithmic ap
\mypar{Approximating the expected multiplicities}
\AR{Have done my pass till here}
Our initial results indicate that \abbrBPDB{}s can not achieve comparable performance to deterministic databases for exact results.
In the remainder of this work, we demonstrate that a $(1-\epsilon)$ (multiplicative) approximation with competitive performance is achievable.
Our negative results indicate that \abbrBPDB{}s cannot achieve comparable performance to deterministic databases for exact results (under standard complexity assumptions). In fact, under a plausible hardness conjecture, one cannot improve upon the trivial algorithm to exactly compute the expected multiplicities for \abbrTIDB. A natural follow-up question is whether we can do better if we are willing to settle for an approximation to the expected multiplicities.
In the remainder of this work, we demonstrate that a $(1\pm\epsilon)$ (multiplicative) approximation with competitive performance is indeed achievable.
Like set-probabilistic databases, our approach adopts the intensional model of query evaluation, as illustrated in \Cref{fig:two-step}.
Given input $\pdb$ and $\query$, the first step, which we will refer to as \termStepOne (\abbrStepOne), outputs every tuple $\tup$ that possibly satisfies $\query$, annotated with its lineage polynomial ($\poly$).
Given input $\dbbase$ and $\query$, the first step, which we will refer to as \termStepOne (\abbrStepOne), outputs every tuple $\tup$ that possibly satisfies $\query$, annotated with its lineage polynomial ($\poly(\vct{X})=\apolyqdt\inparen{\vct{X}}$).
The second step, \termStepTwo (\abbrStepTwo) consists of computing $\expct\pbox{\poly(\vct{\randWorld})}$ from the output of the first step.
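As a small illustration of \abbrStepTwo, consider the following sketch, whose setup is assumed for illustration and not fixed by the text above: a \abbrTIDB with $\vct{\randWorld}=(W_1,W_2)$, where $W_1,W_2$ are independent $\{0,1\}$-valued variables with hypothetical marginals $\Pr[W_i=1]=p_i$, and a hypothetical lineage polynomial $\poly(X_1,X_2)=(X_1+X_2)^2$. Linearity of expectation, independence, and the fact that $W_i^2=W_i$ for $\{0,1\}$-valued variables give
\[
\expct\pbox{\poly(\vct{\randWorld})}
  = \expct\pbox{W_1^2} + 2\,\expct\pbox{W_1 W_2} + \expct\pbox{W_2^2}
  = p_1 + 2\,p_1 p_2 + p_2 .
\]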
For bag-\abbrPDB $\pdb$, query $Q$, let $\timeOf{\abbrStepOne}(Q,\pdb)$ denote the runtime of \abbrStepOne.
With the output of step-one as $\circuit$, respectively denote by $\timeOf{\abbrStepTwo}(\circuit)$ the runtime of \abbrStepTwo, allowing us to formally define our objective:
For \abbrBPDB $\pdb$ and query $\query$, let $\timeOf{\abbrStepOne}(Q,\dbbase,\circuit)$ denote the runtime of \abbrStepOne when it outputs $\circuit$ (a representation of $\poly$; more on this representation shortly).
Let $\timeOf{\abbrStepTwo}(\circuit)$ denote the runtime of \abbrStepTwo on $\circuit$, the output of \abbrStepOne. This allows us to formally define our objective:
\begin{Problem}\label{prob:big-o-joint-steps}
Given bag-\abbrPDB $\pdb$, $\raPlus$ query $\query$ and output tuple $\tup$,
does there exist a $(1-\epsilon)$-approximation of \Cref{prob:bag-pdb-query-eval} where
$\timeOf{}^*(Q, \pdb) = \timeOf{\abbrStepOne}(Q,\pdb) + \timeOf{\abbrStepTwo}(\circuit) = O(\qruntime{Q, \dbbase})$
Given \abbrBPDB $\pdb$, $\raPlus$ query $\query$ and output tuple $\tup$,
does there exist a $(1\pm\epsilon)$-approximation of $\expct_{\db\sim\pd}\pbox{\query\inparen{\db}\inparen{\tup}}$ (for all result tuples $\tup$) for some $\circuit$ such that
$\timeOf{\abbrStepOne}(Q,\dbbase,\circuit) + \timeOf{\abbrStepTwo}(\circuit) \le O(\qruntime{Q, \dbbase})$?
\end{Problem}
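For concreteness, one standard reading of the $(1\pm\epsilon)$-approximation in \Cref{prob:big-o-joint-steps} is the usual multiplicative guarantee; it is assumed here, since the statement above does not spell it out. Writing $\mu=\expct_{\db\sim\pd}\pbox{\query\inparen{\db}\inparen{\tup}}$ and $\tilde{\mu}$ for the computed estimate (both symbols are introduced only for this illustration), the guarantee would read
\[
(1-\epsilon)\cdot\mu \;\le\; \tilde{\mu} \;\le\; (1+\epsilon)\cdot\mu .
\]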
Note that if the answer to the above problem is yes, then we have shown that the answer to \Cref{prob:informal} is yes (when we are interested in approximating the expected multiplicities).
We show in \Cref{sec:circuit-runtime}\OK{confirm this ref} an $O(\qruntime{Q, \dbbase})$ algorithm for constructing the lineage polynomial of the singleton result tuple of a count query.
% , and by extension the first step is in \sharpwonehard\AH{\sharpwonehard is not defined.}.