Finally, started my pass on Sec 5

master
Atri Rudra 2020-12-16 21:34:26 -05:00
parent f2a6a62320
commit b9ad03d01b
2 changed files with 9 additions and 1 deletions


@ -1,5 +1,9 @@
%!TEX root=./main.tex
\section{Comparison to Deterministic Runtime}
\section{Generalizations}
In this section, we consider a couple of generalizations/corollaries of our results so far. In particular, in~\Cref{sec:circuits} we first consider the case when the compressed polynomial is represented by a Directed Acyclic Graph (DAG) instead of the earlier (expression) tree (\Cref{def:express-tree}), and we observe that all of our results carry over to the DAG representation. Then we formalize our claim in~\Cref{sec:intro} that a linear runtime algorithm for our problem would imply that we can process PDBs in the same time as deterministic query processing. Finally, in~\Cref{sec:momemts}, we make some simple observations on how our results can be used to estimate moments beyond the expectation of a lineage polynomial.
\subsection{Lineage circuits}
\label{sec:circuits}
Thus far, our analysis of the runtime of $\onepass$ has been in terms of the size of the compressed lineage polynomial.
We now show that this models the behavior of a deterministic database by proving that for any boolean conjunctive query, we can construct a compressed lineage polynomial with the same complexity as it would take to evaluate the query on a deterministic \emph{bag-relational} database.
@ -148,3 +152,6 @@ The circuit for $Q$ has $|V_{Q_1}|+\ldots+|V_{Q_n}|+(n-1)|{Q_1} \bowtie \ldots \
The property holds for all recursive queries, and the proof holds.
\end{proof}
\subsection{Higher moments}
\label{sec:momemts}


@ -2,6 +2,7 @@
%!TEX root=./main.tex
\section{Introduction}
\label{sec:intro}
Modern production databases like Postgres and Oracle use bag semantics, while research on probabilistic databases (PDBs)~\cite{DBLP:series/synthesis/2011Suciu,DBLP:conf/sigmod/BoulosDMMRS05,DBLP:conf/icde/AntovaKO07a,DBLP:conf/sigmod/SinghMMPHS08} focuses predominantly on query evaluation under set semantics.
This is not surprising, as the conventional strategy for encoding the lineage of a query result --- a key component of query evaluation in PDBs --- makes computing typical statistics like marginal probabilities or moments easy (at worst linear in the size of the lineage) for bags, but hard (at worst exponential in the size of the lineage) for sets.