output tuple -> result tuple

Atri Rudra 2021-09-13 17:58:23 -04:00
parent 656027c3c9
commit 4c01ed4530

@@ -15,7 +15,7 @@ The natural generalization of the problem of computing marginal probabilities of
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Problem}[Expected Multiplicity]\label{prob:bag-pdb-query-eval}
Given an \raPlus query\footnote{The class of positive relational algebra (\raPlus) queries consists of all queries that can be composed of the positive (monotonic) relational algebra operators: selection, projection, join, and union (SPJU).} $\query$, \abbrBPDB $\pdb$, and output tuple $\tup$, compute the expected
Given an \raPlus query\footnote{The class of positive relational algebra (\raPlus) queries consists of all queries that can be composed of the positive (monotonic) relational algebra operators: selection, projection, join, and union (SPJU).} $\query$, \abbrBPDB $\pdb$, and result tuple $\tup$, compute the expected
multiplicity ($\expct_{\db\sim\pd}\pbox{\query\inparen{\db}\inparen{\tup}}$)
of tuple $\tup$.
\end{Problem}
@@ -76,7 +76,7 @@ We drop $\query$, $\dbbase$, and $\tup$ from $\apolyqdt$ when they are clear fro
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Problem}[Expected Multiplicity of Lineage Polynomials]\label{prob:bag-pdb-poly-expected}
Given an $\raPlus$ query $\query$, \abbrBPDB $\pdb$, and output tuple $\tup$, compute the expected
Given an $\raPlus$ query $\query$, \abbrBPDB $\pdb$, and result tuple $\tup$, compute the expected
multiplicity of the polynomial $\apolyqdt$ (i.e., $\expct_{\vct{W}\sim \pdassign}\pd\pbox{\apolyqdt(\vct{W})}$),
where $\pdassign$ is the distribution induced by $\pd$ on the relevant assignments to variables of $\apolyqdt$.
\end{Problem}
@@ -310,7 +310,7 @@ Given a circuit $\circuit$ for $\apolyqdt$ (over all result tuples $\tup$) for \
%graph query for the special case of all $\prob_i = \prob$ for some $\prob$ in $(0, 1)$;
%(ii) To complement our hardness results, we consider an approximate version of~\Cref{prob:intro-stmt}, where instead of computing the expected multiplicity exactly, we allow for an $(1\pm\epsilon)$-\emph{multiplicative} approximation of the expected multiplicity.
(i) We show that for typical database usage patterns (e.g. when the circuit is a tree or is generated by recent worst-case optimal join algorithms or their Functional Aggregate Query (FAQ)/Aggregations and Joins over Annotated Relations (AJAR) followups~\cite{DBLP:conf/pods/KhamisNR16, ajar}), where there is a single result tuple the answer to \Cref{prob:intro-stmt} for \abbrTIDB is {\em yes}.\footnote{We can approximate the expected output tuple multiplicities (for all output tuples {\em simultaneously}) with only $O(\log{Z})=O_k(\log{n})$ overhead (where $Z$ is the number of output tuples) over the runtime of a broad class of query processing algorithms (see \Cref{app:sec-cicuits}).}
(i) We show that for typical database usage patterns (e.g. when the circuit is a tree or is generated by recent worst-case optimal join algorithms or their Functional Aggregate Query (FAQ)/Aggregations and Joins over Annotated Relations (AJAR) followups~\cite{DBLP:conf/pods/KhamisNR16, ajar}), where there is a single result tuple the answer to \Cref{prob:intro-stmt} for \abbrTIDB is {\em yes}.\footnote{We can approximate the expected result tuple multiplicities (for all result tuples {\em simultaneously}) with only $O(\log{Z})=O_k(\log{n})$ overhead (where $Z$ is the number of result tuples) over the runtime of a broad class of query processing algorithms (see \Cref{app:sec-cicuits}).}
% the approximation algorithm has runtime linear in the size of the compressed lineage encoding (
In contrast, known approximation techniques in set-\abbrPDB\xplural are at most quadratic in the size of the compressed lineage encoding~\cite{DBLP:conf/icde/OlteanuHK10,DBLP:journals/jal/KarpLM89}.
%Atri: The footnote below does not add much
@@ -327,7 +327,7 @@ SELECT 1 FROM OnTime a, Route r, OnTime b
WHERE a.city = r.city1 AND b.city = r.city2
\end{lstlisting}
%$Q()\dlImp$$OnTime(\text{City}), Route(\text{City}, \text{City}'),$ $OnTime(\text{City}')$
It can be verified that $\poly\inparen{A, B, C, E, X, Y, Z}$ for the sole output tuple (i.e. the count) of $\query$ is $AXB + BYE + BZC$. Now consider the product query $\query^2(\db) = \query(\db) \times \query(\db)$.
It can be verified that $\poly\inparen{A, B, C, E, X, Y, Z}$ for the sole result tuple (i.e. the count) of $\query$ is $AXB + BYE + BZC$. Now consider the product query $\query^2(\db) = \query(\db) \times \query(\db)$.
The lineage polynomial for $Q^2$ is given by $\poly^2\inparen{A, B, C, E, X, Y, Z}$:\AR{Changed the variable $D$ to $E$ to avoid conflict with use of $D$ as a DB.}
\begin{multline*}