Moved the definition of polynomial degree to the appendix and cut a few other extraneous sentences.
parent
167623ab98
commit
900820976e
10
appendix.tex
|
@ -11,6 +11,16 @@
|
|||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\onecolumn
|
||||
\section{Missing details from Section~\ref{sec:background}}\label{sec:proofs-background}
|
||||
\subsection{Polynomials}
|
||||
\begin{Definition}[Degree]\label{def:degree-of-poly}
|
||||
The degree of polynomial $\genpoly(\vct{X})$ is the largest $\sum_{\tup\in S}d_\tup
|
||||
$ for all $\vct{d}\in\inset{0,\ldots,\hideg}^S$
|
||||
such that $c_{(d_1,\dots,d_n)}\ne 0$.
|
||||
We denote the degree of $\genpoly$ as $\deg\inparen{\genpoly}$.
|
||||
\end{Definition}
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
As an example, the degree of the polynomial $X^2+2XY^2+Y^2$ is $3$.
|
||||
Product terms in lineage arise only from join operations (\Cref{fig:nxDBSemantics}), so intuitively, the degree of a lineage polynomial is analogous to the largest number of joins needed to produce a result tuple.
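As a rough illustration of \Cref{def:degree-of-poly}, the Python sketch below computes the degree of a polynomial stored as a map from exponent vectors $\vct{d}$ to coefficients $c_{\vct{d}}$. The dictionary encoding, the function name `degree`, and the choice to return `None` for the zero polynomial are assumptions made for this example; none of them come from the paper.

```python
def degree(poly):
    """Return the largest exponent sum over monomials with a nonzero coefficient.

    `poly` maps exponent tuples (d_1, ..., d_n) to coefficients (hypothetical encoding).
    """
    sums = [sum(d) for d, c in poly.items() if c != 0]
    # Assumption: the zero polynomial has no qualifying monomial, so return None.
    return max(sums) if sums else None


# X^2 + 2XY^2 + Y^2 over the variables (X, Y)
example = {(2, 0): 1, (1, 2): 2, (0, 2): 1}
print(degree(example))  # 3, attained by the monomial 2XY^2
```

Monomials whose coefficient is zero are skipped, mirroring the condition $c_{(d_1,\dots,d_n)}\ne 0$ in the definition.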
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Background details for proof of~\Cref{prop:expection-of-polynom}}\label{app:subsec:background-nxdbs}
|
||||
|
|
|
@ -117,7 +117,7 @@ In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the abo
|
|||
|
||||
The restriction on $\gamma$ is satisfied by any
|
||||
$1$-\abbrTIDB (where $\gamma=0$ in the equivalent $1$-\abbrBIDB of~\Cref{prop:ctidb-reduct})
|
||||
as well as for all three queries of the PDBench \abbrBIDB benchmark (\Cref{app:subsec:experiment}). Further, we can also argue the following result, recalling from~\Cref{sec:intro} for \abbrCTIDB $\pdb = \inparen{\worlds, \bpd}$, where $\tupset$ is the set of possible tuples across all possible worlds of $\pdb$.
|
||||
as well as for all three queries of the PDBench \abbrBIDB benchmark (\Cref{app:subsec:experiment}). Further, we can also argue the following result.%, recalling from~\Cref{sec:intro} for \abbrCTIDB $\pdb = \inparen{\worlds, \bpd}$, where $\tupset$ is the set of possible tuples across all possible worlds of $\pdb$.
|
||||
|
||||
\begin{Lemma}
|
||||
\label{lem:ctidb-gamma}
|
||||
|
|
|
@ -20,15 +20,6 @@ Unless otherwise noted, we consider all polynomials to be in \abbrSMB representat
|
|||
When it is unclear, we use $\smbOf{\genpoly}~\inparen{\smbOf{\poly}}$ to denote the \abbrSMB form of a polynomial (lineage polynomial) $\genpoly~\inparen{\poly}$.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\begin{Definition}[Degree]\label{def:degree-of-poly}
|
||||
The degree of polynomial $\genpoly(\vct{X})$ is the largest $\sum_{\tup\in S}d_\tup
|
||||
$ for all $\vct{d}\in\inset{0,\ldots,\hideg}^S$
|
||||
such that $c_{(d_1,\dots,d_n)}\ne 0$.
|
||||
We denote the degree of $\genpoly$ as $\deg\inparen{\genpoly}$.
|
||||
\end{Definition}
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
As an example, the degree of the polynomial $X^2+2XY^2+Y^2$ is $3$.
|
||||
Product terms in lineage arise only from join operations (\Cref{fig:nxDBSemantics}), so intuitively, the degree of a lineage polynomial is analogous to the largest number of joins needed to produce a result tuple.
|
||||
|
||||
We call a polynomial $\poly\inparen{\vct{X}}$ a \emph{\abbrCTIDB-lineage polynomial} (or simply lineage polynomial), if it is clear from context that there exists an $\raPlus$ query $\query$, \abbrCTIDB $\pdb$, and result tuple $\tup$ such that $\poly\inparen{\vct{X}} = \apolyqdt\inparen{\vct{X}}.$
|
||||
|
||||
|
|
|
@ -65,7 +65,7 @@ Set query evaluation semantics over $1$-\abbrTIDB\xplural have been studied exte
|
|||
to be \sharpphard\cite{10.1145/1265530.1265571}.
|
||||
Grohe et al.~\cite{https://doi.org/10.48550/arxiv.2201.11524} studied bag-\abbrTIDB\xplural allowing for unbounded multiplicities, which requires them to explicitly address the issue of a succinct representation of probability distributions over infinitely many multiplicities.
|
||||
This work demonstrated the existence of a dichotomy for
|
||||
the problem of computing the probability that an output tuple has a multiplicity of at most $k$.
|
||||
the problem of computing the probability that an output tuple has a multiplicity of at most $s$.
|
||||
% investigates the query evaluation problem over bag-\abbrTIDB\xplural when computing the probability of an output tuple having at most a multiplicity of $k$, showing that a dichotomy exists for this problem.
|
||||
% While the authors observe that computing the expectation of an output tuple multiplicity is in polynomial time, no further (fine-grained) analysis of the expected value is considered.
|
||||
% Our work in contrast assumes a finite bound on the multiplicities where we simply list the finitely many probability values (and hence do not need consider a more succinct representation). Further, our work primarily looks into the fine-grained analysis of computing the expected multiplicity of an output tuple.
|
||||
|
@ -193,7 +193,7 @@ The proof in \Cref{subsec:proof-exp-poly-rpoly} follows by~\Cref{prop:ctidb-redu
|
|||
\subsection{Our Techniques}
|
||||
\mypar{Lower Bound Proof Techniques}
|
||||
Our main hardness result shows that computing~\Cref{prob:expect-mult} is $\sharpwonehard$ for $1$-\abbrTIDB. To prove this result we show that for the same $\query_1$ from the example above, for an arbitrary `product width' $k$, the query $\qhard^k$ is able to encode various hard graph-counting problems (assuming $\bigO{\numvar}$ tuples rather than the $\bigO{1}$ tuples in \Cref{fig:two-step}).
|
||||
We do so by considering an arbitrary graph $G$ (analogous to relation $\boldsymbol{R}$ of $\query_1$) and analyzing how the coefficients in the (univariate) polynomial $\widetilde{\poly}\left(p,\dots,p\right)$ relate to counts of subgraphs in $G$ that are isomorphic to various subgraphs with $k$ edges. E.g., we exploit the fact that the coefficient corresponding to the power of $2k$ in $\poly$ of $\qhard^k$ is proportional to the number of $k$-matchings in $G$,
|
||||
We do so by considering an arbitrary graph $G$ (analogous to relation $\boldsymbol{R}$ of $\query_1$) and analyzing how the coefficients in the (univariate) polynomial $\widetilde{\poly}\left(p,\dots,p\right)$ relate to counts of subgraphs in $G$ that are isomorphic to various subgraphs with $k$ edges. E.g., we exploit the fact that the coefficient corresponding to $\prob^{2k}$ in $\rpoly\inparen{\prob,\ldots,\prob}$ of $\qhard^k$ is proportional to the number of $k$-matchings in $G$,
|
||||
a known hard problem in parameterized/fine-grained complexity literature.
|
||||
|
||||
|
||||
|
@ -218,7 +218,7 @@ For example, if we insist that $\circuit$ represent the lineage polynomial in \a
|
|||
and hence, just $\timeOf{\abbrStepOne}(\query,\tupset,\circuit)$ is too large.
|
||||
However, systems can directly emit compact, factorized representations of $\poly(\vct{X})$ (e.g., as a consequence of the standard projection push-down optimization~\cite{DBLP:books/daglib/0020812}).
|
||||
Accordingly, this work uses (arithmetic) circuits\footnote{
|
||||
An arithmetic circuit is a DAG with variable and/or numeric source nodes and internal, each nodes representing either an addition or multiplication operator.
|
||||
An arithmetic circuit is a DAG with variable/numeric source gates and multiplication/addition internal/sink gates.
|
||||
}
|
||||
as the representation system of $\poly(\vct{X})$, and we show in \Cref{sec:circuit-depth} an $\bigO{\qruntime{\optquery{\query}, \tupset, \bound}}$ algorithm for constructing the lineage polynomial for all result tuples of an $\raPlus$ query $\query$ (or more precisely, a single circuit $\circuit$ with one sink per tuple representing the tuple's lineage).
|
||||
|
||||
|
|
|
@ -13,12 +13,10 @@ A circuit $\circuit$ is a Directed Acyclic Graph (DAG) with source gates (in deg
|
|||
%
|
||||
Each gate has the following members: \type, \vari{input}, \val, \vpartial, \degval, \vari{Lweight}, and \vari{Rweight}, where \type is the value type $\{\circplus, \circmult, \var, \tnum\}$ and \vari{input} the list of inputs. Source gates have an extra member \val for the value. $\circuit_\linput$ ($\circuit_\rinput$) denotes the left (right) input of \circuit.
|
||||
\end{Definition}
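To make the gate structure concrete, here is a minimal Python sketch of a circuit with the members \type, \vari{input}, and \val, together with a recursive evaluator. The class `Gate`, the function `evaluate`, the string tags for gate types, and the sample assignment are all hypothetical; the remaining members (\vpartial, \degval, \vari{Lweight}, \vari{Rweight}) are left out because this sketch does not need them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Gate:
    type: str                                  # one of "+", "*", "var", "num"
    input: List["Gate"] = field(default_factory=list)
    val: Optional[Union[str, float]] = None    # variable name or constant (source gates only)


def evaluate(gate: Gate, assignment: Dict[str, float]) -> float:
    """Evaluate the polynomial rooted at `gate` under a variable assignment."""
    if gate.type == "var":
        return assignment[gate.val]
    if gate.type == "num":
        return gate.val
    # Internal gates are assumed to have exactly two inputs (left and right).
    left, right = (evaluate(g, assignment) for g in gate.input)
    return left + right if gate.type == "+" else left * right


B, Y, Z = (Gate("var", val=name) for name in "BYZ")
# Factorized form B(Y + Z): every gate feeds one parent, so this is an expression tree.
factorized = Gate("*", [B, Gate("+", [Y, Z])])
# Expanded form BY + BZ: the source gate B is shared, so this one is a DAG, not a tree.
expanded = Gate("+", [Gate("*", [B, Y]), Gate("*", [B, Z])])

assignment = {"B": 2, "Y": 3, "Z": 4}
print(evaluate(factorized, assignment), evaluate(expanded, assignment))  # 14 14
```

The evaluator revisits a shared gate once per parent; a memoized traversal would avoid that, but the simpler version is enough to show that both circuits encode the same polynomial.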
|
||||
When the underlying DAG is a tree (with edges pointing towards the root), the structure is an expression tree \etree. In such a case, the root of \etree is analogous to the sink of \circuit. The fields \vari{partial}, \degval, \vari{Lweight}, and \vari{Rweight} are used in the proofs of \Cref{sec:proofs-approx-alg}.
|
||||
|
||||
We refer to the structure when the underlying DAG is a tree (with edges pointing towards the root) as an expression tree \etree. %In such a case, the root of \etree is analogous to the sink of \circuit. The fields \vari{partial}, \degval, \vari{Lweight}, and \vari{Rweight} are used in the proofs of \Cref{sec:proofs-approx-alg}.
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
||||
The circuits in \Cref{fig:two-step} encode their respective polynomials in column $\poly$.
|
||||
Note that the ciricuit \circuit representing $AX$ and the circuit \circuit' representing $B\inparen{Y+Z}$ each encode a tree, with edges pointing towards the root.
|
||||
The circuits $\inparen{1}$ and $\inparen{2}$ in column $\poly$ of \Cref{fig:two-step} are both expression trees.%encode their respective polynomials in column $\poly$.
|
||||
%Note that the ciricuit \circuit representing $AX$ and the circuit \circuit' representing $B\inparen{Y+Z}$ each encode a tree, with edges pointing towards the root.
|
||||
|
||||
|
||||
\begin{figure}[t!]
|
||||
|
|
|
@ -48,6 +48,10 @@
|
|||
\tabcolsep=0.075cm
|
||||
%\captionof{table}{Q}
|
||||
%\setlength{\cellspacetoplimit}{4pt}
|
||||
%\setlength\extrarowheight{10pt}
|
||||
% \setlength{\cellspacetoplimit}{10mm}
|
||||
% \setlength{\cellspacebottomlimit}{10mm}
|
||||
%\renewcommand{\arraystretch}{1.5}
|
||||
\begin{tabular}{>{\footnotesize}c | >{\centering\arraybackslash\footnotesize}m{1.95cm} | >{\centering\arraybackslash\footnotesize}m{3.95cm}}
|
||||
%\multicolumn{3}{c}{$\boldsymbol{\query_2(\pdb)}$}\\[1mm]
|
||||
%\toprule
|
||||
|
@ -55,15 +59,15 @@
|
|||
\midrule
|
||||
%\hline
|
||||
%\\\\[-3.5\medskipamount]
|
||||
$e_1$ & $AX$ &\resizebox{!}{9mm}{
|
||||
$e_1$ & $AX$ &\adjustbox{valign=b}{\resizebox{!}{9mm}{
|
||||
\begin{tikzpicture}[thick]
|
||||
\node[gen_tree_node](sink) at (0.5, 0.8){$\boldsymbol{\circmult}$};
|
||||
\node[gen_tree_node](source1) at (0, 0){$A$};
|
||||
\node[gen_tree_node](source2) at (1, 0){$X$};
|
||||
\draw[->](source1)--(sink);
|
||||
\draw[->] (source2)--(sink);
|
||||
\end{tikzpicture}% & $0.5 \cdot 1.0 + 0.5 \cdot 1.0 = 1.0$
|
||||
}\\% & $0.9$ \\
|
||||
\end{tikzpicture}$\inparen{1}$% & $0.5 \cdot 1.0 + 0.5 \cdot 1.0 = 1.0$
|
||||
}}\\% & $0.9$ \\
|
||||
$e_2$ & $B(Y + Z)$ Or $BY+ BZ$&
|
||||
\adjustbox{valign=m}{
|
||||
\resizebox{!}{14mm} {
|
||||
|
@ -80,7 +84,7 @@
|
|||
\draw[->] (b1) -- (b2);
|
||||
\draw[->] (a2) -- (a3);
|
||||
\draw[->] (b2) -- (a3);
|
||||
\end{tikzpicture}
|
||||
\end{tikzpicture}$\inparen{2}$
|
||||
}} \adjustbox{valign=m}{Or}
|
||||
%%%%%%%%%%%
|
||||
%Non factorized circuit%
|
||||
|
|