Moved def poly degree as well as cut a few other extraneous sentences.

master
Aaron Huber 2022-05-26 10:13:59 -04:00
parent 167623ab98
commit 900820976e
6 changed files with 25 additions and 22 deletions

View File

@ -11,6 +11,16 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\onecolumn
\section{Missing details from Section~\ref{sec:background}}\label{sec:proofs-background}
\subsection{Polynomials}
\begin{Definition}[Degree]\label{def:degree-of-poly}
The degree of polynomial $\genpoly(\vct{X})$ is the largest $\sum_{\tup\in S}d_\tup
$ for all $\vct{d}\in\inset{0,\ldots,\hideg}^S$
such that $c_{(d_1,\dots,d_n)}\ne 0$.
We denote the degree of $\genpoly$ as $\deg\inparen{\genpoly}$.
\end{Definition}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
As an example, the degree of the polynomial $X^2+2XY^2+Y^2$ is $3$.
Product terms in lineage arise only from join operations (\Cref{fig:nxDBSemantics}), so intuitively, the degree of a lineage polynomial is analogous to the largest number of joins needed to produce a result tuple.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Background details for proof of~\Cref{prop:expection-of-polynom}}\label{app:subsec:background-nxdbs}

View File

@ -117,7 +117,7 @@ In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the abo
The restriction on $\gamma$ is satisfied by any
$1$-\abbrTIDB (where $\gamma=0$ in the equivalent $1$-\abbrBIDB of~\Cref{prop:ctidb-reduct})
as well as for all three queries of the PDBench \abbrBIDB benchmark (\Cref{app:subsec:experiment}). Further, we can also argue the following result, recalling from~\Cref{sec:intro} for \abbrCTIDB $\pdb = \inparen{\worlds, \bpd}$, where $\tupset$ is the set of possible tuples across all possible worlds of $\pdb$.
as well as for all three queries of the PDBench \abbrBIDB benchmark (\Cref{app:subsec:experiment}). Further, we can also argue the following result.%, recalling from~\Cref{sec:intro} for \abbrCTIDB $\pdb = \inparen{\worlds, \bpd}$, where $\tupset$ is the set of possible tuples across all possible worlds of $\pdb$.
\begin{Lemma}
\label{lem:ctidb-gamma}

View File

@ -20,15 +20,6 @@ Unless othewise noted, we consider all polynomials to be in \abbrSMB representat
When it is unclear, we use $\smbOf{\genpoly}~\inparen{\smbOf{\poly}}$ to denote the \abbrSMB form of a polynomial (lineage polynomial) $\genpoly~\inparen{\poly}$.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Definition}[Degree]\label{def:degree-of-poly}
The degree of polynomial $\genpoly(\vct{X})$ is the largest $\sum_{\tup\in S}d_\tup
$ for all $\vct{d}\in\inset{0,\ldots,\hideg}^S$
such that $c_{(d_1,\dots,d_n)}\ne 0$.
We denote the degree of $\genpoly$ as $\deg\inparen{\genpoly}$.
\end{Definition}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
As an example, the degree of the polynomial $X^2+2XY^2+Y^2$ is $3$.
Product terms in lineage arise only from join operations (\Cref{fig:nxDBSemantics}), so intuitively, the degree of a lineage polynomial is analogous to the largest number of joins needed to produce a result tuple.
We call a polynomial $\poly\inparen{\vct{X}}$ a \emph{\abbrCTIDB-lineage polynomial} (or simply lineage polynomial), if it is clear from context that there exists an $\raPlus$ query $\query$, \abbrCTIDB $\pdb$, and result tuple $\tup$ such that $\poly\inparen{\vct{X}} = \apolyqdt\inparen{\vct{X}}.$

View File

@ -65,7 +65,7 @@ Set query evaluation semantics over $1$-\abbrTIDB\xplural have been studied exte
to be \sharpphard\cite{10.1145/1265530.1265571}.
Grohe et. al.~\cite{https://doi.org/10.48550/arxiv.2201.11524} studied bag-\abbrTIDB\xplural allowing for unbounded multiplicities which requires them to explicitly address the issue of a succinct representation of probability distributions over infinitely many multiplicities.
This work demonstrated the existence of a dichotomy for
the problem of computing the probability that an output tuple has a multiplicity of at most $k$.
the problem of computing the probability that an output tuple has a multiplicity of at most $s$.
% investigates the query evaluation problem over bag-\abbrTIDB\xplural when computing the probability of an output tuple having at most a multiplicity of $k$, showing that a dichotomy exists for this problem.
% While the authors observe that computing the expectation of an output tuple multiplicity is in polynomial time, no further (fine-grained) analysis of the expected value is considered.
% Our work in contrast assumes a finite bound on the multiplicities where we simply list the finitely many probability values (and hence do not need consider a more succinct representation). Further, our work primarily looks into the fine-grained analysis of computing the expected multiplicity of an output tuple.
@ -193,7 +193,7 @@ The proof in \Cref{subsec:proof-exp-poly-rpoly} follows by~\Cref{prop:ctidb-redu
\subsection{Our Techniques}
\mypar{Lower Bound Proof Techniques}
Our main hardness result shows that computing~\Cref{prob:expect-mult} is $\sharpwonehard$ for $1$-\abbrTIDB. To prove this result we show that for the same $\query_1$ from the example above, for an arbitrary `product width' $k$, the query $\qhard^k$ is able to encode various hard graph-counting problems (assuming $\bigO{\numvar}$ tuples rather than the $\bigO{1}$ tuples in \Cref{fig:two-step}).
We do so by considering an arbitrary graph $G$ (analogous to relation $\boldsymbol{R}$ of $\query_1$) and analyzing how the coefficients in the (univariate) polynomial $\widetilde{\poly}\left(p,\dots,p\right)$ relate to counts of subgraphs in $G$ that are isomorphic to various subgraphs with $k$ edges. E.g., we exploit the fact that the coefficient corresponding to the power of $2k$ in $\poly$ of $\qhard^k$ is proportional to the number of $k$-matchings in $G$,
We do so by considering an arbitrary graph $G$ (analogous to relation $\boldsymbol{R}$ of $\query_1$) and analyzing how the coefficients in the (univariate) polynomial $\widetilde{\poly}\left(p,\dots,p\right)$ relate to counts of subgraphs in $G$ that are isomorphic to various subgraphs with $k$ edges. E.g., we exploit the fact that the coefficient corresponding to $\prob^{2k}$ in $\rpoly\inparen{\prob,\ldots,\prob}$ of $\qhard^k$ is proportional to the number of $k$-matchings in $G$,
a known hard problem in parameterized/fine-grained complexity literature.
@ -218,7 +218,7 @@ For example, if we insist that $\circuit$ represent the lineage polynomial in \a
and hence, just $\timeOf{\abbrStepOne}(\query,\tupset,\circuit)$ is too large.
However, systems can directly emit compact, factorized representations of $\poly(\vct{X})$ (e.g., as a consequence of the standard projection push-down optimization~\cite{DBLP:books/daglib/0020812}).
Accordingly, this work uses (arithmetic) circuits\footnote{
An arithmetic circuit is a DAG with variable and/or numeric source nodes and internal, each nodes representing either an addition or multiplication operator.
An arithmetic circuit is a DAG with variable/numeric source gates and multiplication/addition internal/sink gates.
}
as the representation system of $\poly(\vct{X})$, and we show in \Cref{sec:circuit-depth} an $\bigO{\qruntime{\optquery{\query}, \tupset, \bound}}$ algorithm for constructing the lineage polynomial for all result tuples of an $\raPlus$ query $\query$ (or more more precisely, a single circuit $\circuit$ with one sink per tuple representing the tuple's lineage).

View File

@ -13,12 +13,10 @@ A circuit $\circuit$ is a Directed Acyclic Graph (DAG) with source gates (in deg
%
Each gate has the following members: \type, \vari{input}, \val, \vpartial, \degval, \vari{Lweight}, and \vari{Rweight}, where \type is the value type $\{\circplus, \circmult, \var, \tnum\}$ and \vari{input} the list of inputs. Source gates have an extra member \val for the value. $\circuit_\linput$ ($\circuit_\rinput$) denotes the left (right) input of \circuit.
\end{Definition}
When the underlying DAG is a tree (with edges pointing towards the root), the structure is an expression tree \etree. In such a case, the root of \etree is analogous to the sink of \circuit. The fields \vari{partial}, \degval, \vari{Lweight}, and \vari{Rweight} are used in the proofs of \Cref{sec:proofs-approx-alg}.
We refer to the structure when the underlying DAG is a tree (with edges pointing towards the root) as an expression tree \etree. %In such a case, the root of \etree is analogous to the sink of \circuit. The fields \vari{partial}, \degval, \vari{Lweight}, and \vari{Rweight} are used in the proofs of \Cref{sec:proofs-approx-alg}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The circuits in \Cref{fig:two-step} encode their respective polynomials in column $\poly$.
Note that the ciricuit \circuit representing $AX$ and the circuit \circuit' representing $B\inparen{Y+Z}$ each encode a tree, with edges pointing towards the root.
The circuits $\inparen{1}$ and $\inparen{2}$ in column $\poly$ of \Cref{fig:two-step} are both expression trees.%encode their respective polynomials in column $\poly$.
%Note that the ciricuit \circuit representing $AX$ and the circuit \circuit' representing $B\inparen{Y+Z}$ each encode a tree, with edges pointing towards the root.
\begin{figure}[t!]

View File

@ -48,6 +48,10 @@
\tabcolsep=0.075cm
%\captionof{table}{Q}
%\setlength{\cellspacetoplimit}{4pt}
%\setlength\extrarowheight{10pt}
% \setlength{\cellspacetoplimit}{10mm}
% \setlength{\cellspacebottomlimit}{10mm}
%\renewcommand{\arraystretch}{1.5}
\begin{tabular}{>{\footnotesize}c | >{\centering\arraybackslash\footnotesize}m{1.95cm} | >{\centering\arraybackslash\footnotesize}m{3.95cm}}
%\multicolumn{3}{c}{$\boldsymbol{\query_2(\pdb)}$}\\[1mm]
%\toprule
@ -55,15 +59,15 @@
\midrule
%\hline
%\\\\[-3.5\medskipamount]
$e_1$ & $AX$ &\resizebox{!}{9mm}{
$e_1$ & $AX$ &\adjustbox{valign=b}{\resizebox{!}{9mm}{
\begin{tikzpicture}[thick]
\node[gen_tree_node](sink) at (0.5, 0.8){$\boldsymbol{\circmult}$};
\node[gen_tree_node](source1) at (0, 0){$A$};
\node[gen_tree_node](source2) at (1, 0){$X$};
\draw[->](source1)--(sink);
\draw[->] (source2)--(sink);
\end{tikzpicture}% & $0.5 \cdot 1.0 + 0.5 \cdot 1.0 = 1.0$
}\\% & $0.9$ \\
\end{tikzpicture}$\inparen{1}$% & $0.5 \cdot 1.0 + 0.5 \cdot 1.0 = 1.0$
}}\\% & $0.9$ \\
$e_2$ & $B(Y + Z)$ Or $BY+ BZ$&
\adjustbox{valign=m}{
\resizebox{!}{14mm} {
@ -80,7 +84,7 @@
\draw[->] (b1) -- (b2);
\draw[->] (a2) -- (a3);
\draw[->] (b2) -- (a3);
\end{tikzpicture}
\end{tikzpicture}$\inparen{2}$
}} \adjustbox{valign=m}{Or}
%%%%%%%%%%%
%Non factorized circuit%