Update on Overleaf.

2022-02-23 03:49:47 +00:00 · 2022-02-23 03:49:47 +00:00 · 98571dc16a
parent 1e27058b0c
commit 98571dc16a
3 changed files with 27 additions and 13 deletions
--- a/abstract.tex
+++ b/abstract.tex
@@ -2,7 +2,7 @@
 %!TEX root=./main.tex
 \begin{abstract}
 In this work, we study the problem of computing a tuple's expected multiplicity over bag-\abbrTIDB\xplural exactly and approximately.
-  We refer to bag-\abbrTIDB\xplural as \abbrCTIDB\xplural, where $\bound$ is the bound on the maximum multiplicity. We are specifically
+  We consider bag-\abbrTIDB\xplural where we have a bound $\bound$ on the maximum multiplicity (we refer to such bag-\abbrTIDB\xplural as \abbrCTIDB\xplural). We assume that $\bound$ is a constant, since that is what is used in practice. We are specifically
   interested in the fine-grained complexity and how it compares to the complexity of deterministic query evaluation algorithms --- if these complexities are comparable, it opens the door to practical deployment of probabilistic databases.
  Unfortunately, % we show the reverse;
  our results imply that computing expected multiplicities for \abbrCTIDB\xplural based on the results produced by such query evaluation algorithms introduces super-linear overhead (under parameterized complexity hardness assumptions/conjectures).
--- a/approx_alg.tex
+++ b/approx_alg.tex
@@ -132,16 +132,19 @@ O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot k\cdot \log{k} \
 In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the above runtime simplifies to $O_k\left(\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)$.
 \end{Theorem}

-\begin{Lemma}
-Given \emph{\abbrOneBIDB} computed from the reduction of~\Cref{def:ctidb-reduct}, $\gamma\inparen{\circuit}=\inparen{c + 1}^{-k}$.
-\end{Lemma}
-\begin{Corollary}
-Given any \abbrCTIDB circuit \circuit, $\poly\inparen{\vct{X}} = \polyf\inparen{\circuit}$, for $k =\degree\inparen{\circuit}$, $\gamma\inparen{\circuit}$, and $\prob_i\ge\prob_0$ for all $i\in\pbox{\numvar}$.  The results of~\Cref{cor:approx-algo-const-p} follow for estimating $\rpoly\inparen{\prob_1,\ldots, \prob_\numvar}$.
-\end{Corollary}
+
+%\begin{Corollary}
+%Given any \abbrCTIDB circuit \circuit, $\poly\inparen{\vct{X}} = \polyf\inparen{\circuit}$, for $k =\degree\inparen{\circuit}$, $\gamma\inparen{\circuit}$, and $\prob_i\ge\prob_0$ for all $i\in\pbox{\numvar}$.  The results of~\Cref{cor:approx-algo-const-p} follow for estimating $\rpoly\inparen{\prob_1,\ldots, \prob_\numvar}$.
+%\end{Corollary}

 The restriction on $\gamma$ is satisfied by any
 $1$-\abbrTIDB (where $\gamma=0$ in the equivalent $1$-\abbrBIDB of~\Cref{def:ctidb-reduct})
-as well as for all three queries of the PDBench \abbrBIDB benchmark (see \Cref{app:subsec:experiment} for experimental results).
+as well as for all three queries of the PDBench \abbrBIDB benchmark (see \Cref{app:subsec:experiment} for experimental results). Further, we can also argue the following result:
+
+\begin{Lemma}
+\label{lem:c-TIDB-gamma}
+Given \emph{\abbrOneBIDB} computed from the reduction of~\Cref{def:ctidb-reduct}, $\gamma\inparen{\circuit}=\inparen{c + 1}^{-k}$.
+\end{Lemma}

 We briefly connect the runtime in \Cref{eq:approx-algo-runtime} to the algorithm outline earlier (where we ignore the dependence on $\multc{\cdot}{\cdot}$, which is needed to handle the cost of arithmetic operations over integers). The $\size(\circuit)$ comes from the time taken to run \onepass once (\onepass essentially computes $\abs{\circuit}(1,\ldots, 1)$ using the natural circuit evaluation algorithm on $\circuit$). We make $\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}$ many calls to \sampmon (each of which essentially traces $O(k)$ random sink-to-source paths in $\circuit$, all of which by definition have length at most $\depth(\circuit)$).

@@ -162,13 +165,23 @@ Note that the above implies that with the assumption $\prob_0>0$ and $\gamma<1$

 %\AH{Is it standard to assume that in the asymptotic notation above, $\error$ and $\delta$ are constant?  Otherwise this does not uphold~\Cref{prob:intro-stmt}.}

-Finally, note that by \Cref{prop:circuit-depth} and \Cref{lem:circ-model-runtime} for any $\raPlus$ query $\query$, there exists a circuit $\circuit^*$ for $\apolyqdt$ such that $\depth(\circuit^*)\le O_{|Q|}(\log{n})$ and $\size(\circuit)\le O_k\inparen{\qruntime{\query, \dbbase}}$. Using this along with \Cref{lem:val-ub}, \Cref{cor:approx-algo-const-p} and the fact that $n\le \qruntime{\query, \dbbase}$, we answer \Cref{prob:big-o-joint-steps} in the affirmative as follows:
+Finally, note that by \Cref{prop:circuit-depth} and \Cref{lem:circ-model-runtime} for any $\raPlus$ query $\query$, there exists a circuit $\circuit^*$ for $\apolyqdt$ such that $\depth(\circuit^*)\le O_{|Q|}(\log{n})$ and $\size(\circuit^*)\le O_k\inparen{\qruntime{\query, \dbbase}}$. Using this along with \Cref{lem:val-ub}, \Cref{cor:approx-algo-const-p} and the fact that $n\le \qruntime{\query, \dbbase}$, we have the following corollary:
 \begin{Corollary}
 \label{cor:approx-algo-punchline}
-Let $\query$ be an $\raPlus$ query and $\pdb$ be a \emph{\abbrOneBIDB} with $p_0>0$ and $\gamma<1$ (where $p_0,\gamma$ as in \Cref{cor:approx-algo-const-p}) are absolute constants. Let $\poly(\vct{X})=\apolyqdt$ for any result tuple $\tup$ with $\deg(\poly)=k$. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time $O_{k,|Q|,\error',\conf}\inparen{\qruntime{\query, \tupset, \bound}}$ (given $\query,\tupset$ and $p_i$ for each $i\in [n]$ that defines $\pd$).
+Let $\query$ be an $\raPlus$ query and $\pdb$ be a \emph{\abbrOneBIDB} where $p_0>0$ and $\gamma<1$ (with $p_0,\gamma$ as in \Cref{cor:approx-algo-const-p}) are absolute constants. Let $\poly(\vct{X})=\apolyqdt$ for any result tuple $\tup$ with $\deg(\poly)=k$. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time $O_{k,|Q|,\error',\conf}\inparen{\qruntime{\optquery{\query}, \tupset, \bound}}$ (given $\query,\tupset$ and $p_i$ for each $i\in [n]$ that defines $\pd$).
 %Let $\poly(\vct{X})$ be a \abbrBIDB-lineage polynomial correspoding to an \abbrBIDB circuit $\circuit$ that satisfies the specific conditions in \Cref{lem:val-ub}. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time
 % $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$. % for the case when $\circuit$ satisfies the specific conditions in \Cref{lem:val-ub}.
 \end{Corollary}
+
+Next, we note that the above result along with \Cref{lem:c-TIDB-gamma}
+answers \Cref{prob:big-o-joint-steps} in the affirmative as follows:
+\begin{Corollary}
+\label{cor:approx-algo-punchline-ctidb}
+Let $\query$ be an $\raPlus$ query and $\pdb$ be a \abbrCTIDB where $p_0>0$ (with $p_0$ as in \Cref{cor:approx-algo-const-p}) is an absolute constant. Let $\poly(\vct{X})=\apolyqdt$ for any result tuple $\tup$ with $\deg(\poly)=k$. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time $O_{k,|Q|,\error',\conf,\bound}\inparen{\qruntime{\optquery{\query}, \tupset, \bound}}$ (given $\query,\tupset$ and $p_i$ for each $i\in [n]$ that defines $\pd$).
+%Let $\poly(\vct{X})$ be a \abbrBIDB-lineage polynomial corresponding to an \abbrBIDB circuit $\circuit$ that satisfies the specific conditions in \Cref{lem:val-ub}. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time
+% $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$. % for the case when $\circuit$ satisfies the specific conditions in \Cref{lem:val-ub}.
+\end{Corollary}
+
 %\AH{What is $\abs{\query}$?  Isn't that just $k$?}
 If we want to approximate the expected multiplicities of all $Z=O(n^k)$ result tuples $\tup$ simultaneously, we just need to apply the above result with $\conf$ replaced by $\frac \conf Z$. Note this increases the runtime by only a logarithmic factor.
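
To spell out the logarithmic factor (a sketch, assuming the runtime depends on $\conf$ only through the $\log{\frac{1}{\conf}}$ term, as in the simplified runtime stated earlier): if each of the $Z$ estimates fails with probability at most $\frac{\conf}{Z}$, then by the union bound all $Z$ estimates succeed simultaneously with probability at least $1-\conf$. Since $Z=O(n^k)$, the confidence term becomes
\[
\log{\frac{Z}{\conf}} = \log{Z} + \log{\frac{1}{\conf}} = O\inparen{k\log{n}} + \log{\frac{1}{\conf}},
\]
so only the confidence term changes, and it grows by an additive $O(k\log{n})$.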

--- a/ra-to-poly.tex
+++ b/ra-to-poly.tex
@@ -99,7 +99,7 @@ We now define the reduced polynomial $\rpoly'$ of a \abbrOneBIDB.
 		&&&\poly'\pbox{\rel,\tupset', \tup_j} = j\cdot X_{\tup, j}.
 	\end{align*}\\[-10mm]
 \end{minipage}}
-	\caption{Construction of the lineage (polynomial) for an $\raPlus$ query $\query$ over $\gentupset$.} 
+	\caption{Construction of the lineage (polynomial) for an $\raPlus$ query $\query$ over $\tupset'$.} 
 	\label{fig:lin-poly-bidb-redux}
 \end{figure}
 \begin{Definition}[$\rpoly'$]\label{def:reduced-poly-redux}
@@ -107,8 +107,9 @@ Given a polynomial $\poly'\inparen{\vct{X}}$ generated from a \abbrOneBIDB produ
 \end{Definition}
 Then given the disjoint requirement and the semantics for constructing the lineage polynomial over a \abbrOneBIDB, $\poly'\pbox{\rel,\tupset',\tup}$ is of the same structure as the reformulated polynomial $\refpoly{}$ of step i) from~\Cref{def:reduced-poly}, which then implies that $\rpoly'$ is the reduced polynomial that results from step ii) of~\Cref{def:reduced-poly}, and further that~\Cref{lem:tidb-reduce-poly} immediately follows for \abbrOneBIDB polynomials.
 \begin{Lemma}
-Given any \abbrCTIDB $\pdb$, its reduced counterpart \emph{\abbrOneBIDB} $\pdb'$, $\raPlus$ query $\query$, and lineage polynomial
- $\poly'\inparen{\vct{X}}=\poly'\pbox{\query,\tupset,\tup}\inparen{\vct{X}}$, it holds that $
+Given any %\abbrCTIDB $\pdb$, its reduced counterpart 
+\emph{\abbrOneBIDB} $\pdb'$, $\raPlus$ query $\query$, and lineage polynomial
+ $\poly'\inparen{\vct{X}}=\poly'\pbox{\query,\tupset',\tup}\inparen{\vct{X}}$, it holds that $
 	\expct_{\vct{W} \sim \pdassign'}\pbox{\poly'\inparen{\vct{W}}} = \rpoly'\inparen{\probAllTup}.
 $%, where $\probAllTup = \inparen{\inparen{\prob_{\tup, j}}_{\tup\in\tupset, j\in\pbox{c}}}.$%,\ldots,\prob_{\abs{\tupset}, \bound}}$ is defined by $\bpd$.
 %$\expct_{\rvworld\sim\bpd'}\pbox{\poly'\inparen{\rvworld}} = \rpoly'\inparen{\vct{\prob}}$.