Done with first part of S4 appendix

2021-04-06 23:43:47 -04:00 · 2021-04-06 23:43:47 -04:00 · edde2da120
parent cc109567cc
commit edde2da120
1 changed files with 7 additions and 3 deletions
--- a/app_approx-alg-analysis.tex
+++ b/app_approx-alg-analysis.tex
@ -66,14 +66,14 @@ Applying this bound in the runtime bound in~\Cref{lem:approx-alg} gives the firs
 %\paragraph{Sufficient condition for $\abs{\circuit}(1,\ldots, 1)$ to be size $O(N)$}
 %For our runtime results to be relevant, it must be the case that the sum of the coefficients computed by \onepass is indeed size $O(N)$ since there are $O(\log{N})$ bits in the RAM model where $N$ is the size of the input.  The size of the input here is \size(\circuit).  We show that when \size$(\circuit_\linput) = N_\linput$, \size$(\circuit_\rinput) = N_\rinput$, where $N_\linput + N_\rinput \leq N$, this is indeed the case.

-We will prove~\Cref{lem:val-ub} by considering the three cases separetly. We first being with the case when $\circuit$ is a tree:
+We will prove~\Cref{lem:val-ub} by considering the three cases separately. We first being with the case when $\circuit$ is a tree:
 \begin{Lemma}
 \label{lem:C-ub-tree}
 Let $\circuit$ be a tree (i.e. the sub-circuits corresponding to two children of a node in $\circuit$ are completely disjoint). Then we have
 \[\abs{\circuit}(1,\dots,1)\le \left(\size(\circuit)\right)^{\degree(\circuit)+1}.\]
 \end{Lemma}
 \begin{proof}%[Proof of $\abs{\circuit}(1,\ldots, 1)$ is size $O(N)$]
-For notational simplcity define $N=\size(\circuit)$ and $k=\degree(\circuit)$.
+For notational simplicity define $N=\size(\circuit)$ and $k=\degree(\circuit)$.
 To prove this result, we by prove by induction on $\depth(\circuit)$ that $\abs{\circuit}(1,\ldots, 1) \leq N^{k+1 }$.
 For the base case, we have that \depth(\circuit) $= 0$, and there can only be one node which must contain a coefficient (or constant) of $1$.  In this case, $\abs{\circuit}(1,\ldots, 1) = 1$, and \size(\circuit) $= 1$, and it is true that $\abs{\circuit}(1,\ldots, 1) = 1 \leq N^{k+1} = 1^{1} = 1$.

@ -99,7 +99,7 @@ N_\linput^{k+1} + N_\rinput^{k+1}\nonumber\\
 &\leq (N-1)^{k+1 } \label{eq:sumcoeff-plus-upper}\\
 &\leq N^{k+1}.\nonumber
 \end{align}
-In the above, the first inequality follows from the inductive hypothesis (and the fact that $k_\linput,k_\rinput\le k$)  while the second inequality follows from the fact that since $\circuit$ is a tree we have $N_\linput+N_\rinput=N-1$ and the fact that $k\ge 0$. This compeletes the proof.
+In the above, the first inequality follows from the inductive hypothesis (and the fact that $k_\linput,k_\rinput\le k$)  while the second inequality follows from the fact that since $\circuit$ is a tree we have $N_\linput+N_\rinput=N-1$ and the fact that $k\ge 0$. This completes the proof.
 %Similar to the $\circmult$ case, \cref{eq:sumcoeff-plus-upper} upperbounds its LHS by the fact that the maximum base and exponent combination is always greater than or equal to the sum of lower base/exponent combinations.  The final equality is true given the constraint over the inputs.

 %Since $\abs{\circuit}(1,\ldots, 1) \leq N^{2^k}$ for all circuits such that all $\circplus$ gates share at most one gate with their sibling (across their respective subcircuits), then $\log{N^{2^k}} = 2^k \cdot \log{N}$ which for fixed $k$ yields the desired $O(\log{N})$ bits for $O(1)$ arithmetic operations.% for the given query class.
@ -135,3 +135,7 @@ Now consider the case when the sink node is $+$, we get that
 \end{align*}
 In the above the first inequality follows from the inductive hypothesis while the second inequality follows from the facts that $k_\linput,k_\rinput\le k$ and $N_\linput,N_\rinput\le N-1$. The final inequality follows from the fact that $k\ge 0$.
 \end{proof}
+
+Finally, we consider the case when $\circuit$ encodes the run of the algorithm from~\cite{DBLP:conf/pods/KhamisNR16} on an FAQ query. We cannot handle the full generality of an FAQ query but we can handle an FAQ query that has a ``core'' join query on $k$ relations and then a subset of the $k$ attributes are ``summed'' out (e.g. the sum could be because of projecting out a subset of attributes from the join query). While the algorithm~\cite{DBLP:conf/pods/KhamisNR16} essentially figures out when to `push in' the sums, in our case since we only care about $\abs{\circuit}(1,\dots,1)$ we will consider the obvious circuit that computes the ``inner join" using a worst-case optimal join (WCOJ) algorithm like~\cite{NPRR} and then adding in the addition gates. The basic idea is very simple: we will argue that the there are at most $\size(\circuit)^k$ tuples in the join output (each with having a value of $1$ in $\abs{\circuit}(1,\dots,1)$). Then the largest value we can see in $\abs{\circuit}(1,\dots,1)$ is by summing up these at most $\size(\circuit)^k$ values of $1$. Note that this immediately implies the claimed bound in~\Cref{lem:val-ub}.
+
+We now sketch the argument for the claim about the join query above. First, we note that the computation of a WCOJ algorithm like~\cite{NPRR} can be expressed as a circuit with {\em multiple} sinks (one for each output tuple). Note that annotation corresponding to $\mathbf{t}$ in $\circuit$ is the polynomial $\prod_{e\in E} R(\pi_e(\mathbf{t}))$ (where $E$ indexes the set of relations). It is easy to see that in this case the value of  $\mathbf{t}$ in $\abs{\circuit}(1,\dots,1)$ will be $1$ (by multiplying $1$ $k$ times). The claim on the number of output tuples follow from the trivial bound of multiplying the input size bound (each relation has at most $n\le \size(\circuit)$ tuples and hence we get an overall bound of $n^k\le\size(\circuit)^k$). Note that we did not really use anything about the WCOJ algorithm except for the fact that $\circuit$ for the join part only is built only of multiplication gates. In fact, we do not need the better WCOJ join size bounds either (since we used the trivial $n^k$ bound). As a final remark, we note that we can build the circuit for the join part by running say the algorithm from~\cite{DBLP:conf/pods/KhamisNR16} on an FAQ query that just has the join query but each tuple is annotated with the corresponding variable $X_i$ (i.e. the semi-ring for the FAQ query is $\mathbb{N}[\mathbf{X}]$).