
%root: main.tex
%!TEX root=./main.tex

\section{$1 \pm \epsilon$ Approximation Algorithm}\label{sec:algo}

In~\Cref{sec:hard}, we showed that computing the expected multiplicity of a compressed representation of a bag polynomial for \ti (even just based on project-join queries) is unlikely to be possible in linear time (\Cref{thm:mult-p-hard-result}), even if all tuples have the same probability  (\Cref{th:single-p-hard}).
Given this, we now design an approximation algorithm for our problem that runs in {\em linear time}.
The following approximation algorithm applies to \bi, though our bounds are more meaningful for a non-trivial subclass of \bis that contains both \tis and the PDBench benchmark~\cite{pdbench}.
%it is then desirable to have an algorithm to approximate the multiplicity in linear time, which is what we describe next.

\subsection{Preliminaries and some more notation}

We now introduce useful definitions and notation related to circuits and polynomials.  All proofs and pseudocode can be found in \cref{sec:proofs-approx-alg}.
\begin{Definition}[Variables in a monomial]\label{def:vars}
 Given a monomial $v$, we use $\var(v)$ to denote the set of variables in $v$.
\end{Definition}
\noindent For example, the monomial $XY$ has $\var(XY)=\inset{X,Y}$.


\begin{Definition}[$\expansion{\circuit}$]\label{def:expand-circuit}
The logical view of $\expansion{\circuit}$ is a list of tuples $(\monom, \coef)$, where $\monom$ is a set of variables and $\coef$ is in $\reals$.
$\expansion{\circuit}$ has the following recursive definition ($\circ$ is list concatenation).

$\expansion{\circuit} =
\begin{cases}
					\expansion{\circuit_\linput} \circ \expansion{\circuit_\rinput}		&\textbf{ if }\circuit.\type = \circplus\\
					\left\{(\monom_\linput \cup \monom_\rinput, \coef_\linput \cdot \coef_\rinput) ~|~(\monom_\linput, \coef_\linput) \in \expansion{\circuit_\linput}, (\monom_\rinput, \coef_\rinput) \in \expansion{\circuit_\rinput}\right\} 		&\textbf{ if }\circuit.\type = \circmult\\
					\elist{(\emptyset, \circuit.\val)}								&\textbf{ if }\circuit.\type = \tnum\\
					\elist{(\{\circuit.\val\}, 1)}									&\textbf{ if }\circuit.\type = \var.\\
\end{cases}
$
\end{Definition}
For further explanation, please refer to \cref{example:expr-tree-T}.
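
\noindent As a small illustration (a hypothetical circuit chosen for this remark, not one of the paper's running examples), let $\circuit$ encode $(X + 2\cdot Y)\times(2\cdot X + (-1)\cdot Y)$, where each product with a constant is a $\circmult$ gate over a $\tnum$ leaf and a $\var$ leaf. Applying the recursive definition bottom-up gives, up to the order of the entries,
\[\expansion{\circuit} = \elist{(\inset{X}, 2),\ (\inset{X,Y}, -1),\ (\inset{X,Y}, 4),\ (\inset{Y}, -2)}.\]
Two features are worth noting. Since each $\monom$ is a \emph{set} of variables, the product $X\cdot X$ contributes $\inset{X}$ rather than $X^2$. Moreover, the two entries for $\inset{X,Y}$ are kept separate rather than merged into a single entry with coefficient $3$.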

\begin{Definition}[$\abs{\circuit}(\vct{X})$]\label{def:positive-circuit}
For any circuit $\circuit$, the corresponding
{\em positive circuit}, denoted $\abs{\circuit}$, is obtained from $\circuit$ as follows. For each leaf node $\ell$ of $\circuit$ where $\ell.\type$ is $\tnum$, update $\ell.\vari{value}$ to $|\ell.\vari{value}|$. 
\end{Definition}
Please see \cref{ex:def-pos-circ} for an illustration.
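
\noindent For instance (a hypothetical circuit used only for illustration), if $\circuit$ encodes $X + (-3)\cdot Y$, then $\abs{\circuit}$ encodes $X + 3\cdot Y$. At $X=Y=1$ the polynomial of $\circuit$ evaluates to $1-3=-2$, whereas $\abs{\circuit}(1,1) = 1+3 = 4$, which is the sum of the absolute values of the coefficients in $\expansion{\circuit} = \elist{(\inset{X},1),\ (\inset{Y},-3)}$.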

\begin{Definition}[\size($\cdot$)]
The function \size\ takes a circuit $\circuit$ as input and outputs the number of gates (nodes) in \circuit.
\end{Definition}

\begin{Definition}[\depth($\cdot$)]
The function \depth\ takes a circuit $\circuit$ as input and outputs the number of levels in \circuit.
\end{Definition}

\begin{Definition}[$\degree(\cdot)$]\footnote{Note that the degree of $\polyf(\abs{\circuit})$ is always upper bounded by $\degree(\circuit)$, and the latter can be strictly larger (e.g., when $\circuit$ multiplies two copies of the constant $1$, we have $\degree(\circuit)=1$ but the degree of $\polyf(\abs{\circuit})$ is $0$).}
$\degree(\circuit)$ is defined recursively as follows:
\[\degree(\circuit)=
\begin{cases}
\max(\degree(\circuit_\linput),\degree(\circuit_\rinput)) & \text{ if }\circuit.\type=\circplus\\
\degree(\circuit_\linput) + \degree(\circuit_\rinput)+1 &\text{ if }\circuit.\type=\circmult\\
0 & \text{otherwise}.
\end{cases}
\]
\end{Definition}
Finally, we will need the following notation for the complexity of multiplying large integers:
\begin{Definition}[$\multc{\cdot}{\cdot}$]\footnote{We note that when doing arithmetic operations on the RAM model for input of size $N$, we have that $\multc{O(\log{N})}{O(\log{N})}=O(1)$. More generally we have $\multc{N}{O(\log{N})}=O(N\log{N}\log\log{N})$.}
In a RAM model with a word size of $W$ bits, $\multc{M}{W}$ denotes the complexity of multiplying two integers represented with $M$ bits. (We will assume that for input of size $N$, $W=O(\log{N})$.)
\end{Definition}

\subsection{Our main result}
In the subsequent subsections we will prove the following theorem.

\begin{Theorem}\label{lem:approx-alg}
Let \circuit be a circuit for a UCQ over \bi, define $\poly(\vct{X})=\polyf(\circuit)$, and let $k=\degree(\circuit)$.
Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in time
{\small
\[O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\circuit}^2(1,\ldots, 1)\cdot  k\cdot \log{k} \cdot \depth(\circuit)}{\inparen{\error'}^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)\]
}
such that
\begin{equation}
\label{eq:approx-algo-bound}
\probOf\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error' \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf.
\end{equation}
\end{Theorem}

To get linear runtime results from~\Cref{lem:approx-alg}, we will need to define another parameter modeling the (weighted) number of monomials in $\expansion{\circuit}$ to be `canceled' when it is modded with $\mathcal{B}$ (\Cref{def:mod-set-polys}).
\begin{Definition}[Parameter $\gamma$]\label{def:param-gamma}
Given an expression tree $\circuit$, define
\[\gamma(\circuit)=\frac{\sum_{(\monom, \coef)\in \expansion{\circuit}} \abs{\coef}\cdot \indicator{\monom\mod{\mathcal{B}}\equiv 0}}{\abs{\circuit}(1,\ldots, 1)}\]
\end{Definition}
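
\noindent As an illustration under assumed inputs (both the expansion and the block structure below are hypothetical), suppose $\expansion{\circuit} = \elist{(\inset{X},2),\ (\inset{X,Y},-1),\ (\inset{X,Y},4),\ (\inset{Y},-2)}$, so that $\abs{\circuit}(1,\ldots,1) = 2+1+4+2 = 9$. Suppose further that exactly the two entries with $\monom=\inset{X,Y}$ are canceled by $\mathcal{B}$, for example because $X$ and $Y$ belong to the same block. Then
\[\gamma(\circuit) = \frac{\abs{-1} + \abs{4}}{9} = \frac{5}{9}.\]
If instead no entry were canceled, the numerator would be empty and $\gamma(\circuit)=0$.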

\noindent We next present a few corollaries of~\Cref{lem:approx-alg}.
\begin{Corollary}
\label{cor:approx-algo-const-p}
Let $\poly(\vct{X})$ be as in~\Cref{lem:approx-alg} and let $\gamma=\gamma(\circuit)$. Further, assume that $\prob_i\ge \prob_0$ for all $i\in[\numvar]$. Then an estimate $\mathcal{E}$  of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ satisfying~\Cref{eq:approx-algo-bound} can be computed in time
\[O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot k\cdot \log{k} \cdot \depth(\circuit)}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)\]
In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the above runtime simplifies to $O_k\left(\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)$.
\end{Corollary}
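
\noindent To see how the runtime of \Cref{cor:approx-algo-const-p} relates to that of \Cref{lem:approx-alg}, compare the second summands of the two bounds: they differ only in that the ratio $\abs{\circuit}^2(1,\ldots,1)/\rpoly^2(\prob_1,\ldots,\prob_\numvar)$ is replaced by $1/\inparen{(1-\gamma)^2\cdot\prob_0^{2k}}$. The following is a sketch of that substitution only, not the proof itself. A lower bound of the form
\[\rpoly(\prob_1,\ldots,\prob_\numvar) \ge (1-\gamma)\cdot \prob_0^{k}\cdot \abs{\circuit}(1,\ldots,1)\]
suffices, since it implies
\[\frac{\abs{\circuit}^2(1,\ldots,1)}{\rpoly^2(\prob_1,\ldots,\prob_\numvar)} \le \frac{1}{(1-\gamma)^2\cdot \prob_0^{2k}}.\]
The argument establishing such a bound is deferred to \Cref{sec:proofs-approx-alg}.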

The restriction on $\gamma$ is satisfied by any \ti (where $\gamma=0$) as well as by all three queries of the PDBench \bi benchmark (see \Cref{app:subsec:experiment} for experimental results).

Finally, we address the $\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}$ term in the runtime. %In \cref{susec:proof-val-up}, we show the following:
\begin{Lemma}
\label{lem:val-ub}
For any circuit $\circuit$ with $\degree(\circuit)=k$, we have
\[\abs{\circuit}(1,\ldots, 1)\le 2^{2^k\cdot \size(\circuit)}.\]
Further, under either of the following conditions:
\begin{enumerate}
\item $\circuit$ is a tree,
\item $\circuit$ encodes the run of the algorithm in~\cite{DBLP:conf/pods/KhamisNR16} on an FAQ query,
\end{enumerate}
we have
\[\abs{\circuit}(1,\ldots, 1)\le  \size(\circuit)^{O(k)}.\]
\end{Lemma}

Note that the above implies that, under the assumption from \Cref{cor:approx-algo-const-p} that $\prob_0>0$ and $\gamma<1$ are absolute constants, the runtime there simplifies to $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)^2\cdot \log{\frac{1}{\conf}}\right)$ for general circuits $\circuit$ and to $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$ when $\circuit$ satisfies either of the special conditions in~\Cref{lem:val-ub}. In~\Cref{app:proof-lem-val-ub} we argue that these conditions are very general and encompass many interesting scenarios.

\subsection{Approximating $\rpoly$}
\approxq\ (\cref{alg:mon-sam}) modifies \circuit with a call to \onepass.  It then samples from $\circuit_{\vari{mod}}$ a total of $\numsamp$ times and uses these samples to approximate $\rpoly$.


\subsubsection{Correctness}

In order to prove~\Cref{lem:approx-alg}, we will need to argue the correctness of \approxq, which relies on the correctness of auxiliary algorithms \onepass and \sampmon.

\begin{Lemma}\label{lem:one-pass}
The $\onepass$ function completes in time:
$$O\left(\size(\circuit) \cdot \multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}}\right)$$
  $\onepass$ guarantees two post-conditions:  First, for each subcircuit $\vari{S}$ of $\circuit$, we have that $\vari{S}.\vari{partial}$ is set to $\abs{\vari{S}}(1,\ldots, 1)$.  Second, when $\vari{S}.\type  = \circplus$, \subcircuit.\lwght $= \frac{\abs{\subcircuit_\linput}(1,\ldots, 1)}{\abs{\subcircuit}(1,\ldots, 1)}$ and likewise for \subcircuit.\rwght.
\end{Lemma}
To prove correctness of~\Cref{alg:mon-sam}, we only use the following fact that follows from the above lemma: for the modified circuit ($\circuit_{\vari{mod}}$), $\circuit_{\vari{mod}}.\vari{partial}=\abs{\circuit}(1,\dots,1)$.
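
\noindent For example (a hypothetical subcircuit used only for illustration), let $\subcircuit$ be a $\circplus$ gate encoding $X + 2\cdot Y$. By \Cref{lem:one-pass}, after \onepass\ the value stored in $\vari{partial}$ for this gate is $\abs{\subcircuit}(1,1) = 1 + 2 = 3$, and the two weights are
\[\frac{\abs{\subcircuit_\linput}(1,1)}{\abs{\subcircuit}(1,1)} = \frac{1}{3} \qquad\text{and}\qquad \frac{\abs{\subcircuit_\rinput}(1,1)}{\abs{\subcircuit}(1,1)} = \frac{2}{3}.\]
The two weights sum to $1$, so they can be read as a probability distribution over the two inputs of the gate.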

\begin{Lemma}\label{lem:sample}
The function $\sampmon$ completes in time
$$O(\log{k} \cdot k \cdot \depth(\circuit)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}})$$
 where $k = \degree(\circuit)$.  The function returns each $\left(\monom, \mathrm{sign}(\coef)\right)$ for $(\monom, \coef)\in \expansion{\circuit}$ with probability $\frac{\abs{\coef}}{\abs{\circuit}(1,\ldots, 1)}$.
\end{Lemma}
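
\noindent To illustrate the guarantee of \Cref{lem:sample} on assumed input (the expansion below is hypothetical), suppose $\expansion{\circuit} = \elist{(\inset{X},2),\ (\inset{X,Y},-1),\ (\inset{X,Y},4),\ (\inset{Y},-2)}$, so that $\abs{\circuit}(1,\ldots,1)=2+1+4+2=9$. Then a single call to \sampmon\ returns
\[(\inset{X},+)\text{ w.p. }\tfrac{2}{9},\quad (\inset{X,Y},-)\text{ w.p. }\tfrac{1}{9},\quad (\inset{X,Y},+)\text{ w.p. }\tfrac{4}{9},\quad (\inset{Y},-)\text{ w.p. }\tfrac{2}{9},\]
where each probability refers to the corresponding entry of $\expansion{\circuit}$. The four probabilities sum to $1$.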

With the above two lemmas, we are ready to argue the following result (proof in~\Cref{sec:proofs-approx-alg}):
\begin{Theorem}\label{lem:mon-samp}
For any $\circuit$ with $\degree(\polyf(\abs{\circuit})) = k$, \Cref{alg:mon-sam} outputs an estimate $\vari{acc}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ such that
\[\probOf\left(\left|\vari{acc} - \rpoly(\prob_1,\ldots, \prob_\numvar)\right|> \error \cdot \abs{\circuit}(1,\ldots, 1)\right) \leq \conf,\]
 in $O\left(\left(\size(\circuit)+\frac{\log{\frac{1}{\conf}}}{\error^2} \cdot k \cdot\log{k} \cdot \depth(\circuit)\right)\cdot \multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}}\right)$ time.
\end{Theorem}
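
\noindent The additive guarantee of \Cref{lem:mon-samp} can be converted into the relative guarantee of \Cref{lem:approx-alg} by a suitable choice of $\error$. The following is a sketch of that substitution, assuming $\rpoly(\prob_1,\ldots,\prob_\numvar)>0$. Setting
\[\error = \error'\cdot\frac{\rpoly(\prob_1,\ldots,\prob_\numvar)}{\abs{\circuit}(1,\ldots,1)}\]
turns the error term $\error\cdot\abs{\circuit}(1,\ldots,1)$ into $\error'\cdot\rpoly(\prob_1,\ldots,\prob_\numvar)$, which is the bound in \Cref{eq:approx-algo-bound}. The same choice turns the factor $\frac{1}{\error^2}$ in the runtime into
\[\frac{1}{\error^2} = \frac{\abs{\circuit}^2(1,\ldots,1)}{\inparen{\error'}^2\cdot\rpoly^2(\prob_1,\ldots,\prob_\numvar)},\]
which has the form of the second summand in the runtime of \Cref{lem:approx-alg}.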


\subsection{\onepass\ Algorithm}
\label{sec:onepass}

\noindent \onepass\ (Algorithm~\ref{alg:one-pass-iter} in \Cref{sec:proofs-approx-alg}) visits each gate once, following the topological ordering of \circuit, and annotates the \lwght, \rwght, and \prt variables of each node according to the definitions above.  Lemma~\ref{lem:one-pass} is proved in~\Cref{sec:proofs-approx-alg}.

\subsection{\sampmon\ Algorithm}
\label{sec:samplemonomial}

A naive (slow) implementation of \sampmon\ would first compute $\expansion{\circuit}$ and then sample from it.
Instead, \Cref{alg:sample} selects a monomial from $\expansion{\circuit}$ by a top-down traversal of the input \circuit.  More details on the traversal can be found in \cref{subsec:sampmon-remarks}.

%
%$\sampmon$ is given in \Cref{alg:sample}, and a proof of its correctness (via \Cref{lem:sample}) is provided in \Cref{sec:proofs-approx-alg}.


%%%%%%%%%%%%%%%%%%%%%%%

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: