Small fixes to Appendix C

master
Aaron Huber 2021-04-09 11:48:10 -04:00
parent 9f660007ec
commit 9dade793f7
6 changed files with 25 additions and 20 deletions


@@ -159,21 +159,21 @@ level 2/.style={sibling distance=0.7cm},
\subsection{Proof of~\Cref{lem:one-pass}}\label{sec:proof-one-pass}
\paragraph{\onepass Correctness}
-We prove the correct computation of \prt, \lwght, \rwght values on \circuit by induction over the number of iterations in line ~\ref{alg:one-pass-loop} over the topological order \topord of the input circuit \circuit. Note that \topord is the standard definition of a topological ordering over the DAG structure of \circuit.
+We prove the correct computation of \prt, \lwght, \rwght values on \circuit by induction over the number of iterations in line~\ref{alg:one-pass-loop} over the topological order \topord of the input circuit \circuit. Note that \topord is the standard definition of a topological ordering over the DAG structure of \circuit.
-For the base case, we have only one gate, which by definition is a source gate and must be either \var or \tnum. In this case, as per \Cref{eq:T-all-ones}, lines ~\ref{alg:one-pass-var} and ~\ref{alg:one-pass-num} correctly compute \circuit.\prt as $1$ and \circuit.\val respectively.
+For the base case, we have only one gate, which by definition is a source gate and must be either \var or \tnum. In this case, as per \Cref{eq:T-all-ones}, lines~\ref{alg:one-pass-var} and~\ref{alg:one-pass-num} correctly compute \circuit.\prt as $1$ and \circuit.\val respectively.
-For the inductive hypothesis, assume that \onepass correctly computes \subcircuit.\prt, \subcircuit.\lwght, and \subcircuit.\rwght for all gates \gate in \circuit with $k > 0$ iterations over \topord.
+For the inductive hypothesis, assume that \onepass correctly computes \subcircuit.\prt, \subcircuit.\lwght, and \subcircuit.\rwght for all gates \gate in \circuit with $k \geq 0$ iterations over \topord.
-We now prove for $k + 1$ iterations that \onepass correctly computes the \prt, \lwght, and \rwght values for each gate $\gate_\vari{i}$ in \circuit. %By the hypothesis the first $k$ gates (alternatively \textit{iterations}) have correctly computed values.
-Note that the $\gate_\vari{k + 1}$ must be in the last ordering of all gates $\gate_\vari{i}$ for $i \in [k + 1]$. It is also the case that $\gate_{k+1}$ has two inputs. Finally, note that for \size(\circuit) > 1, if $\gate_{k+1}$ is a leaf node, we are back to the base case. Otherwise $\gate_{k + 1}$ is an internal node $\gate_\vari{s}.\type = \circplus$ or $\gate_\vari{s}.\type = \circmult$.
+We now prove for $k + 1$ iterations that \onepass correctly computes the \prt, \lwght, and \rwght values for each gate $\gate_\vari{i}$ in \circuit for $i \in [k + 1]$.
+Note that $\gate_\vari{k + 1}$ must be the last gate in the ordering of all gates $\gate_\vari{i}$. It is also the case that $\gate_{k+1}$ has two inputs. Finally, note that for $\size(\circuit) > 1$, if $\gate_{k+1}$ is a leaf node, we are back to the base case. Otherwise $\gate_{k + 1}$ is an internal gate with $\gate_{k+1}.\type = \circplus$ or $\gate_{k+1}.\type = \circmult$.
-When $\gate_{k+1}.\type = \circplus$, then by line ~\ref{alg:one-pass-plus} $\gate_{k+1}$.\prt $= \gate_{{k+1}_\lchild}$.\prt $+ \gate_{{k+1}_\rchild}$.\prt, a correct computation, as per \Cref{eq:T-all-ones}. Further, lines ~\ref{alg:one-pass-lwght} and ~\ref{alg:one-pass-rwght} compute $\gate_{{k+1}}.\lwght = \frac{\gate_{{k+1}_\lchild}.\prt}{\gate_{{k+1}}.\prt}$ and analogously for $\gate_{{k+1}}.\rwght$. Note that all values needed for each computation have been correctly computed by the I.H.
+When $\gate_{k+1}.\type = \circplus$, then by line~\ref{alg:one-pass-plus} $\gate_{k+1}$.\prt $= \gate_{{k+1}_\lchild}$.\prt $+ \gate_{{k+1}_\rchild}$.\prt, a correct computation, as per \Cref{eq:T-all-ones}. Further, lines~\ref{alg:one-pass-lwght} and~\ref{alg:one-pass-rwght} compute $\gate_{{k+1}}.\lwght = \frac{\gate_{{k+1}_\lchild}.\prt}{\gate_{{k+1}}.\prt}$ and analogously for $\gate_{{k+1}}.\rwght$. Note that all values needed for each computation have been correctly computed by the inductive hypothesis.
When $\gate_{k+1}.\type = \circmult$, then line~\ref{alg:one-pass-mult} computes $\gate_{k+1}.\prt = \gate_{{k+1}_\lchild}.\prt \circmult \gate_{{k+1}_\rchild}.\prt$, which indeed is correct, as per \Cref{eq:T-all-ones}.
\paragraph{\onepass Runtime}
-It is known that $\topord(G)$ is computable in linear time. Next, each of the $\size(\circuit)$ iterations of the loop in ~\Cref{alg:one-pass-loop} take $O\left( \multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}\right)$ time. It is easy to see that all numbers that the algorithm computes is at most $\abs{\circuit}(1,\dots,1)$. Hence, by definition each such operation takes $\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}$ time, which proves the claimed runtime.
+It is known that $\topord(G)$ is computable in linear time. Next, each of the $\size(\circuit)$ iterations of the loop in \cref{alg:one-pass-loop} takes $O\left( \multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}\right)$ time. It is easy to see that each number the algorithm computes is at most $\abs{\circuit}(1,\dots,1)$. Hence, by definition each such operation takes $\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}$ time, which proves the claimed runtime.
%In general it is known that an arithmetic computation which requires $M$ bits takes $O(\frac{\log{M}}{\log{N}})$ time for an input size $N$. Since each of the arithmetic operations at a given gate has a bit size of $O(\log{\abs{\circuit}(1,\ldots, 1)})$, thus, we obtain the general runtime of $O\left(\size(\circuit)\cdot \frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\log{\size(\circuit)}}\right)$.
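For concreteness, the following is a minimal Python sketch of the bottom-up pass argued correct above. It is illustrative only, not the paper's pseudocode: the gate objects, their field names, and the convention that constant gates contribute the absolute value of their coefficient are assumptions made for the sketch.

# Illustrative sketch only (not the paper's pseudocode). Gates are assumed to be
# objects with .type in {'var', 'num', '+', '*'}, .val for leaves, and .left/.right
# for internal gates; the root is assumed to come last in the topological order.
def one_pass(topological_order):
    """Annotate each gate g with g.prt; '+' gates also get g.lwght and g.rwght."""
    for g in topological_order:
        if g.type == 'var':
            g.prt = 1                            # a variable leaf contributes 1
        elif g.type == 'num':
            g.prt = abs(g.val)                   # constants contribute |coefficient|
        elif g.type == '+':
            g.prt = g.left.prt + g.right.prt
            g.lwght = g.left.prt / g.prt         # probability of descending left
            g.rwght = g.right.prt / g.prt        # probability of descending right
        else:                                    # multiplication gate
            g.prt = g.left.prt * g.right.prt
    return topological_order[-1].prt             # |C|(1, ..., 1) at the root

Every number the sketch manipulates is bounded by $\abs{\circuit}(1,\ldots,1)$, matching the cost accounting above.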


@@ -5,7 +5,7 @@
\input{app_sample-monomial-pseudo-code}
We briefly describe the top-down traversal of \sampmon. For a parent $+$ gate, the input to be visited is sampled from the weighted distribution precomputed by \onepass.
When a parent $\times$ node is visited, both inputs are visited.
-The algorithm computes two properties: the set of all variable leaf nodes visited, and the product of signs of visited coefficient leaf nodes.
+The algorithm computes two properties: the set of all variable leaf nodes visited, and the product of the signs of visited coefficient leaf nodes.
%
We will assume a TreeSet data structure for maintaining sets, with logarithmic-time insertion and linear-time traversal of its elements.
While we would like to take advantage of the space efficiency gained in using a circuit \circuit instead of an expression tree \etree, we do not know that such a method exists when computing a sample of the input polynomial representation.
@@ -13,23 +13,25 @@ While we would like to take advantage of the space efficiency gained in using a
The efficiency gains of circuits over trees are found in the capability of circuits to only require space for each \emph{distinct} term in the compressed representation. This saves space for polynomials containing non-distinct terms multiplied or added to each other, e.g., $x^4$. However, to avoid biased sampling, it is imperative to sample from both inputs of a multiplication gate, independently, which is indeed the approach of \sampmon.
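To make the traversal concrete, here is a hedged Python sketch of the sampling routine as just described. It is illustrative only: it reuses the hypothetical gate objects and the \lwght annotation from the earlier \onepass sketch, and uses Python's built-in set where the text assumes a TreeSet.

import random

# Illustrative sketch only (not the paper's pseudocode). Assumes one_pass has
# already annotated '+' gates with .lwght, and the same gate fields as before.
def sample_monomial(g):
    """Return (vars, sgn): the variable leaves visited and the product of the
    signs of the constant leaves visited."""
    if g.type == 'var':
        return {g.val}, 1
    if g.type == 'num':
        return set(), 1 if g.val >= 0 else -1
    if g.type == '+':
        # visit exactly one input, chosen from the weighted distribution
        child = g.left if random.random() < g.lwght else g.right
        return sample_monomial(child)
    # multiplication gate: visit both inputs independently and combine
    lvars, lsgn = sample_monomial(g.left)
    rvars, rsgn = sample_monomial(g.right)
    return lvars | rvars, lsgn * rsgn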
\subsection{Proof of~\Cref{lem:sample}}\label{sec:proof-sample-monom}
\begin{proof}
%\paragraph*{\sampmon returns a valid monomial}
We first need to show that $\sampmon$ indeed returns a monomial $\monom$,\footnote{Technically it returns $\var(\monom)$ but for less cumbersome notation we will refer to $\var(\monom)$ simply by $\monom$ in this proof.} such that $(\monom, \coef)$ is in $\expansion{\circuit}$, which we do by induction on the depth of $\circuit$.
-For the base case, let the depth $d$ of $\circuit$ be $0$. We have that the root node is either a constant $\coef$ for which by line ~\ref{alg:sample-num-return} we return $\{~\}$, or we have that $\circuit.\type = \var$ and $\circuit.\val = x$, and by line ~\ref{alg:sample-var-return} we return $\{x\}$. Both cases sample a monomial%satisfy ~\cref{def:monomial}
+For the base case, let the depth $d$ of $\circuit$ be $0$. We have that the root node is either a constant $\coef$ for which by line~\ref{alg:sample-num-return} we return $\{~\}$, or we have that $\circuit.\type = \var$ and $\circuit.\val = x$, and by line~\ref{alg:sample-var-return} we return $\{x\}$. Both cases sample a monomial%satisfy ~\cref{def:monomial}
, and the base case is proven.
For the inductive hypothesis, assume that for $d \leq k$ for some $k \geq 0$, it is indeed the case that $\sampmon$ returns a monomial.
-For the inductive step, let us take a circuit $\circuit$ with $d = k + 1$. Note that each input has depth $d \leq k$, and by inductive hypothesis both of them return a valid monomial. Then the root can be either a $\circplus$ or $\circmult$ node. For the case of a $\circplus$ root node, line ~\ref{alg:sample-plus-bsamp} of $\sampmon$ will choose one of the inputs of the root. By inductive hypothesis it is the case that a monomial in \expansion(\circuit) is being returned from either input. Then it follows that for the case of $+$ root node a valid monomial is returned by $\sampmon$. When the root is a $\circmult$ node, line ~\ref{alg:sample-times-union} %and ~\ref{alg:sample-times-product} multiply
+For the inductive step, let us take a circuit $\circuit$ with $d = k + 1$. Note that each input has depth $d \leq k$, and by the inductive hypothesis both of them return a valid monomial. Then the root can be either a $\circplus$ or $\circmult$ node. For the case of a $\circplus$ root node, line~\ref{alg:sample-plus-bsamp} of $\sampmon$ will choose one of the inputs of the root. By the inductive hypothesis it is the case that a monomial in \expansion{\circuit} is being returned from either input. Then it follows that for the case of a $+$ root node a valid monomial is returned by $\sampmon$. When the root is a $\circmult$ node, line~\ref{alg:sample-times-union} %and ~\ref{alg:sample-times-product} multiply
computes the set union of the monomials returned by the two inputs of the root, and it is trivial to see
%by definition ~\ref{def:monomial}
%the product of two monomials is also a monomial, and
-by ~\Cref{def:expand-circuit} that \monom is a valid monomial in some $(\monom, \coef) \in \expansion{\circuit}$.
+by~\cref{def:expand-circuit} that \monom is a valid monomial in some $(\monom, \coef) \in \expansion{\circuit}$.
We will next prove by induction on the depth $d$ of $\circuit$ that the $(\monom,\coef) \in \expansion{\circuit}$ is the \monom returned by $\sampmon$ with a probability %`that is in accordance with the monomial sampled,
$\frac{|\coef|}{\abs{\circuit}\polyinput{1}{1}}$.
-For the base case $d = 0$, by definition ~\ref{def:express-tree} we know that the root has to be either a coefficient or a variable. For either case, the probability of the value returned is $1$ since there is only one value to sample from. When the root is a variable $x$ the algorithm correctly returns $(\{x\}, 1 )$. When the root is a coefficient, \sampmon ~correctly returns $(\{~\}, sign(\coef_i))$.
+For the base case $d = 0$, by definition~\ref{def:circuit} we know that the root has to be either a coefficient or a variable. In either case, the probability of the value returned is $1$ since there is only one value to sample from. When the root is a variable $x$ the algorithm correctly returns $(\{x\}, 1)$. When the root is a coefficient, \sampmon~correctly returns $(\{~\}, \sign(\coef_i))$.
For the inductive hypothesis, assume that for $d \leq k$ and $k \geq 0$, $\sampmon$ indeed samples $\monom$ in $(\monom, \coef)$ in $\expansion{\circuit}$ with probability $\frac{|\coef|}{\abs{\circuit}\polyinput{1}{1}}$.%bove is true.%lemma ~\ref{lem:sample} is true.
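As a sanity check of the claimed probability, the $\circplus$ case of the inductive step telescopes as sketched below, under the simplifying assumption that $(\monom, \coef)$ arises only through the left input; the $\circmult$ case instead multiplies the two independent input probabilities.

% Hedged sketch, assuming (\monom,\coef) is produced only via \circuit_\linput:
\begin{align*}
\Pr\left[\sampmon(\circuit) \text{ returns } \monom\right]
  &= \circuit.\lwght \cdot \frac{\abs{\coef}}{\abs{\circuit_\linput}\polyinput{1}{1}}
   = \frac{\abs{\circuit_\linput}\polyinput{1}{1}}{\abs{\circuit}\polyinput{1}{1}}
     \cdot \frac{\abs{\coef}}{\abs{\circuit_\linput}\polyinput{1}{1}}
   = \frac{\abs{\coef}}{\abs{\circuit}\polyinput{1}{1}}.
\end{align*}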
@@ -50,8 +52,8 @@ and we obtain the desired result.
\paragraph{Run-time Analysis}
-It is easy to check that except for lines~\ref{alg:sample-times-union} and~\ref{alg:sample-plus-bsamp}, all lines take $O(1)$ time. For \Cref{alg:sample-times-uinon}, consider an execution of~\Cref{alg:sample-times-union}. We note that we will be adding a given set of variables to some set at most once: since the sum of the sizes of the sets at a given level is at most $\degree(\circuit)$, each gate visited takes $O(\log{\degree(\circuit)})$. For \Cref{alg:sample-plus-bsamp}, note that we pick $\circuit_\linput$ with probability $\frac a{a+b}$ where $a=\circuit.\vari{Lweight}$ and $b=\circuit.\vari{Rweight}$. We can implement this step by picking a random number $r\in[a+b]$ and then checking if $r\le a$. It is easy to check that $a+b\le \abs{\circuit}(1,\dots,1)$. This means we need to add and compare $\log{\abs{\circuit}(1,\ldots, 1)}$-bit numbers, which can certainly be done in time $\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}$ (note that this is an over-estimate).
%\paragraph*{Run-time Analysis}
+It is easy to check that except for lines~\ref{alg:sample-times-union} and~\ref{alg:sample-plus-bsamp}, all lines take $O(1)$ time. For \Cref{alg:sample-times-union}, consider an execution of~\Cref{alg:sample-times-union}. We note that we will be adding a given set of variables to some set at most once: since the sum of the sizes of the sets at a given level is at most $\degree(\circuit)$, each gate visited takes $O(\log{\degree(\circuit)})$ time. For \Cref{alg:sample-plus-bsamp}, note that we pick $\circuit_\linput$ with probability $\frac a{a+b}$ where $a=\circuit.\vari{Lweight}$ and $b=\circuit.\vari{Rweight}$. We can implement this step by picking a random number $r\in[a+b]$ and then checking if $r\le a$. It is easy to check that $a+b\le \abs{\circuit}(1,\dots,1)$. This means we need to add and compare $\log{\abs{\circuit}(1,\ldots, 1)}$-bit numbers, which can certainly be done in time $\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}$ (note that this is an over-estimate).
% we have $> O(1)$ time when $\abs{\circuit}(1,\ldots, 1) > \size(\circuit)$. when this is the case that for each sample, we have $\frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\log{\size(\circuit)}}$ operations, since we need to read in and then compare numbers of of $\log{{\abs{\circuit}(1,\ldots, 1)}}$ bits.
Let \cost(\circuit) (\Cref{eq:cost-sampmon}) denote an upper bound on the number of nodes visited by \sampmon. Then the runtime is $O\left(\cost(\circuit)\cdot \log{\degree(\circuit)}\cdot \multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}\right)$.
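The branching step just described can be carried out on integers only; a tiny illustrative helper (the name is hypothetical) would be:

import random

def choose_left(a, b):
    """Return True with probability a / (a + b), comparing only
    O(log |C|(1,...,1))-bit integers, as described above."""
    r = random.randint(1, a + b)   # uniform integer in [a + b]
    return r <= a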
@@ -139,4 +141,7 @@ As in the $\circmult$ case the \emph{reduced} invariant of \reduce implies that
Similar to the case of $\circuit.\type = \circmult$, (\ref{eq:plus-rhs}) follows by equations $(\ref{eq:cost-sampmon})$ and $(\ref{eq:ih-bound-cost})$.
-This proves (\ref{eq:strict-upper-bound}) for the $\circplus$ case, as desired. % and thus the claimed $O(k\log{k}\cdot \frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\size(\circuit)}\cdot\depth(\circuit))$ runtime for $k = \degree(\circuit)$ follows.
+This proves (\ref{eq:strict-upper-bound}) for the $\circplus$ case, as desired.
+\end{proof}
+% and thus the claimed $O(k\log{k}\cdot \frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\size(\circuit)}\cdot\depth(\circuit))$ runtime for $k = \degree(\circuit)$ follows.


@@ -3,7 +3,7 @@
\caption{\sampmon(\circuit)}
\label{alg:sample}
\begin{algorithmic}[1]
-\revision{\Require \circuit: Circuit}
+\Require \circuit: Circuit
\Ensure \vari{vars}: TreeSet
\Ensure \vari{sgn} $\in \{-1, 1\}$
\Comment{\Cref{alg:one-pass-iter} should have been run before this one} % algorithm ~\ref{alg:sample}}


@@ -114,7 +114,7 @@ The algorithm (\approxq detailed in \Cref{alg:mon-sam}) to prove~\Cref{lem:appro
\label{eq:tilde-Q-bi}
\rpoly\inparen{X_1,\dots,X_\numvar}=\hspace*{-1mm}\sum_{(\monom,\coef)\in \expansion{\circuit}} \hspace*{-2mm} \indicator{\monom\mod{\mathcal{B}}\not\equiv 0}\cdot \coef\cdot\hspace*{-2mm}\prod_{X_i\in \var\inparen{\monom}}\hspace*{-2mm} X_i
\end{equation}
-Given the above, the algorithm is a sampling based algorithm for the above sum: we sample (via \sampmon) $(\monom,\coef)\in \expansion{\circuit}$ with probability proportional%\footnote{We could have also uniformly sampled from $\expansion{\circuit}$ but this gives better parameters.}
+Given the above, the algorithm is a sampling based algorithm for the above sum: we sample (via \sampmon) $(\monom,\coef)\in \expansion{\circuit}$ with probability proportional %\footnote{We could have also uniformly sampled from $\expansion{\circuit}$ but this gives better parameters.}
to $\abs{\coef}$ and compute $Y=\indicator{\monom\mod{\mathcal{B}}\not\equiv 0}\cdot \prod_{X_i\in \var\inparen{\monom}} p_i$. Taking $\numsamp$ samples and computing the average of $Y$ gives us our final estimate. \onepass is used to compute the sampling probabilities needed in \sampmon (details are in~\Cref{sec:proofs-approx-alg}).
%\approxq (\cref{alg:mon-sam}) modifies \circuit with a call to \onepass. It then samples from $\circuit_{\vari{mod}}\numsamp$ times and uses that information to approximate $\rpoly$.
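A hedged end-to-end sketch of this estimator follows, reusing the earlier hypothetical one_pass and sample_monomial sketches; the predicate survives_B (the test $\monom \bmod \mathcal{B} \not\equiv 0$), the probability map p, and the final rescaling by $\abs{\circuit}(1,\ldots,1)$ needed for an unbiased estimate are stated assumptions, not the paper's exact \approxq.

# Illustrative sketch only. Assumes one_pass and sample_monomial from the earlier
# sketches, a predicate survives_B(vars) for the indicator [monom mod B != 0],
# and a dict p mapping each variable to its probability p_i.
def approximate_rpoly(topological_order, p, survives_B, num_samples):
    total_abs = one_pass(topological_order)      # |C|(1, ..., 1)
    root = topological_order[-1]                 # root gate assumed last in TopOrd
    acc = 0.0
    for _ in range(num_samples):
        variables, sgn = sample_monomial(root)
        if survives_B(variables):
            y = sgn
            for x in variables:
                y *= p[x]
            acc += y                             # monomials killed by B add 0
    return (acc / num_samples) * total_abs       # rescale for unbiasedness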


@@ -76,7 +76,7 @@ This follows from~\Cref{lem:circuits-model-runtime} (\cref{sec:circuit-runtime})
%
We make a simple observation to conclude the presentation of our results.
So far we have only focused on the expectation of $\poly$. In addition, we could e.g. prove bounds on the probability of the multiplicity being at least $1$. Progress can be made on this as follows:
-For any positive integer $m$ we can compute the $m$-th moment of the multiplicities, allowing us to e.g. to use Chebyschev inequality or other high moment based probability bounds on the events we might be interested in.
+For any positive integer $m$ we can compute the $m$-th moment of the multiplicities, allowing us to e.g. use Chebyshev's inequality or other higher-moment-based probability bounds on the events we might be interested in.
We leave a further investigation of this question for future work.
%%% Local Variables:


@@ -88,7 +88,7 @@
\newcommand{\lchild}{\vari{L}}
\newcommand{\rchild}{\vari{R}}
%members of T
-\newcommand{\val}{\vari{val}\xspace}
+\newcommand{\val}{\vari{val}}
\newcommand{\wght}{\vari{weight}\xspace}
\newcommand{\vpartial}{\vari{partial}\xspace}
%types of T
@@ -117,7 +117,7 @@
\newcommand{\degree}{\func{deg}}
\newcommand{\size}{\func{size}}
\newcommand{\depth}{\func{depth}}
-\newcommand{\topord}{\func{TopOrd}}
+\newcommand{\topord}{\func{TopOrd}\xspace}
%saving \treesize for now to keep latex from breaking
\newcommand{\treesize}{\func{size}}
\newcommand{\sign}{\func{sgn}}