
%root: main.tex
\section{$1 \pm \epsilon$ Approximation Algorithm}
%\AH{I am attempting to rewrite this section mostly from scratch.  This will involve taking 'baby' steps towards the goals we spoke of on Friday 080720 as well as throughout the following week on chat channel.}
%
%\AH{\textbf{BEGIN}: Old stuff.}
%
%
%\begin{proof}
%
%Let us now show a sampling scheme which can run in $O\left(|\poly|\cdot k\right)$ per sample.  
%
%First, consider when $\poly$ is already an SOP of pure products.  In this case, sampling is trivial, and one would sample from the $\setsize$ terms with probability proportional to the product of probabilitites for each variable in the sampled monomial.
%
%Second, consider when $\poly$ has a POS form with a product width of $k$.  In this case, we can view $\poly$ as an expression tree, where the leaves represent the individual values of each factor.  The leaves are joined together by either a $\times$ or $+$ internal node, and so on, until we reach the root, which is joining the $k$-$\times$ nodes.  
%
%Then for each $\times$ node, we multiply its subtree values, while for each $+$ node, we pick one of its children with probability proportional to the product of probabilities across its variables.
%
%\AH{I think I mean to say a probability proportional to the number of elements in it's given subtree.}
%
%The above sampling scheme is in $O\left(|\poly|\cdot k\right)$ time then, since we have for either case, that at most the scheme would perform within a factor of the $|\poly|$ operations, and those operations are repeated the product width of $k$ times.
%
%Thus, it is the case, that we can approximate $\rpoly(\prob_1,\ldots, \prob_n)$ within the claimed confidence bounds and computation time, thus proving the lemma.\AH{State why.}
%
%\AH{Discuss how we have that $\rpoly \geq O(\setsize)$.  Discuss that we need $b-a$ to be small.}
%\end{proof}
%
%\qed
%\AH{{\bf END:} Old Stuff}


%\begin{Definition}[Polynomial]\label{def:polynomial}
%The expression $\poly(\vct{X})$ is a polynomial if it satisfies the standard mathematical definition of polynomial, and additionally is in the standard monomial basis.
%\end{Definition}  

%To clarify defintion ~\ref{def:polynomial}, a polynomial in the standard monomial basis is one whose monomials are in SOP form, and whose non-distinct monomials have been collapsed into one distinct monomial, with its corresponding coefficient accurately reflecting the number of monomials combined.

We begin with some useful definitions and notation.  For illustrative purposes in the definitions below, let us consider when $\poly(\vct{X}) = 2x^2 + 3xy - 2y^2$.

\begin{Definition}[Degree]\label{def:degree}
The degree of polynomial $\poly(\vct{X})$ is the maximum sum of the exponents of a monomial, over all monomials.
\end{Definition}

The degree of $\poly(\vct{X})$ in the above example is $2$.  In this note we consider only finite degree polynomials.

\AH{We need to verify that this definition is consistent with the rest of the paper.  Also, it might be useful to specify coefficients are 1?}
\begin{Definition}[Monomial]\label{def:monomial}
A monomial is a product of a fixed set of variables, each raised to a non-negative integer power.
\end{Definition}

For example, the expression $xy$ is a monomial from the term $3xy$ of $\poly(\vct{X})$, produced from the set of variables $\vct{X} = \{x, y\}$.

%\begin{Definition}[$|\vct{X}|$]\label{def:num-vars}
%Denote the number of variables in $\poly(\vct{X})$ as $|\vct{X}|$.
%\end{Definition}
%
%In the running example, $|\vct{X}| = 2$.

\begin{Definition}[Expression Tree]\label{def:express-tree}
An expression tree $\etree$ is a binary %an ADT logically viewed as an n-ary
tree, whose internal nodes are from the set $\{+, \times\}$, with leaf nodes being either from the set $\mathbb{R}$ $(\tnum)$ or from the set of monomials $(\var)$.  The members of $\etree$ are \type, \val, \vari{partial}, \vari{children}, and \vari{weight}, where \type is the type of value stored in the root node of $\etree$, \val is the value stored, and \vari{children} is the list of children of the root of $\etree$.  The remaining fields hold values whose semantics we fix later.  When $\etree$ is used as input to~\cref{alg:mon-sam} or~\cref{alg:one-pass}, the values of \vari{partial} and \vari{weight} are not yet set. %SEMANTICS FOR \etree:   \vari{partial} is the sum of $\etree$'s coefficients , n, and \vari{weight} is the probability of $\etree$ being sampled.
\end{Definition}

Note that $\etree$ encodes an expression generally \textit{not} in the standard monomial basis, for example, when $\etree$ represents the expression $(x + 2y)(2x - y)$.

\begin{Definition}[poly$(\cdot)$]\label{def:poly-func}
Let $poly(\etree)$ denote the function that takes as input an expression tree $\etree$ and outputs its corresponding polynomial.  It is defined recursively on $\etree$ as follows, where $\etree_\lchild$ and $\etree_\rchild$ denote the left and right child of $\etree$ respectively.

	\begin{align*}
		&\etree.\type = +\mapsto&& \polyf(\etree_\lchild) + \polyf(\etree_\rchild)\\
		&\etree.\type = \times\mapsto&& \polyf(\etree_\lchild) \cdot \polyf(\etree_\rchild)\\
		&\etree.\type =  \var \text{ OR } \tnum\mapsto&& \etree.\val
	\end{align*}
\end{Definition}

\AH{1) Is $OR$ the best way to express the third case? 
\par2) Below seems like over-defining to me.  Is this really necessary?  The first sentence I think is \textit{enough}.}
Note that addition and multiplication follow the normal interpretation over polynomials.  Specifically, when adding two monomials whose variables and respective exponents agree, the coefficients corresponding to the monomials are added and their sum is multiplied with the monomial.  Multiplication here is denoted by concatenation of the monomial and coefficient.  When two monomials are multiplied, the product of their coefficients is computed, and the variables in each monomial are multiplied, i.e., the exponents of like variables are added.  Again we denote this by the direct product of the coefficient product and all distinct variables in the two monomials, with newly computed exponents.

\begin{Definition}[Expression Tree Set]\label{def:express-tree-set}$\etreeset{\smb}$ is the set of all possible expression trees $\etree$, such that $poly(\etree) = \poly(\vct{X})$.  
\end{Definition}

For our running example, $\etreeset{\smb}$ contains, among others, the expression trees encoding $2x^2 + 3xy - 2y^2$ and $(x + 2y)(2x - y)$.  Note that \cref{def:express-tree-set} implies that $\etree \in \etreeset{poly(\etree)}$.

\AH{Just wondering if there is a more simple way to describe ~\cref{def:expand-tree}.  \par Also not sure about the notation \vari{List}.}
\begin{Definition}[Expanded T]\label{def:expand-tree}
$\expandtree{\etree}$ is the pure SOP expansion of $\etree$.  The logical view of \expandtree{\etree} ~is a list of tuples $(\monom, \coef)$, where $\monom$ is of type monomial and $\coef$ is in $\mathbb{R}$, recursively defined as
	\begin{align*}
		&\etree.\type =  +  \mapsto&& \elist{\expandtree{\etree_\lchild}, \expandtree{\etree_\rchild}}\\
		&\etree.\type =  \times \mapsto&& \elist{\expandtree{\etree_\lchild} \otimes \expandtree{\etree_\rchild}}\\
		&\etree.\type = \tnum \mapsto&& \elist{(\emptyset, \etree.\val)}\\
		&\etree.\type = \var \mapsto&& \elist{(\etree.\val, 1)}
	\end{align*}
such that the multiplication of two tuples %is the standard multiplication over monomials and the standard multiplication over coefficients to produce the product tuple, as in 
is their direct product $(\monom_1, \coef_1) \cdot (\monom_2, \coef_2) = (\monom_1 \times \monom_2, \coef_1 \times \coef_2)$ such that monomials $\monom_1$ and $\monom_2$ are concatenated in a product operation, while the standard product operation over reals applies to $\coef_1 \times \coef_2$.  The operator $\otimes$ is defined as the cross-product tuple multiplication of all such tuples returned by both $\expandtree{\etree_\lchild}$ and $\expandtree{\etree_\rchild}$.
\end{Definition}

\begin{Example}\label{example:expr-tree-T}
To illustrate \cref{def:expand-tree} with an example, consider the product $(x + 2y)(2x - y)$ and its expression tree $\etree$ in Figure~\ref{fig:expr-tree-T}.  The pure expansion of the product is $2x^2 - xy + 4xy - 2y^2 = \expandtree{\etree}$, logically viewed as $[(x^2, 2), (xy, -1), (xy, 4), (y^2, -2)]$.  (For preciseness, note that $\etree$ would use a $+$ node to model the second factor ($\etree_\rchild$), while storing a child coefficient of $-1$ for the variable $y$.  The subtree $\etree_\rchild$ would be $+(\times(2, x), \times(-1, y))$, and one can see that $\etree_\rchild$ is indeed equivalent to $(2x - y)$.)
\end{Example}


\begin{figure}[h!]

\begin{tikzpicture}[thick, level distance=0.9cm,level 1/.style={sibling distance=3.55cm}, level 2/.style={sibling distance=1.8cm}, level 3/.style={sibling distance=0.8cm}]% level/.style={sibling distance=6cm/(#1 * 1.5)}]
	\node[tree_node]{$\boldsymbol{\times}$}
		child{node[tree_node]{$\boldsymbol{+}$}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child[missing]{node[tree_node]{}}
				child{node[tree_node]{x}}
				}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child{node[tree_node]{2}}
				child{node[tree_node]{y}}
				}
			}
		child{node[highlight_treenode] (TR) {$\boldsymbol{+}$}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child{node[tree_node]{2}}
				child{node[tree_node]{x}}
				}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child{node[tree_node] (neg-leaf) {-1}}
				child{node[tree_node]{y}}
				}
			%child[sibling distance= 0cm, grow=north east, red]{node[tree_node]{$\etree_\rchild$}}
			};
%		\node[below=2pt  of neg-leaf, inner sep=1pt, blue] (neg-comment) {\textbf{Negation pushed to leaf nodes}};
%		\draw[<-|, blue] (neg-leaf) -- (neg-comment);
		\node[above right=0.7cm of TR, highlight_color, inner sep=0pt, font=\bfseries] (tr-comment) {$\etree_\rchild$};
		\draw[<-|, highlight_color] (TR) -- (tr-comment);
\end{tikzpicture}

\caption{Expression tree $\etree$ for the product $\boldsymbol{(x + 2y)(2x - y)}$.}
\label{fig:expr-tree-T}
\end{figure}


\begin{Definition}[Positive T]\label{def:positive-tree}
Let $\abs{\etree}$ denote the resulting expression tree such that, for each leaf node $\etree'$ of $\etree$ where $\etree'.\type$ is $\tnum$, $\etree'.\vari{value} = |\etree'.\vari{value}|$. %value $\coef$ of each coefficient leaf node in $\etree$ is set to %$\coef_i$ in $\etree$ is exchanged with its absolute value$|\coef|$.
\end{Definition}

Using the same polynomial from the above example, $poly(\abs{\etree}) = (x + 2y)(2x + y) = 2x^2 +xy +4xy + 2y^2 = 2x^2 + 5xy + 2y^2$.  Note that this \textit{is not} the same as $\poly(\vct{X})$.

\begin{Definition}[Evaluation]\label{def:exp-poly-eval}
Given an expression tree $\etree$ and $\vct{v} \in \mathbb{R}^\numvar$, $\etree(\vct{v}) = poly(\etree)(\vct{v})$.
\end{Definition}
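
As a quick sanity check of \cref{def:positive-tree} and \cref{def:exp-poly-eval} (an illustration we add here, using only the running example), let $\etree$ encode $(x + 2y)(2x - y)$.  Evaluating at the all-ones vector gives
\begin{align*}
\etree(1, 1) &= (1 + 2)(2 - 1) = 3 = 2 + 3 - 2,\\
\abs{\etree}(1, 1) &= (1 + 2)(2 + 1) = 9 = 2 + 1 + 4 + 2,
\end{align*}
so $\abs{\etree}(1, 1)$ equals the sum of the absolute values of the coefficients in the pure expansion $2x^2 - xy + 4xy - 2y^2$, while $\etree(1, 1)$ is the sum of the signed coefficients.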

In the subsequent subsections we lay the groundwork to prove the following theorem.

\begin{Theorem}\label{lem:approx-alg}
For any query polynomial $\poly(\vct{X})$, an approximation of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in $O\left(\treesize(\etree) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)}{\error^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right)$ time, with $(\error,\conf)$-bounds, where $k$ denotes the degree of $\poly$.
\end{Theorem}

\subsection{Approximating $\rpoly$}

\subsubsection{Description}
Algorithm~\ref{alg:mon-sam} approximates $\rpoly$ using the following steps.  First, a call to $\onepass$ on its input $\etree$ produces an unbiased weight distribution over the monomials of $\expandtree{\etree}$ and a correct count of $\abs{\etree}(1,\ldots, 1)$, i.e., the sum of the absolute values of the coefficients in $\expandtree{\etree}$.  Next, \cref{alg:mon-sam} calls $\sampmon$ to sample one monomial and its sign from $\expandtree{\etree}$.  The sampling is repeated $\ceil{\frac{2\log{\frac{2}{\conf}}}{\error^2}}$ times, where each sample is evaluated over $\vct{p}$, multiplied by its sign, and added to a running sum.  The final result is scaled accordingly, returning an estimate of $\rpoly$ with the claimed $(\error, \conf)$-bound of~\cref{lem:mon-samp}.

Recall that the notation $[x, y]$ denotes the range of values between $x$ and $y$ inclusive.  The notation $\{x, y\}$ denotes the set of values consisting of $x$ and $y$.
\subsubsection{Pseudocode}

\begin{algorithm}[H]
	\caption{$\approxq$($\etree$, $\vct{p}$, $\conf$, $\error$)}
	\label{alg:mon-sam}
	\begin{algorithmic}[1]
		\Require \etree: Binary Expression Tree
		\Require $\vct{p} = (\prob_1,\ldots, \prob_\numvar)$ $\in [0, 1]^\numvar$
		\Require $\conf$ $\in [0, 1]$
		\Require $\error$ $\in [0, 1]$
		\Ensure \vari{acc} $\in \mathbb{R}$
		\State $\accum \gets 0$\label{alg:mon-sam-global1}
		\State $\numsamp \gets \ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$\label{alg:mon-sam-global2}
		\State $(\vari{\etree}_\vari{mod}, \vari{size}) \gets $ \onepass($\etree$)\label{alg:mon-sam-onepass}\Comment{$\onepass$ is ~\cref{alg:one-pass} \;and \sampmon \; is ~\cref{alg:sample}}
		\For{\vari{i} \text{ in } $1\text{ to }\numsamp$}\Comment{Perform the required number of samples}
			\State $(\vari{M}_\vari{i}, \vari{sgn}_\vari{i}) \gets  $ \sampmon($\etree_\vari{mod}$)
				\State $\vari{Y}_\vari{i} \gets 1$\label{alg:mon-sam-assign1}
				\For{$\vari{x}_{\vari{j}}$ \text{ in } $\vari{M}_{\vari{i}}$}
					\State $\vari{Y}_\vari{i} \gets \vari{Y}_\vari{i} \times \; \vari{\prob}_\vari{j}$\label{alg:mon-sam-product2} \Comment{$\vari{p}_\vari{j}$ is the assignment to $\vari{x}_\vari{j}$ from input $\vct{p}$}
				\EndFor
				\State $\vari{Y}_\vari{i} \gets \vari{Y}_\vari{i} \times\; \vari{sgn}_\vari{i}$\label{alg:mon-sam-product}
			\State $\accum \gets \accum + \vari{Y}_\vari{i}$\Comment{Store the sum over all samples}\label{alg:mon-sam-add}
		\EndFor

		\State  $\vari{acc} \gets \vari{acc} \times \frac{\vari{size}}{\numsamp}$\label{alg:mon-sam-global3}
		\State \Return \vari{acc}
	\end{algorithmic}
\end{algorithm}

\subsubsection{Correctness}

\begin{Theorem}\label{lem:mon-samp}
For any $\etree$ with $\degree(poly(|\etree|)) = k$, algorithm \ref{alg:mon-sam} outputs an estimate of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ within an additive $\error \cdot \abs{\etree}(1,\ldots, 1)$ error with probability at least $1 - \conf$, in $O\left(\treesize(\etree) + \left(\frac{\log{\frac{1}{\conf}}}{\error^2} \cdot k \cdot\log{k} \cdot depth(\etree)\right)\right)$ time.
\end{Theorem}

At the conclusion of $\onepass$, $\etree.\vari{partial}$ will hold the sum of all coefficients in $\expandtree{\abs{\etree}}$, i.e., $\sum\limits_{(\monom, \coef) \in \expandtree{\abs{\etree}}}\coef$.  $\etree.\vari{weight}$ will hold the probability with which $\etree$ is sampled from its parent $+$ node.

We state the lemmas for $\onepass$ and $\sampmon$, the auxiliary algorithms on which~\cref{alg:mon-sam} relies.  Their proofs follow later.

\begin{Lemma}\label{lem:one-pass}
There exists an algorithm $\onepass$ which correctly computes $\abs{\vari{S}}(1,\ldots, 1)$ for each subtree $\vari{S}$ of $\etree$.  For every subtree $\vari{S}$ with $\vari{S}.\type  = +$, it also correctly computes the weight $\frac{\abs{\vari{S}_\lchild}(1,\ldots, 1)}{\abs{\vari{S}}(1,\ldots, 1)}$ of the left child $\vari{S}_\lchild$, and likewise for the right child.  All computations are performed in one traversal in $O(\treesize(\etree))$ time.
\end{Lemma}

\begin{Lemma}\label{lem:sample}
Let $k = \degree(poly(\abs{\etree}))$.  There exists an algorithm $\sampmon(\etree)$ that, for every $(\monom,\coef)$ in $\expandtree{\etree}$, returns $\left(\monom, sign(\coef)\right)$ with probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$ in $O(\log{k} \cdot k \cdot depth(\etree))$ time.
\end{Lemma}

\begin{proof}[Proof of Theorem \ref{lem:mon-samp}]

Consider $\expandtree{\etree}$ and let $(\monom, \coef)$ be an arbitrary tuple in $\expandtree{\etree}$.  For convenience, over alphabet $\Sigma$, define $\evalmp: \left(\{\nu^a~|~\nu \in \Sigma, a \in \mathbb{N}\}, [0, 1]^\numvar\right)\mapsto \mathbb{R}$, a function that takes a monomial $\monom$ and probability vector $\vct{p}$ as input and outputs the evaluation of $\monom$ over $\vct{p}$.  By~\cref{lem:sample}, the sampling scheme samples $(\monom, \coef)$ in $\expandtree{\etree}$ with probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$.   Now consider $\rpoly$ and note that
 $\coef_i \cdot \evalmp(\monom_i, \vct{p})$
is the value of the $i^{th}$ monomial term in $\rpoly(\prob_1,\ldots, \prob_\numvar)$, where $(\monom_i, \coef_i)$ denotes the $i^{th}$ tuple of $\expandtree{\etree}$.

Consider now a set of $\samplesize$ random variables $\vct{\randvar}$, where each $\randvar_i$ takes the value $sign(\coef) \cdot \evalmp(\monom, \vct{p})$ for the $i^{th}$ sampled tuple $(\monom, \coef)$, drawn as described above.  Then for random variable $\randvar_i$, it is the case that
\[\expct\pbox{\randvar_i} = \sum\limits_{(\monom, \coef) \in \expandtree{\etree}}\frac{\coef \cdot \evalmp(\monom, \vct{p})}{\sum\limits_{(\monom', \coef') \in \expandtree{\etree}}|\coef'|} = \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}.\]
Let $\empmean = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i$.  It is also true that

\[\expct\pbox{\empmean} = \expct\pbox{ \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i} = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\expct\pbox{\randvar_i} = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\sum\limits_{(\monom, \coef) \in \expandtree{\etree}}\frac{\coef \cdot \evalmp(\monom, \vct{p})}{\sum\limits_{(\monom, \coef) \in \expandtree{\etree}}|\coef|} = \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}.\]
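
To make the expectation concrete, the following worked instance (an illustration, not part of the original argument) uses the tree $\etree$ for $(x + 2y)(2x - y)$ from \cref{example:expr-tree-T}, for which $\abs{\etree}(1, 1) = (1 + 2)(2 + 1) = 9$.  The function $\evalmp$ is left abstract, since only the sampling probabilities and signs matter here.
\begin{align*}
\expct\pbox{\randvar_i} &= \tfrac{2}{9}\evalmp(x^2, \vct{p}) - \tfrac{1}{9}\evalmp(xy, \vct{p}) + \tfrac{4}{9}\evalmp(xy, \vct{p}) - \tfrac{2}{9}\evalmp(y^2, \vct{p})\\
&= \tfrac{1}{9}\left(2\evalmp(x^2, \vct{p}) + 3\evalmp(xy, \vct{p}) - 2\evalmp(y^2, \vct{p})\right).
\end{align*}
Multiplying by $\abs{\etree}(1, 1) = 9$ recovers the coefficients $2$, $3$, and $-2$ of the running example, which is the scaling applied at the end of \cref{alg:mon-sam}.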

Hoeffding's inequality can be used to compute an upper bound on the number of samples $\samplesize$ needed to establish the $(\error, \conf)$-bound.  The inequality states that if the $\randvar_i$ are independent and each $\randvar_i$ is bounded by the interval $[a_i, b_i]$, then it is true that
\begin{equation*}
P\left(\left|\empmean - \expct\pbox{\empmean}\right| \geq \error\right) \leq 2\exp{\left(-\frac{2\samplesize^2\error^2}{\sum_{i = 1}^{\samplesize}(b_i -a_i)^2}\right)}.
\end{equation*}

Note that Hoeffding's inequality is stated for the empirical mean, i.e., the sum of the random variables divided by their number.  To properly estimate $\rpoly$, it is further necessary to multiply by $\abs{\etree}(1,\ldots, 1)$, the sum of the absolute values of the coefficients in $\expandtree{\etree}$.  Therefore $\frac{\vari{acc}}{\samplesize}$ estimates $\frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}$, and multiplying by $\abs{\etree}(1,\ldots, 1)$ yields the estimate of $\rpoly(\prob_1,\ldots, \prob_\numvar)$.  This scaling is performed in line~\ref{alg:mon-sam-global3}.

Since, as shown on lines~\ref{alg:mon-sam-product2} and~\ref{alg:mon-sam-product}, each $\randvar_i$ has a coefficient in $\{-1, 1\}$ that is multiplied with at most $\degree(\polyf(\abs{\etree}))$ factors from $\vct{p}$, each of which lies in $[0, 1]$, each $\randvar_i$ is bounded by $[-1, 1]$, i.e., $b_i - a_i = 2$.  Upper bounding the right-hand side of Hoeffding's inequality by $\conf$ ensures confidence no less than $1 - \conf$, that is,
\begin{equation*}
P\pbox{~\left| \empmean - \expct\pbox{\empmean} ~\right| \geq \error} \leq 2\exp{\left(-\frac{2\samplesize^2\error^2}{2^2 \samplesize}\right)} \leq \conf.
\end{equation*}

Solving for the number of samples $\samplesize$ we get
\begin{align}
&\conf \geq  2\exp{-\left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)}\label{eq:hoeff-1}\\
%&\frac{\conf}{2} \geq \exp{-\left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)}\label{eq:hoeff-2}\\
%&\frac{2}{\conf} \leq \exp{\left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)}\label{eq:hoeff-3}\\
%&\log{\frac{2}{\conf}} \leq \left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)\label{eq:hoeff-4}\\
%&\log{\frac{2}{\conf}} \leq \frac{\samplesize\error^2}{2}\label{eq:hoeff-5}\\
&\frac{2\log{\frac{2}{\conf}}}{\error^2} \leq \samplesize.\label{eq:hoeff-6}
\end{align}
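
For a sense of scale, consider the hypothetical parameters $\error = 0.1$ and $\conf = 0.05$ (values chosen only for illustration), and assume that $\log$ denotes the natural logarithm.  Then
\[\frac{2\log{\frac{2}{\conf}}}{\error^2} = \frac{2\log{40}}{0.01} = 200\log{40} \approx 737.8,\]
so $\samplesize = 738$ samples suffice, independently of the size of $\etree$.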

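By Hoeffding's inequality, this number of samples guarantees $\left|\empmean - \expct\pbox{\empmean}\right| < \error$ with probability at least $1 - \conf$.  Multiplying both $\empmean$ and $\expct\pbox{\empmean}$ by $\abs{\etree}(1,\ldots, 1)$ then gives the claimed additive error of $\error \cdot \abs{\etree}(1,\ldots, 1)$.

This concludes the proof of the first claim of theorem~\ref{lem:mon-samp}.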

\paragraph{Run-time Analysis}
Note that lines~\ref{alg:mon-sam-global1}, \ref{alg:mon-sam-global2}, and~\ref{alg:mon-sam-global3} are $O(1)$ global operations.  The call to $\onepass$ in line~\ref{alg:mon-sam-onepass} takes $O(\treesize(\etree))$ time by lemma~\ref{lem:one-pass}.
Then, in each of the $\numsamp = \ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$ iterations, the $O(1)$ assignment, product, and addition operations occur.  In each iteration, $\sampmon$ is also called, with a runtime of $O(\log{k}\cdot k \cdot depth(\etree))$ by lemma~\ref{lem:sample}.  Finally, in each iteration, because $\degree(\polyf(\abs{\etree})) = k$, the assignment and product operations of line~\ref{alg:mon-sam-product2} are executed at most $k$ times.

Thus we have $O(\treesize(\etree)) + O\left(\frac{\log{\frac{1}{\conf}}}{\error^2} \cdot \left(k + \log{k}\cdot k \cdot depth(\etree)\right)\right) = O\left(\treesize(\etree) + \left(\frac{\log{\frac{1}{\conf}}}{\error^2} \cdot \left(k \cdot\log{k} \cdot depth(\etree)\right)\right)\right)$ overall running time.
\end{proof}

\qed

\begin{proof}[Proof of Theorem \ref{lem:approx-alg}]
%\begin{Corollary}\label{cor:adj-err}
Running~\cref{alg:mon-sam} with error parameter $\error' = \error \cdot \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}$ in place of $\error$ achieves $1 \pm \epsilon$ multiplicative error bounds, in $O\left(\treesize(\etree) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)}{\error^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right)$ time.
%\end{Corollary}

Since~\cref{lem:mon-samp} guarantees an additive error of $\error' \cdot \abs{\etree}(1,\ldots, 1)$ when run with error parameter $\error'$, choosing $\error' = \error \cdot \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}$ yields an additive error of $\error \cdot \rpoly(\prob_1,\ldots, \prob_\numvar)$, i.e., a multiplicative error of $1 \pm \error$.  This only affects the runtime through the number of samples taken, changing the first factor of the second summand of the original runtime accordingly.

The derivation over the number of samples is then 
\begin{align*}
&\frac{2\log{\frac{2}{\conf}}}{\error^2 \left(\frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}\right)^2}\\
= &\frac{2\log{\frac{2}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)}{\error^2 \cdot \rpoly^2(\prob_1,\ldots, \prob_\numvar)},
\end{align*}
and the runtime then follows, which proves~\cref{lem:approx-alg}.
\end{proof}

\qed
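
As a hedged illustration of the cost of the multiplicative guarantee, suppose (hypothetically) that $\frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)} = \frac{1}{10}$, with $\error = 0.1$, $\conf = 0.05$, and $\log$ taken to be the natural logarithm.  The sample bound derived above then evaluates to
\[\frac{2\log{\frac{2}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)}{\error^2 \cdot \rpoly^2(\prob_1,\ldots, \prob_\numvar)} = \frac{2\log{40}\cdot 10^2}{0.01} = 20000\log{40} \approx 73777.6,\]
i.e., $73778$ samples.  This is a factor of $10^2$ more than the $200\log{40} \approx 737.8$ samples needed for the additive bound with the same $\error$ and $\conf$, which shows how the cost grows as $\rpoly$ becomes small relative to $\abs{\etree}(1,\ldots, 1)$.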


\subsection{OnePass Algorithm}
\subsubsection{Description}
Algorithm ~\ref{alg:one-pass} satisfies the requirements of lemma ~\ref{lem:one-pass}.

$\abs{\etree}(1,\ldots, 1)$ can be defined recursively, as follows:
\begin{align*}
	&\etree.\type = \times \mapsto&& \abs{\etree_\lchild}(1,\ldots, 1) \times \abs{\etree_\rchild}(1,\ldots, 1)\\
	&\etree.\type = + \mapsto&& \abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)\\
	&\etree.\type = \tnum \mapsto&& |\etree.\val|\\
	&\etree.\type = \var \mapsto&& 1.
\end{align*} 

In the same fashion the weighted distribution can be described with the additional action at a $+$ node: 
\begin{align*}
	&\etree.\type = + \mapsto&& \etree_\lchild.\vari{weight} = \frac{\abs{\etree_\lchild}(1,\ldots, 1)}{\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)},\\
	&&& \etree_\rchild.\vari{weight} = \frac{\abs{\etree_\rchild}(1,\ldots, 1)}{\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)}.
\end{align*}

Algorithm ~\ref{alg:one-pass} essentially implements the above definitions.
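
The recursion can be traced by hand on the tree for $(x + 2y)(2x - y)$ from \cref{fig:expr-tree-T}.  The following is an illustrative trace we add here; it treats the leaf $x$ of the first factor as having coefficient $1$.
\begin{align*}
\text{left } + \text{ node:}\quad & 1 + 2 = 3, && \text{weights } \tfrac{1}{3} \text{ (for } x\text{) and } \tfrac{2}{3} \text{ (for } 2y\text{)},\\
\text{right } + \text{ node:}\quad & 2 + |-1| = 3, && \text{weights } \tfrac{2}{3} \text{ (for } 2x\text{) and } \tfrac{1}{3} \text{ (for } -y\text{)},\\
\text{root } \times \text{ node:}\quad & 3 \times 3 = 9.
\end{align*}
If one child of each $+$ node is chosen independently according to these weights (our reading of how the weights are meant to be used; the sampling procedure itself is specified separately), then the monomial $xy$ arising from $2y \cdot 2x$ is reached with probability $\frac{2}{3} \cdot \frac{2}{3} = \frac{4}{9}$, matching $\frac{|4|}{9}$.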


\subsubsection{Pseudocode}
See algorithm ~\ref{alg:one-pass} for details.
\begin{algorithm}[h!]
	\caption{\onepass$(\etree)$}
	\label{alg:one-pass}
\begin{algorithmic}[1]
	\Require \etree: Binary Expression Tree
	\Ensure \etree: Binary Expression Tree
	\Ensure \vari{sum} $\in \mathbb{R}$
	\If{$\etree.\type = +$}\label{alg:one-pass-equality1}
		\State $\accum \gets 0$\label{alg:one-pass-plus-assign1}
		\For{$child$ in $\etree.\vari{children}$}\Comment{Sum up all children coefficients}
			\State $(child, \vari{s}) \gets \onepass(child)$
			\State $\accum \gets \accum + \vari{s}$\label{alg:one-pass-plus-add}
		\EndFor
		\State $\etree.\vari{partial} \gets \accum$\label{alg:one-pass-plus-assign2}
		\For{$child$ in $\etree.\vari{children}$}\Comment{Record distributions for each child}
			\State $child.\vari{weight} \gets \frac{\vari{child.partial}}{\etree.\vari{partial}}$\label{alg:one-pass-plus-prob}
		\EndFor
		\State $\vari{sum} \gets \etree.\vari{partial}$\label{alg:one-pass-plus-assign3}
		\State \Return (\etree, \vari{sum})
	\ElsIf{$\etree.\type = \times$}\label{alg:one-pass-equality2}
		\State $\accum \gets 1$\label{alg:one-pass-times-assign1}
		\For{$child \text{ in } \etree.\vari{children}$}\Comment{Compute the product of all children coefficients}
			\State $(child, \vari{s}) \gets \onepass(child)$
			\State $\accum \gets \accum \times \vari{s}$\label{alg:one-pass-times-product}
		\EndFor
		\State $\etree.\vari{partial}\gets \accum$\label{alg:one-pass-times-assign2}
		\State $\vari{sum} \gets \etree.\vari{partial}$\label{alg:one-pass-times-assign3}
		\State \Return (\etree, \vari{sum})
	\ElsIf{$\etree.\type = \tnum$}\Comment{Base case}\label{alg:one-pass-equality3}
		\State $\etree.\vari{partial} \gets |\etree.\val|$\label{alg:one-pass-leaf-assign1}\Comment{This step effectively converts $\etree$ into $\abs{\etree}$}
		\State $\vari{sum} \gets \etree.\vari{partial}$
		\State \Return (\etree, \vari{sum})
	\Else\Comment{$\etree.\type = \var$}\label{alg:one-pass-equality4}
		\State $\etree.\vari{partial} \gets 1$\label{alg:one-pass-global-assign}\Comment{Leaves also record \vari{partial}, which line~\ref{alg:one-pass-plus-prob} reads}
		\State $\vari{sum} \gets \etree.\vari{partial}$
		\State \Return (\etree, \vari{sum})
	\EndIf
\end{algorithmic}
\end{algorithm}

\begin{Example}\label{example:one-pass}
Consider the case when $\etree$ is $+\left(\times\left(+\left(\times\left(1, x_1\right), \times\left(1, x_2\right)\right), +\left(\times\left(1, x_1\right), \times\left(-1, x_2\right)\right)\right), \times\left(\times\left(1, x_2\right), \times\left(1, x_2\right)\right)\right)$, as seen in~\cref{fig:expr-tree-T-wght}, which encodes the expression $(x_1 + x_2)(x_1 - x_2) + x_2^2$.  After one pass, \cref{alg:one-pass} would have computed the following weight distribution.  For the two children of the root $+$ node $\etree$, $\etree_\lchild.\wght = \frac{4}{5}$ and $\etree_\rchild.\wght = \frac{1}{5}$.  Similarly, for the two inner $+$ nodes $\stree$ and $\stree'$, which are the children of $\etree_\lchild$, we have $\stree_\lchild.\wght = \stree_\rchild.\wght = \frac{1}{2}$ and $\stree'_\lchild.\wght = \stree'_\rchild.\wght = \frac{1}{2}$.  Note that in this example, the sampling probabilities for the children of each inner $+$ node are equal to one another because each inner $+$ node has two children, and in each case both children share the same $|\coef_i|$.
\end{Example}

\begin{figure}[h!]
	\begin{tikzpicture}[thick, every tree node/.style={default_node, thick, draw=black, black, circle, text width=0.3cm, font=\bfseries, minimum size=0.65cm}, every child/.style={black}, edge from parent/.style={draw, thick},
level 1/.style={sibling distance=2.5cm},
level 2/.style={sibling distance=1.25cm},
%level 2+/.style={sibling distance=0.625cm}
%level distance = 1.25cm,
%sibling distance = 1cm,
%every node/.append style = {anchor=center}
]
	
	\Tree [.\node(root){$\boldsymbol{+}$}; 
			\edge [wght_color] node[midway, auto= right, font=\bfseries] {$\bsym{\frac{4}{5}}$}; [.\node[highlight_color](tl){$\boldsymbol{\times}$}; 
				[.\node(s){$\bsym{+}$}; 
					\edge[wght_color] node[pos=0.35, left, font=\bfseries]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](sl){$\bsym{x_1}$}; ]
					\edge[wght_color] node[pos=0.35, right, font=\bfseries]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](sr){$\bsym{x_2}$}; ]
					]
				[.\node(sp){$\bsym{+}$}; 
					\edge[wght_color] node[pos=0.35, left, font=\bfseries]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](spl){$\bsym{x_1}$}; ]
					\edge[wght_color] node[pos=0.35, right, font=\bfseries]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](spr){$\bsym{\times}$}; 
						[.$\bsym{-1}$ ] [.$\bsym{x_2}$ ]
						]
					]				
				]
			\edge [wght_color] node[midway, auto=left, font=\bfseries] {$\bsym{\frac{1}{5}}$}; [.\node[highlight_color](tr){$\boldsymbol{\times}$}; 
				[.$\bsym{x_2}$ 
					\edge [draw=none]; [.\node[draw=none]{}; ]
					\edge [draw=none]; [.\node[draw=none]{}; ] 
				] 
				[.$\bsym{x_2}$ ] ] 
	]
%	labels for plus node children, with arrows
	\node[left=2pt of sl, highlight_color, inner sep=0pt] (sl-label) {$\stree_\lchild$};
	\draw[highlight_color] (sl) -- (sl-label);
	\node[right=2pt of sr, highlight_color, inner sep=0pt] (sr-label) {$\stree_\rchild$};
	\draw[highlight_color] (sr) -- (sr-label);
	\node[left=2pt of spl, inner sep=0pt, highlight_color](spl-label) {$\stree_\lchild'$};
	\draw[highlight_color] (spl) -- (spl-label);
	\node[right=2pt of spr, highlight_color, inner sep=0] (spr-label) {$\stree_\rchild'$};
	\draw[highlight_color] (spr) -- (spr-label);
	\node[above left=2pt of tl, inner sep=0pt, highlight_color] (tl-label) {$\etree_\lchild$};
	\draw[highlight_color] (tl) -- (tl-label);
	\node[above right=2pt of tr, highlight_color, inner sep=0pt] (tr-label) {$\etree_\rchild$};
	\node[above = 2pt of root, highlight_color, inner sep=0pt, font=\bfseries] (root-label) {$\etree$};
	\node[above = 2pt of s, highlight_color, inner sep=0pt, font=\bfseries] (s-label) {$\stree$};
	\node[above = 2pt of sp, highlight_color, inner sep=0pt, font=\bfseries] (sp-label) {$\stree'$};
	\draw[highlight_color] (tr) -- (tr-label);
%	\draw[<-|, highlight_color] (s) -- (s-label);
%	\draw[<-|, highlight_color] (sp) -- (sp-label);
%	\draw[<-|, highlight_color]  (root) -- (root-label);
%\node[above right=0.7cm of TR, highlight_color, inner sep=0pt, font=\bfseries] (tr-comment) {$\etree_\rchild$};
%		\draw[<-|, highlight_color] (TR) -- (tr-comment);
	\end{tikzpicture}


%	\begin{tikzpicture}[thick, level distance=1.2cm, level 1/.style={sibling distance= 5cm}, level 2/.style={sibling distance=3cm}, level 3/.style={sibling distance=1.5cm}, level 4/.style={sibling distance= 1cm}, every child/.style={black}]
%		\node[tree_node](root) {$\boldsymbol{+}$}
%			child[red]{node[tree_node](tl) {$\boldsymbol{\times}$}
%				child{node[tree_node] {$\boldsymbol{+}$} 
%					child{node[tree_node]{$\boldsymbol{x_1}$}	}
%					child{node[tree_node] {$\boldsymbol{x_2}$}}
%					}					
%				child{node[tree_node] {$\boldsymbol{+}$}
%					child{node[tree_node] {$\boldsymbol{x_1}$}}
%						%child[missing]{node[tree_node] {$\boldsymbol{1}$}}
%					child[red]{node[tree_node] {$\boldsymbol{\times}$}
%						child{node[tree_node] {$\boldsymbol{-1}$}}
%						child{node[tree_node] {$\boldsymbol{x_2}$}}
%						}
%					}			
%			}
%			child{node[tree_node] {$\boldsymbol{\times}$} edge from parent [red]
%				child{node[tree_node] {$\boldsymbol{x_2}$}}
%				child{node[tree_node] {$\boldsymbol{x_2}$}}
%				};
%		\node[font=\bfseries, red] at (-2.8, -0.2) {$\etree_\lchild.\wght \boldsymbol{= \frac{4}{5} } $};
%	\end{tikzpicture}
	\caption{Weights computed by $\onepass$ in ~\cref{example:one-pass}.}
	\label{fig:expr-tree-T-wght}
\end{figure}


\subsubsection{Correctness of Algorithm ~\ref{alg:one-pass}}

\begin{proof}[Proof of Lemma ~\ref{lem:one-pass}]
We prove the first part of lemma ~\ref{lem:one-pass}, i.e., correctness, by structural induction over the depth $d$ of the binary tree $\etree$.

For the base case, $d = 0$, it is the case that the root node is a leaf and therefore by definition ~\ref{def:express-tree} must be a variable or coefficient.  When it is a variable, \textsc{OnePass} returns $1$, and we have in this case that $\polyf(\etree) = X_i = \polyf(\abs{\etree})$ for some $i$ in $[\numvar]$, and this evaluated at all $1$'s indeed gives $1$, verifying the correctness of the returned value of $\abs{\etree}(1,\ldots, 1) = 1$.  When the root is a coefficient, the absolute value of the coefficient is returned, which is indeed $\abs{\etree}(1,\ldots, 1)$.  Since the root node cannot be a $+$ node, this proves the base case.

Let the inductive hypothesis be the assumption that for all depths $d \leq k$, for some $k \geq 0$, lemma~\ref{lem:one-pass} is true for algorithm~\ref{alg:one-pass}.

We now prove that Lemma~\ref{lem:one-pass} holds for depth $k + 1$.  The root of $\etree$ has at most two children, $\etree_\lchild$ and $\etree_\rchild$, each of depth $d \leq k$.  By the inductive hypothesis, Lemma~\ref{lem:one-pass} holds for each existing child, and we are left with two possibilities for the root node.  The first case is when the root node is a $+$ node.  Here, Algorithm~\ref{alg:one-pass} computes $\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)$ on line~\ref{alg:one-pass-plus-add}, which by definition is $\abs{\etree}(1,\ldots, 1)$, and hence the lemma holds in this case.  For the distribution over the children of $+$, Algorithm~\ref{alg:one-pass} computes $P(\etree_i) = \frac{\abs{\etree_i}(1,\ldots, 1)}{\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)}$ for $i \in \{\lchild, \rchild\}$, which is exactly the required weight.  The second case is when $\etree.\type = \times$.  By the inductive hypothesis, both $\abs{\etree_\lchild}\polyinput{1}{1}$ and $\abs{\etree_\rchild}\polyinput{1}{1}$ have been correctly computed.  On line~\ref{alg:one-pass-times-product}, Algorithm~\ref{alg:one-pass} then computes the product of the subtree partial values, $\abs{\etree_\lchild}(1,\ldots, 1) \times \abs{\etree_\rchild}(1,\ldots, 1)$, which by definition is $\abs{\etree}(1,\ldots, 1)$.

That $\onepass$ makes exactly one traversal of $\etree$ follows by noting that lines~\ref{alg:one-pass-equality1} and~\ref{alg:one-pass-equality2} are the checks for the non-base cases, and that in each of them exactly one recursive call is made on each of $\etree.\vari{children}$.  For the base cases, lines~\ref{alg:one-pass-equality3} and~\ref{alg:one-pass-equality4} both return values without making any further recursive calls.  Since all nodes are covered by these cases, and the base cases cover only leaf nodes, it follows that Algorithm~\ref{alg:one-pass} terminates after visiting every node exactly once.

To conclude, note that when $\etree.\type = +$, the computation of $\etree_\lchild.\wght$ and $\etree_\rchild.\wght$ depends solely on the correctness of $\abs{\etree}\polyinput{1}{1}$, $\abs{\etree_\lchild}\polyinput{1}{1}$, and $\abs{\etree_\rchild}\polyinput{1}{1}$, which have already been argued to be correct.  

\paragraph{Run-time Analysis}
The runtime for \textsc{OnePass} is fairly straightforward.  Note that lines~\ref{alg:one-pass-equality1}, \ref{alg:one-pass-equality2}, and~\ref{alg:one-pass-equality3} give a constant number of equality checks per node.  Then, for $+$ nodes, lines~\ref{alg:one-pass-plus-add} and~\ref{alg:one-pass-plus-prob} (note there is a \textit{constant} factor of $2$ here) perform a constant number of arithmetic operations, while lines~\ref{alg:one-pass-plus-assign1}, \ref{alg:one-pass-plus-assign2}, and~\ref{alg:one-pass-plus-assign3} all have $O(1)$ assignments.  Similarly, when a $\times$ node is visited, lines~\ref{alg:one-pass-times-assign1}, \ref{alg:one-pass-times-assign2}, and~\ref{alg:one-pass-times-assign3} have $O(1)$ assignments, while line~\ref{alg:one-pass-times-product} has $O(1)$ product operations per node.  For leaf nodes, \cref{alg:one-pass-leaf-assign1} and~\cref{alg:one-pass-global-assign} are both $O(1)$ assignments.

Thus, the algorithm visits each node of $\etree$ one time, with a constant number of operations for each of the $+$, $\times$, and leaf nodes, leading to a runtime of $O\left(\treesize(\etree)\right)$, which completes the proof.
\end{proof}

\qed


\subsection{Sample Algorithm}

Algorithm~\ref{alg:sample} takes $\etree$ as input and samples a term $(\monom, \coef)$ from $\expandtree{\etree}$, choosing between the children of each subtree $\stree$ with $\stree.\type = +$ with probabilities $\stree_\lchild.\wght$ and $\stree_\rchild.\wght$, and outputs the tuple $(\monom, \sign(\coef))$.  While one cannot compute $\expandtree{\etree}$ in time better than $O(N^k)$, the algorithm, similar to \textsc{OnePass}, uses a technique on $\etree$ that produces a sample from $\expandtree{\etree}$ without ever materializing $\expandtree{\etree}$.

Algorithm~\ref{alg:sample} selects a monomial from $\expandtree{\etree}$ by the following top-down traversal.  At a parent $+$ node, one subtree is chosen according to the previously computed weighted sampling distribution.  At a parent $\times$ node, both children are visited.  All variable leaf nodes of the traversed subgraph are added to a set.  Additionally, the product of signs over all coefficient leaf nodes of the traversed subgraph is computed.  The algorithm returns the set of distinct variables of which the monomial is composed, together with the monomial's sign.
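
As a small illustration of this traversal, consider a hypothetical expression tree (chosen here only for exposition, and not taken from the running example) encoding $(X_1 + (-2) \cdot X_2) \times X_3$, whose root is a $\times$ node with a $+$ node as its left child and the leaf $X_3$ as its right child.  Assuming $\onepass$ has assigned the children of the $+$ node the weights $\frac{1}{3}$ (for the leaf $X_1$) and $\frac{2}{3}$ (for the subtree $(-2) \cdot X_2$), the two possible outputs of $\sampmon$ and their probabilities would be
\begin{align*}
P\left(\sampmon(\etree) = (\{X_1, X_3\}, +1)\right) &= \frac{1}{3}, & P\left(\sampmon(\etree) = (\{X_2, X_3\}, -1)\right) &= \frac{2}{3},
\end{align*}
since the $\times$ root visits both of its children deterministically and the only random choice is made at the $+$ node.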

\begin{Definition}[TreeSet]
A TreeSet is a data structure whose elements form a set and are stored in a binary search tree.
\end{Definition}

Note that, assuming the underlying binary tree is kept balanced, a TreeSet facilitates logarithmic-time insertion.

\subsubsection{Pseudo Code}
See Algorithm~\ref{alg:sample} for the details of the $\sampmon$ algorithm.


\begin{algorithm}
	\caption{\sampmon(\etree)}
	\label{alg:sample}
	\begin{algorithmic}[1]
		\Require \etree: Binary Expression Tree
		\Ensure \vari{vars}: TreeSet
		\Ensure \vari{sgn} $\in \{-1, 1\}$
		\Comment{Algorithm ~\ref{alg:one-pass} should have been run before algorithm ~\ref{alg:sample}}
		\State $\vari{vars} \gets new$ $TreeSet()$\label{alg:sample-global1}
		\State $\vari{sgn} \gets 1$\label{alg:sample-global2}
		\If{$\etree.\type = +$}\Comment{Sample at every $+$ node}
			\State $\etree_{\vari{samp}} \gets$ Sample from left subtree ($\etree_{\lchild}$) and right subtree ($\etree_{\rchild}$) w.p. $\etree_\lchild.\wght$ and $\etree_\rchild.\wght$. \label{alg:sample-plus-bsamp}
			\State $(\vari{v}, \vari{s}) \gets \sampmon(\etree_{\vari{samp}})$\label{alg:sample-plus-traversal}
			\State $\vari{vars} \gets \vari{vars} \;\cup \;\vari{v}$\label{alg:sample-plus-union}\Comment{$\vari{v}$ is itself a set of variables}
			\State $\vari{sgn} \gets \vari{sgn} \times \vari{s}$\label{alg:sample-plus-product}
			\State $\Return ~(\vari{vars}, \vari{sgn})$
		\ElsIf{$\etree.\type = \times$}\Comment{Multiply the sampled values of all subtree children}
			\For {$child$ in $\etree.\vari{children}$}				
				\State $(\vari{v}, \vari{s}) \gets \sampmon(child)$
				\State $\vari{vars} \gets \vari{vars} \cup \vari{v}$\label{alg:sample-times-union}
				\State $\vari{sgn} \gets \vari{sgn} \times \vari{s}$\label{alg:sample-times-product}
			\EndFor
			\State $\Return ~(\vari{vars}, \vari{sgn})$
		\ElsIf{$\etree.\type = numeric$}\Comment{The leaf is a coefficient}
			\State $\vari{sgn} \gets \vari{sgn} \times sign(\etree.\val)$
			\State $\Return ~(\vari{vars}, \vari{sgn})$\label{alg:sample-num-return}
		\ElsIf{$\etree.\type = var$}
			\State $\vari{vars} \gets \vari{vars} \; \cup \; \{\;\etree.\val\;\}$\label{alg:sample-var-union}\Comment{Add the variable to the set}
			\State $\Return~(\vari{vars}, \vari{sgn})$\label{alg:sample-var-return}
		\EndIf
	\end{algorithmic}
\end{algorithm}

\subsubsection{Correctness of Algorithm ~\ref{alg:sample}}

\begin{proof}[Proof of Lemma ~\ref{lem:sample}]
First, we need to show that $\sampmon$ indeed returns a monomial $\monom$, such that $(\monom, \coef)$ is in $\expandtree{\etree}$.

For the base case, in which the depth $d$ of $\etree$ is $0$, the root node is either a constant $\coef$, in which case by lines~\ref{alg:sample-global1} and~\ref{alg:sample-num-return} we return $\{~\}$, or we have $\etree.\type = \var$ and $\etree.\val = x$, in which case by lines~\ref{alg:sample-var-union} and~\ref{alg:sample-var-return} we return $\{x\}$.  Both cases satisfy the definition of a monomial, and the base case is proven.

For the inductive hypothesis, assume that for all $d \leq k$, with $k \geq 0$, $\sampmon$ returns a monomial.

For the inductive step, let us take a tree $\etree$ with $d = k + 1$.  Note that each child has depth $d \leq k$, and by the inductive hypothesis both of them return a valid monomial.  The root can be either a $+$ or a $\times$ node.  For the case of a $+$ root node, line~\ref{alg:sample-plus-bsamp} of $\sampmon$ chooses one of the children of the root.  Since by hypothesis a monomial is returned from either child, and only one of these monomials is selected, a valid monomial is returned by $\sampmon$ for a $+$ root node.  When the root is a $\times$ node, lines~\ref{alg:sample-times-union} and~\ref{alg:sample-times-product} multiply the monomials returned by the two children of the root, and by Definition~\ref{def:monomial} the product of two monomials is also a monomial.  Hence $\sampmon$ returns a valid monomial for a $\times$ root node as well, which concludes the argument that $\sampmon$ indeed returns a monomial.	

%Note that for any monomial sampled by algorithm ~\ref{alg:sample}, the nodes traversed form a subgraph of $\etree$ that is \textit{not} a subtree in the general case.  We thus seek to prove that the subgraph traversed produces the correct probability corresponding to the monomial sampled.

We now prove by induction on the depth $d$ of $\etree$ that the subgraph traversed by $\sampmon$ is selected with the probability required for the sampled monomial, namely $\frac{|\coef|}{\abs{\etree}\polyinput{1}{1}}$.  

For the base case $d = 0$, by Definition~\ref{def:express-tree} we know that the root has to be either a coefficient or a variable.  In either case, the probability of the value returned is $1$, since there is only one value to sample from.  When the root is a variable $x$, the algorithm correctly returns $(\{x\}, 1)$.  When the root is a coefficient $\coef$, $\sampmon$ correctly returns $(\{~\}, \sign(\coef) \times 1)$.  

For the inductive hypothesis, assume that for all $d \leq k$, with $k \geq 0$, $\sampmon$ indeed samples $\monom$ in $(\monom, \coef)$ in $\expandtree{\etree}$ with probability $\frac{|\coef|}{\abs{\etree}\polyinput{1}{1}}$.

We now prove that correctness holds when $d = k + 1$.  The root of $\etree$ has up to two children, $\etree_\lchild$ and $\etree_\rchild$.  Since $\etree_\lchild$ and $\etree_\rchild$ both have depth $d \leq k$, by the inductive hypothesis correctness holds for both of them.  Thus, $\sampmon$ samples $\monom_\lchild$ in $(\monom_\lchild, \coef_\lchild)$ of $\expandtree{\etree_\lchild}$ and $\monom_\rchild$ in $(\monom_\rchild, \coef_\rchild)$ of $\expandtree{\etree_\rchild}$, from $\etree_\lchild$ and $\etree_\rchild$, with probability $\frac{|\coef_\lchild|}{\abs{\etree_\lchild}\polyinput{1}{1}}$ and $\frac{|\coef_\rchild|}{\abs{\etree_\rchild}\polyinput{1}{1}}$, respectively. 

Then the root has to be either a $+$ or $\times$ node.  

Consider the case when the root is a $\times$ node.  Let $(\monom, \coef)$ in $\expandtree{\etree}$ be the term whose monomial $\monom$ is sampled.  Then $\monom = \monom_\lchild \times \monom_\rchild$, where $\monom_\lchild$ comes from $\etree_\lchild$ and $\monom_\rchild$ from $\etree_\rchild$.  The probability that $\sampmon(\etree_{\lchild})$ returns $\monom_\lchild$ is $\frac{|\coef_{\monom_\lchild}|}{\abs{\etree_\lchild}(1,\ldots, 1)}$, and the probability that $\sampmon(\etree_{\rchild})$ returns $\monom_\rchild$ is $\frac{|\coef_{\monom_\rchild}|}{\abs{\etree_\rchild}\polyinput{1}{1}}$.  Since $\monom_\lchild$ and $\monom_\rchild$ are sampled with independent randomness, the probability of sampling $\monom$ is then $\frac{|\coef_{\monom_\lchild}| \cdot |\coef_{\monom_\rchild}|}{\abs{\etree_\lchild}(1,\ldots, 1) \cdot \abs{\etree_\rchild}(1,\ldots, 1)}$.  For $(\monom, \coef)$ in $\expandtree{\etree}$, it is indeed the case that $|\coef| = |\coef_{\monom_\lchild}| \cdot |\coef_{\monom_\rchild}|$ and that $\abs{\etree}(1,\ldots, 1) = \abs{\etree_\lchild}(1,\ldots, 1) \cdot \abs{\etree_\rchild}(1,\ldots, 1)$, and therefore $\monom$ is sampled with the correct probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$. 

For the case when $\etree.\type = +$, $\sampmon$ samples the monomial $\monom$ from one of its children.  By the inductive hypothesis, any $\monom_\lchild$ in $\expandtree{\etree_\lchild}$ and any $\monom_\rchild$ in $\expandtree{\etree_\rchild}$ are sampled with the correct probabilities $\frac{|\coef_{\monom_\lchild}|}{\abs{\etree_\lchild}(1,\ldots, 1)}$ and $\frac{|\coef_{\monom_\rchild}|}{\abs{\etree_\rchild}(1,\ldots, 1)}$, respectively, where either $\monom_\lchild$ or $\monom_\rchild$ equals $\monom$, depending on whether $\etree_\lchild$ or $\etree_\rchild$ is sampled.  Assume that $\monom$ is sampled from $\etree_\lchild$, and note that a symmetric argument holds for the case when $\monom$ is sampled from $\etree_\rchild$.  Notice also that the probability of sampling $\etree_\lchild$ from $\etree$ is $\frac{\abs{\etree_\lchild}\polyinput{1}{1}}{\abs{\etree_\lchild}\polyinput{1}{1} + \abs{\etree_\rchild}\polyinput{1}{1}}$, as computed by $\onepass$.  Then, since $\sampmon$ proceeds top-down and each sampling choice is independent (which follows from the randomness in the root of $\etree$ being independent of the randomness used in its subtrees), the probability that $\monom$ is sampled from $\etree$ equals the product of the probability that $\etree_\lchild$ is sampled from $\etree$ and the probability that $\monom$ is sampled in $\etree_\lchild$, and
\begin{align*}
P(\sampmon(\etree) = \monom) = &P(\sampmon(\etree_\lchild) = \monom) \cdot P(SampledChild(\etree) = \etree_\lchild)\\
= &\frac{|\coef_\monom|}{|\etree_\lchild|(1,\ldots, 1)} \cdot \frac{\abs{\etree_\lchild}(1,\ldots, 1)}{|\etree_\lchild|(1,\ldots, 1) + |\etree_\rchild|(1,\ldots, 1)}\\
= &\frac{|\coef_\monom|}{\abs{\etree}(1,\ldots, 1)},
\end{align*}
and we obtain the desired result.
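
To make the two cases concrete, the following sanity check uses a hypothetical tree (an assumption made purely for illustration) encoding $X_1 \cdot X_2 + 3 \cdot X_3$, whose expansion consists of the two terms $(X_1 X_2, 1)$ and $(X_3, 3)$.  The root is a $+$ node with $\abs{\etree_\lchild}(1,\ldots, 1) = 1$ and $\abs{\etree_\rchild}(1,\ldots, 1) = 3$, so that $\abs{\etree}(1,\ldots, 1) = 4$, and both children are $\times$ nodes that return their single monomial with probability $1$.  Under these assumptions,
\begin{align*}
P(\sampmon(\etree) = X_1 X_2) &= \frac{1}{1} \cdot \frac{1}{1 + 3} = \frac{1}{4} = \frac{|1|}{\abs{\etree}(1,\ldots, 1)},\\
P(\sampmon(\etree) = X_3) &= \frac{3}{3} \cdot \frac{3}{1 + 3} = \frac{3}{4} = \frac{|3|}{\abs{\etree}(1,\ldots, 1)},
\end{align*}
which agrees with the claimed probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$ for each term, and the two probabilities sum to $1$.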


\paragraph{Run-time Analysis}
We now bound the number of recursive calls in $\sampmon$ by $O\left(k\cdot depth(\etree)\right)$.  Take an arbitrary sample subgraph of an expression tree $\etree$ of degree $k$ and pick an arbitrary level $i$.  Let $y_i$ denote the number of $\times$ nodes on this level, and $x_i$ the total number of nodes.  Since both children of a $\times$ node are traversed by $\sampmon$, while only one child is traversed for a $+$ parent node, the number of nodes on level $i + 1$ is in general at most $y_i + x_i$, and hence the increase in the number of nodes from level $i$ to level $i + 1$ is upper bounded as $x_{i + 1} - x_i \leq y_i$.  

Now, we prove by induction on the depth $d$ of tree $\etree$ the following claim.
\begin{Claim}\label{claim:num-nodes-level-i}
The number of nodes at an arbitrary level $i$ of the subgraph of expression tree $\etree$ traversed by $\sampmon$ is bounded by $1$ plus the number of $\times$ nodes in levels $[0, i - 1]$.
\end{Claim}

\begin{proof}[Proof of Claim ~\ref{claim:num-nodes-level-i}]
For the base case, $d = 0$, the root is a leaf, i.e., either $\etree.\type = \tnum$ or $\etree.\type = \var$.  In both cases the number of nodes on level $0$ is $1$, which satisfies the bound of~\cref{claim:num-nodes-level-i}: the number of $\times$ nodes in previous levels is $0$, and $0 + 1 = 1$.  Hence the base case holds.

Assume that~\cref{claim:num-nodes-level-i} holds for all $d \leq k$, with $k \geq 0$.

The inductive step is to show that for an arbitrary $\etree$ of depth $d + 1 \leq k + 1$ the claim still holds.  We have two possibilities for the root of $\etree$.  First, if $\etree.\type = +$, then by~\cref{alg:sample-plus-traversal} only one of $\etree_\lchild$ or $\etree_\rchild$ is part of the subgraph traversed by $\sampmon$.  By the inductive hypothesis, both subtrees satisfy the claim.  Since only one child is part of the subgraph, there is exactly one node at level $1$, which, as in the base case analysis, satisfies~\cref{claim:num-nodes-level-i}.  Second, if $\etree.\type = \times$, then $\sampmon$ traverses both children, and the number of nodes at level $1$ in the subgraph is $2$.  This satisfies~\cref{claim:num-nodes-level-i}, since the number of $\times$ nodes in previous levels (level $0$) is $1$, and $1 + 1 = 2$, proving the claim. 
\end{proof}

\qed

By~\cref{def:degree}, a sampled monomial will have $O(k)$ $\times$ nodes, which by~\cref{claim:num-nodes-level-i} implies $O(k)$ nodes on each of the at most $depth(\etree)$ levels of the subgraph traversed by $\sampmon$, bounding the number of recursive calls by $O(k \cdot depth(\etree))$.
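
As an illustrative check of~\cref{claim:num-nodes-level-i} (on a hypothetical tree introduced only for this example), let $\etree$ encode $(X_1 + (-2) \cdot X_2) \times X_3$ and suppose that $\sampmon$ picks the subtree $(-2) \cdot X_2$ at the $+$ node.  Writing $x_i$ for the number of visited nodes and $y_i$ for the number of visited $\times$ nodes on level $i$, the traversed subgraph has $y_0 = 1$ (the root), $y_1 = 0$, and $y_2 = 1$ (the root of $(-2) \cdot X_2$), and
\begin{align*}
x_0 &= 1 \leq 1, & x_1 &= 2 \leq 1 + y_0 = 2,\\
x_2 &= 1 \leq 1 + y_0 + y_1 = 2, & x_3 &= 2 \leq 1 + y_0 + y_1 + y_2 = 3.
\end{align*}
In this run, $\sampmon$ is invoked $x_0 + x_1 + x_2 + x_3 = 6$ times, counting the initial call.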

Globally, lines~\ref{alg:sample-global1} and~\ref{alg:sample-global2} take $O(1)$ time.  For a $+$ node, line~\ref{alg:sample-plus-bsamp} takes $O(1)$ time because $\etree$ is binary.  Line~\ref{alg:sample-plus-union} takes $O(\log{k})$ time by the nature of the TreeSet data structure and the fact that, by definition, any monomial sampled from $\expandtree{\etree}$ has degree $\leq k$ and hence at most $k$ distinct variables, which in turn implies that the TreeSet has $\leq k$ elements in it at any time.

Finally, line~\ref{alg:sample-plus-product} takes $O(1)$ time for a product and an assignment operation.  When a $\times$ node is visited, the same union, product, and assignment operations take place, and we again have $O(\log{k})$ runtime.  When a variable leaf node is traversed, the same union operation occurs with $O(\log{k})$ runtime, and a constant leaf node has the above-mentioned product and assignment operations.  Thus, for each node visited we have $O(\log{k})$ runtime, and the final runtime for $\sampmon$ is $O(\log{k} \cdot k \cdot depth(\etree))$.

\end{proof}
\qed