
%root: main.tex
\section{$1 \pm \epsilon$ Approximation Algorithm}
\label{sec:algo}

In~\cref{sec:hard}, we showed that computing the expected multiplicity of a compressed representation of a bag polynomial for TIDB (even just based on project-join queries) is unlikely to be possible in linear time (\cref{thm:mult-p-hard-result}), even if all tuples have the same probability of being present (\cref{cor:single-p-hard}). Given this, in this section we design an approximation algorithm for our problem that runs in {\em linear time}. Unlike the results in~\cref{sec:hard}, our approximation algorithm works for BIDB, though our bounds are more meaningful for a non-trivial subclass of BIDB that includes TIDB as well as PDB benchmarks (\cref{sec:experiments}).
%it is then desirable to have an algorithm to approximate the multiplicity in linear time, which is what we describe next.

\subsection{Preliminaries and some more notation}

First, let us introduce some useful definitions and notation related to polynomials and their representations.  For illustrative purposes in the definitions below, we will use the following {\em bivariate} polynomial: 
\begin{equation}
\label{eq:poly-eg}
\poly(x,y) = 2x^2 + 3xy - 2y^2.
\end{equation}

\AR{The definition from this and my next comments are "new"-- they might be better off in the prelims section and moved to later in this section. Am keeping all of them in one place for easy lookup for now.}

\begin{Definition}[Variables in a monomial]\label{def:vars}
 Given a monomial $v$, we use $\var(v)$ to denote the set of variables in $v$.
\end{Definition}
For example, the monomial $3xy$ in the polynomial in~\cref{eq:poly-eg} has $\var(3xy)=\inset{x,y}$.

%\AH{@atri, just a heads up.  I had defined \emph{monomial} as the variables exclusively, i.e., without the coefficient.  However, it appears that @lordpretzel removed that definition, but I just wanted to mention this so that \emph{hopefully} we are consistent in our language as to what we mean by the term monomial.}
%\AR{I think its OK: it not a big difference and I don't think the readers will get confused-- if we are worried we can always add a disclaimer saying we might include the coefficient or nont dependning on the context}.

\begin{Definition}[Modding with a set]\label{def:mod-set}
Let $S$ be a {\em set} of polynomials over $\vct{X}$. Then $\poly(\vct{X})\mod{S}$ is the polynomial obtained by taking the mod of $\poly(\vct{X})$ over {\em all} polynomials in $S$ (the order does not matter).
\end{Definition}
For example, when $S_0=\inset{x^2-x,y^2-y}$, taking the polynomial in~\cref{eq:poly-eg} mod $S_0$ gives $2x+3xy-2y$.

\begin{Definition}\label{def:mod-set-polys}
Given the set of BIDB variables $\inset{X_{b,i}}$, define
\[\mathcal{B}=\inset{X_{b,i}\cdot X_{b,j}|\text{ for every block } b \text{ and } i\ne j},\]
\[\mathcal{T}=\inset{X_{b,i}^2-X_{b,i}|\text{ for every block } b \text{ and } i}.\]
\end{Definition}
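For example (an illustration of our own, not tied to a particular query), consider a single block $b$ with two variables and the polynomial $\inparen{X_{b,1}+X_{b,2}}^2 = X_{b,1}^2 + 2X_{b,1}X_{b,2} + X_{b,2}^2$. Taking it mod $\mathcal{T}$ replaces each square $X_{b,i}^2$ by $X_{b,i}$, and taking it mod $\mathcal{B}$ removes the cross term, which is a multiple of $X_{b,1}\cdot X_{b,2}\in\mathcal{B}$:
\[\inparen{X_{b,1}+X_{b,2}}^2 \mod{\mathcal{T}\cup\mathcal{B}} = X_{b,1}+X_{b,2}.\]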

\AR{Something to check/square out: we have been using both $X_{b,j}$ and $X_1,\dots,X_n$ for vars in BIDB-- I think this is OK as long as we explicitly talk about these two notations and how we might switch between them. Or we decide not to...}

\AR{Some of these definitions have been pulled to the prelims section. Another pass is needed to sync up these occurrences. Leaving them in for now.}

\begin{Definition}[Expression Tree]\label{def:express-tree}
An expression tree $\etree$ is a binary %an ADT logically viewed as an n-ary
tree, whose internal nodes are from the set $\{+, \times\}$, with leaf nodes being either from the set $\mathbb{R}$ $(\tnum)$ or from the set of monomials $(\var)$.  The members of $\etree$ are \type, \val, \vari{partial}, \vari{children}, and \vari{weight}, where \type is the type of value stored in the node $\etree$ (i.e., one of $\{+, \times, \var, \tnum\}$), \val is the value stored, and \vari{children} is the list of $\etree$'s children, where $\etree_\lchild$ is the left child and $\etree_\rchild$ the right child.  The remaining fields hold values whose semantics we will fix later.  When $\etree$ is used as input to~\Cref{alg:mon-sam} and~\Cref{alg:one-pass}, the values of \vari{partial} and \vari{weight} will not be set. %SEMANTICS FOR \etree:   \vari{partial} is the sum of $\etree$'s coefficients , n, and \vari{weight} is the probability of $\etree$ being sampled.
\end{Definition}

Note that $\etree$ need not encode an expression in the standard monomial basis.  For instance, $\etree$ could represent a compressed form of the polynomial in~\cref{eq:poly-eg}, such as $(x + 2y)(2x - y)$.

\begin{Definition}[$\polyf(\cdot)$]\label{def:poly-func}
Let $\polyf(\etree)$ denote the function that takes as input an expression tree $\etree$ and outputs its corresponding polynomial.  $\polyf(\cdot)$ is recursively defined on $\etree$ as follows, where $\etree_\lchild$ and $\etree_\rchild$ denote the left and right child of $\etree$ respectively.

%	\begin{align*}
%		&\etree.\type = +\mapsto&& \polyf(\etree_\lchild) + \polyf(\etree_\rchild)\\
%		&\etree.\type = \times\mapsto&& \polyf(\etree_\lchild) \cdot \polyf(\etree_\rchild)\\
%		&\etree.\type =  \var \text{ OR } \tnum\mapsto&& \etree.\val
%	\end{align*}


\begin{equation*}
	\polyf(\etree) = \begin{cases}
					\polyf(\etree_\lchild) + \polyf(\etree_\rchild)			&\text{ if \etree.\type } = +\\
					\polyf(\etree_\lchild) \cdot \polyf(\etree_\rchild)		&\text{ if \etree.\type } = \times\\
					\etree.\val									&\text{ if \etree.\type } = \var \text{ OR } \tnum.
				\end{cases}
\end{equation*}
\end{Definition}

Note that addition and multiplication above follow the standard interpretation over polynomials.
%Specifically, when adding two monomials whose variables and respective exponents agree, the coefficients corresponding to the monomials are added and their sum is multiplied to the monomial.  Multiplication here is denoted by concatenation of the monomial and coefficient.  When two monomials are multiplied, the product of each corresponding coefficient is computed, and the variables in each monomial are multiplied, i.e., the exponents of like variables are added.  Again we notate this by the direct product of coefficient product and all disitinct variables in the two monomials, with newly computed exponents.

\begin{Definition}[Expression Tree Set]\label{def:express-tree-set}$\etreeset{\smb}$ is the set of all possible expression trees $\etree$, such that $\polyf(\etree) = \poly(\vct{X})$.
\end{Definition}

For the polynomial in~\cref{eq:poly-eg}, $\etreeset{\smb}$ would include the following (represented as their corresponding expression trees): $2x^2 + 3xy - 2y^2, (x + 2y)(2x - y), x(2x - y) + 2y(2x - y), 2x(x + 2y) - y(x + 2y)$.  Note that \cref{def:express-tree-set} implies that for any expression tree $\etree$, we have $\etree \in \etreeset{\polyf(\etree)}$.


\begin{Definition}[Expanded T]\label{def:expand-tree}
$\expandtree{\etree}$ is the (pure) sum of products expansion of $\etree$, which we formally define next.  The logical view of $\expandtree{\etree}$ is a list of tuples $(\monom, \coef)$, where $\monom$ is a monomial and $\coef$ is in $\mathbb{R}$.  $\expandtree{\etree}$ has the following recursive definition (where $\circ$ is list concatenation).
\end{Definition}

% recursively defined as
%	\begin{align*}
%		&\etree.\type =  +  \mapsto&& \elist{\expandtree{\etree_\lchild}, \expandtree{\etree_\rchild}}\\
%		&\etree.\type =  \times \mapsto&& \elist{\expandtree{\etree_\lchild} \otimes \expandtree{\etree_\rchild}}\\
%		&\etree.\type = \tnum \mapsto&& \elist{(\emptyset, \etree.\val)}\\
%		&\etree.\type = \var \mapsto&& \elist{(\etree.\val, 1)}
%	\end{align*}
\begin{align*}
&\expandtree{\etree} = \\
&\begin{cases}
					\expandtree{\etree_\lchild} \circ \expandtree{\etree_\rchild}		&\textbf{ if }\etree.\type = +\\
					\left\{(\monom_\lchild \cup \monom_\rchild, \coef_\lchild \cdot \coef_\rchild) ~|~\right.&\\ 							\left.(\monom_\lchild, \coef_\lchild) \in \expandtree{\etree_\lchild}, (\monom_\rchild, \coef_\rchild) \in \expandtree{\etree_\rchild}\right\} 		&\textbf{ if }\etree.\type = \times\\
					\elist{(\emptyset, \etree.\val)}								&\textbf{ if }\etree.\type = \tnum\\
					\elist{(\{\etree.\val\}, 1)}									&\textbf{ if }\etree.\type = \var.\\
				\end{cases}
\end{align*}
%where that the multiplication of two tuples %is the standard multiplication over monomials and the standard multiplication over coefficients to produce the product tuple, as in
%is their direct product $(\monom_1, \coef_1) \cdot (\monom_2, \coef_2) = (\monom_1 \cdot \monom_2, \coef_1 \times \coef_2)$ such that monomials $\monom_1$ and $\monom_2$ are concatenated in a product operation, while the standard product operation over reals applies to $\coef_1 \times \coef_2$.  The product of $\expandtree{\etree_\lchild} \cdot \expandtree{\etree'_\rchild}$ is then the cross product of the multiplication of all such tuples returned to both $\expandtree{\etree_\lchild}$ and $\expandtree{\etree_\rchild}$.  %The operator $\otimes$ is defined as the cross-product tuple multiplication of all such tuples returned by both $\expandtree{\etree_\lchild}$ and $\expandtree{\etree_\rchild}$.


\begin{Example}\label{example:expr-tree-T}
Consider the factorized representation $(x + 2y)(2x - y)$ of the polynomial in~\cref{eq:poly-eg}.  Its expression tree $\etree$ is illustrated in Figure~\ref{fig:expr-tree-T}.  The pure expansion of the product is $2x^2 - xy + 4xy - 2y^2$ and $\expandtree{\etree}$ is $[(x^2, 2), (xy, -1), (xy, 4), (y^2, -2)]$.
\end{Example}


\begin{figure}[h!]

\begin{tikzpicture}[thick, level distance=0.9cm,level 1/.style={sibling distance=3.55cm}, level 2/.style={sibling distance=1.8cm}, level 3/.style={sibling distance=0.8cm}]% level/.style={sibling distance=6cm/(#1 * 1.5)}]
	\node[tree_node](root){$\boldsymbol{\times}$}
		child{node[tree_node]{$\boldsymbol{+}$}
			child{node[tree_node]{x}
				%child[missing]{node[tree_node]{}}
				%child{node[tree_node]{x}}
				}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child{node[tree_node]{2}}
				child{node[tree_node]{y}}
				}
			}
		child{node[highlight_treenode] (TR) {$\boldsymbol{+}$}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child{node[tree_node]{2}}
				child{node[tree_node]{x}}
				}
			child{node[tree_node]{$\boldsymbol{\times}$}
				child{node[tree_node] (neg-leaf) {-1}}
				child{node[tree_node]{y}}
				}
			%child[sibling distance= 0cm, grow=north east, red]{node[tree_node]{$\etree_\rchild$}}
			};
%		\node[below=2pt  of neg-leaf, inner sep=1pt, blue] (neg-comment) {\textbf{Negation pushed to leaf nodes}};
%		\draw[<-|, blue] (neg-leaf) -- (neg-comment);
		\node[above right=0.7cm of TR, highlight_color, inner sep=0pt, font=\bfseries] (tr-label) {$\etree_\rchild$};
		\node[above right=0.7cm of root, highlight_color, inner sep=0pt, font=\bfseries] (t-label) {$\etree$};
		\draw[<-|, highlight_color] (TR) -- (tr-label);
		\draw[<-|, highlight_color] (root) -- (t-label);
\end{tikzpicture}

\caption{Expression tree $\etree$ for the product $\boldsymbol{(x + 2y)(2x - y)}$.}
\label{fig:expr-tree-T}
\end{figure}


\begin{Definition}[Positive T]\label{def:positive-tree}
For any expression tree $\etree$, the corresponding
{\em positive tree}, denoted $\abs{\etree}$, is obtained from $\etree$ as follows. For each leaf node $\ell$ of $\etree$ where $\ell.\type$ is $\tnum$, update $\ell.\val$ to $|\ell.\val|$. %value $\coef$ of each coefficient leaf node in $\etree$ is set to %$\coef_i$ in $\etree$ is exchanged with its absolute value$|\coef|$.
\end{Definition}

Using the same factorization from~\cref{example:expr-tree-T}, $\polyf(\abs{\etree}) = (x + 2y)(2x + y) = 2x^2 +xy +4xy + 2y^2 = 2x^2 + 5xy + 2y^2$.  Note that this \textit{is not} the same as the polynomial from~\cref{eq:poly-eg}.

\begin{Definition}[Evaluation]\label{def:exp-poly-eval}
Given an expression tree $\etree$ and $\vct{v} \in \mathbb{R}^\numvar$, $\etree(\vct{v}) = \polyf(\etree)(\vct{v})$.
\end{Definition}

\subsection{Our main result}


In the subsequent subsections we will prove the following theorem.

\begin{Theorem}\label{lem:approx-alg}
Let $\etree$ be an expression tree for a UCQ over BIDB, define $\poly(\vct{X})=\polyf(\etree)$, and let $k=\degree(\poly)$.
%Let $\poly(\vct{X})$ be a query polynomial corresponding to the output of a UCQ in a BIDB. 
An estimate $\mathcal{E}$ %=\approxq(\etree, (p_1,\dots,p_\numvar), \conf, \error')$ 
 of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in time 
\[O\left(\treesize(\etree) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)\cdot  k\cdot \log{k} \cdot depth(\etree)}{\inparen{\error'}^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right),\]
such that
\begin{equation}
\label{eq:approx-algo-bound}
P\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error' \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf.
\end{equation}
%with multiplicative $(\error,\delta)$-bounds, where $k$ denotes the degree of $\poly$.
\end{Theorem}

It turns out that to get linear runtime results from~\cref{lem:approx-alg}, we will need to define another parameter (which is roughly the (weighted) fraction of monomials in $\expandtree{\etree}$ that get `canceled' when modded with $\mathcal{B}$):
\begin{Definition}[Parameter $\gamma$]\label{def:param-gamma}
Given an expression tree $\etree$, define
\[\gamma(\etree)=\frac{\sum_{(\monom, \coef)\in \expandtree{\etree}} \abs{\coef}\cdot \onesymbol\inparen{\monom\mod{\mathcal{B}}\equiv 0}}{\abs{\etree}(1,\ldots, 1)}\]
\end{Definition}
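For example (an illustration of our own, not one of the benchmark queries), let $\etree$ encode $(X_{b,1}+X_{b,2})\cdot(X_{b,1}+X_{b,2})$ for a single block $b$. Then $\expandtree{\etree}$ has four entries, each with coefficient $1$, with monomials $X_{b,1}^2$, $X_{b,1}X_{b,2}$, $X_{b,2}X_{b,1}$ and $X_{b,2}^2$, so $\abs{\etree}(1,\ldots,1)=4$. The two cross terms are $0$ mod $\mathcal{B}$, hence
\[\gamma(\etree)=\frac{1+1}{4}=\frac 12.\]
Correspondingly, $\rpoly(X_{b,1},X_{b,2})=X_{b,1}+X_{b,2}$, so that $\rpoly(1,1)=2=(1-\gamma(\etree))\cdot\abs{\etree}(1,1)$.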
%\AH{This....combined with \Cref{def:mod-set-polys} is \emph{really} nice notation!}
\AR{Need to make sure use of indicator variable $\onesymbol$ above is consistent with the rest of the paper.}

We next present a couple of corollaries of~\Cref{lem:approx-alg}.
\begin{Corollary}
\label{cor:approx-algo-const-p}
Let $\poly(\vct{X})$ be as in~\Cref{lem:approx-alg} and let $\gamma=\gamma(\etree)$. Further let it be the case that $p_i\ge p_0$ for all $i\in[\numvar]$. Then an estimate $\mathcal{E}$  of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ satisfying~\cref{eq:approx-algo-bound} can be computed in time
\[O\left(\treesize(\etree) + \frac{\log{\frac{1}{\conf}}\cdot k\cdot \log{k} \cdot depth(\etree)}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot p_0^{2k}}\right).\]
In particular, if $p_0>0$ and $\gamma<1$ are absolute constants then the above runtime simplifies to $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\treesize(\etree)\cdot \log{\frac{1}{\conf}}\right)$.
\end{Corollary}
We note that the restriction on $\gamma$ is satisfied by TIDB (where $\gamma=0$) and for some BIDB benchmarks (see~\Cref{sec:experiments} for more on this claim).
\AH{I am thinking that perhaps the terminology and presentation of~\Cref{sec:experiments} may need word-smithing to clearly illustrate the $\bi$ benchmarks satisfied--although the substance is already written there.}
\AR{Yes! E.g. $\gamma$ is not used at all in~\Cref{sec:experiments}}
\AR{{\bf Boris/Oliver:} Is there a way to claim that all probabilities in practice are actually constants: i.e. they do not increase with the number of  tuples?}

\begin{proof}[Proof of~\Cref{cor:approx-algo-const-p}]
The result follows by first noting that by definition of $\gamma$, we have 
%\AH{Just wondering why you use $\geq$ as opposed to $=$?}
%\AR{Ah, right-- fixed}
\[\rpoly(1,\dots,1)= (1-\gamma)\cdot \abs{\etree}(1,\dots,1).\] 
Further, since each $p_i\ge p_0$ and $\poly(\vct{X})$ (and hence $\rpoly(\vct{X})$) has degree at most $k$, we have that
\[ \rpoly(\prob_1,\dots,\prob_\numvar) \ge p_0^k\cdot \rpoly(1,\dots,1).\]
Combining the above equality and inequality gives $\rpoly(\prob_1,\dots,\prob_\numvar) \ge p_0^k\cdot (1-\gamma)\cdot \abs{\etree}(1,\dots,1)$.
%\AH{This looks really nice!}
Substituting this bound into the runtime bound of~\Cref{lem:approx-alg} gives the first claimed runtime. The final runtime of $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\treesize(\etree)\cdot \log{\frac{1}{\conf}}\right)$ follows by noting that $depth(\etree)\le \treesize(\etree)$ and absorbing all factors that just depend on $k$.
\end{proof}

\subsection{Approximating $\rpoly$}

The algorithm behind~\Cref{lem:approx-alg} rests on the following observation. Given a query polynomial $\poly(\vct{X})=\polyf(\etree)$ for expression tree $\etree$ over $\bi$, we note that we can exactly represent $\rpoly(\vct{X})$ as follows:
\begin{equation}
\label{eq:tilde-Q-bi}
\rpoly\inparen{X_1,\dots,X_\numvar}=\sum_{(\monom,\coef)\in \expandtree{\etree}} \onesymbol\inparen{\monom\mod{\mathcal{B}}\not\equiv 0}\cdot \coef\cdot\prod_{X_i\in \var\inparen{\monom}} X_i.
\end{equation}
Given the above, our algorithm estimates this sum by sampling: we sample $(\monom,\coef)\in \expandtree{\etree}$ with probability proportional\footnote{We could have also sampled uniformly from $\expandtree{\etree}$, but the weighted distribution gives better parameters.}
%\AH{Regarding the footnote, is there really a difference?  I \emph{suppose} technically, but in this case they are \emph{effectively} the same.  Just wondering.} 
%\AR{Yes, there is! If we used uniform distribution then in our bounds we will have a parameter that depends on the largest $\abs{coef}$, which e.g. could be dependent on $n$. But with the weighted probability distribution, we avoid paying this price. Though I guess perhaps we can say for the kinds of queries we consider thhese coefficients are all constants?}
to $\abs{\coef}$ and compute $Y=\onesymbol\inparen{\monom\mod{\mathcal{B}}\not\equiv 0}\cdot sign(\coef)\cdot \prod_{X_i\in \var\inparen{\monom}} p_i$. Averaging $Y$ over enough samples and scaling by $\abs{\etree}(1,\ldots,1)$ gives us our final estimate. Algorithm~\ref{alg:mon-sam} has the details.

%We state the approximation algorithm in terms of a $\bi$.
%\subsubsection{Description}
%Algorithm ~\ref{alg:mon-sam} approximates $\rpoly$ using the following steps.  First, a call to $\onepass$ on its input $\etree$ produces a non-biased weight distribution over the monomials of $\expandtree{\etree}$ and a correct count of $|\etree|(1,\ldots, 1)$, i.e., the number of monomials in $\expandtree{\etree}$.  Next, ~\cref{alg:mon-sam} calls $\sampmon$  to sample one monomial and its sign from $\expandtree{\etree}$.  The sampling is repeated $\ceil{\frac{2\log{\frac{2}{\delta}}}{\epsilon^2}}$ times, where each of the samples are evaluated with input $\vct{p}$, multiplied by $1 \times sign$, and summed.  The final result is scaled accordingly returning an estimate of $\rpoly$ with the claimed $(\error, \conf)$-bound of ~\cref{lem:mon-samp}.

%\AR{Seems like the notation below belongs to the notation section (if we decide to state this explicitly at all)?}
%\AH{Yes, I only included this per your request a few months ago.  Based on @lordpretzel removing my definition of monomial, perhaps we can assume that the reader understands the notation below.  I \emph{think} this should be a reasonable assumption.}
%Recall that the notation $[x, y]$ denotes the range of values between $x$ and $y$ inclusive.  The notation $\{x, y\}$ denotes the set of values consisting of $x$ and $y$.
%\subsubsection{Psuedo Code}

%Original TIDB Algorithm
%\begin{algorithm}[H]
%	\caption{$\approxq$($\etree$, $\vct{p}$, $\conf$, $\error$)}
%	\label{alg:mon-sam}
%	\begin{algorithmic}[1]
%		\Require \etree: Binary Expression Tree
%		\Require $\vct{p} = (\prob_1,\ldots, \prob_\numvar)$ $\in [0, 1]^N$
%		\Require $\conf$ $\in [0, 1]$
%		\Require $\error$ $\in [0, 1]$
%		\Ensure \vari{acc} $\in \mathbb{R}$
%		\State $\accum \gets 0$\label{alg:mon-sam-global1}
%		\State $\numsamp \gets \ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$\label{alg:mon-sam-global2}
%		\State $(\vari{\etree}_\vari{mod}, \vari{size}) \gets $ \onepass($\etree$)\label{alg:mon-sam-onepass}\Comment{$\onepass$ is ~\cref{alg:one-pass} \;and \sampmon \; is ~\cref{alg:sample}}\newline
%		\For{\vari{i} \text{ in } $1\text{ to }\numsamp$}\Comment{Perform the required number of samples}
%			\State $(\vari{M}_\vari{i}, \vari{sgn}_\vari{i}) \gets  $ \sampmon($\etree_\vari{mod}$)\label{alg:mon-sam-sample}
%				\State $\vari{Y}_\vari{i} \gets 1$\label{alg:mon-sam-assign1}
%				\For{$\vari{x}_{\vari{j}}$ \text{ in } $\vari{M}_{\vari{i}}$}
%					\State $\vari{Y}_\vari{i} \gets \vari{Y}_\vari{i} \times \; \vari{\prob}_\vari{j}$\label{alg:mon-sam-product2} \Comment{$\vari{p}_\vari{j}$ is the assignment to $\vari{x}_\vari{j}$ from input $\vct{p}$}
%				\EndFor
%				\State $\vari{Y}_\vari{i} \gets \vari{Y}_\vari{i} \times\; \vari{sgn}_\vari{i}$\label{alg:mon-sam-product}
%			\State $\accum \gets \accum + \vari{Y}_\vari{i}$\Comment{Store the sum over all samples}\label{alg:mon-sam-add}
%		\EndFor
%
%		\State  $\vari{acc} \gets \vari{acc} \times \frac{\vari{size}}{\numsamp}$\label{alg:mon-sam-global3}
%		\State \Return \vari{acc}
%	\end{algorithmic}
%\end{algorithm}

%BIDB Version of Approximation Algorithm


\begin{algorithm}[H]
	\caption{$\approxq(\etree, \vct{p}, \conf, \error)$}
	\label{alg:mon-sam}
	\begin{algorithmic}[1]
		\Require \etree: Binary Expression Tree
		\Require $\vct{p} = (\prob_1,\ldots, \prob_\numvar)$ $\in [0, 1]^\numvar$
		\Require $\conf$ $\in [0, 1]$
		\Require $\error$ $\in [0, 1]$
		%\Require $\abs{\block} \in \mathbb{N}$%\bivec$ $\in [0, 1]^{\abs{\block}}$
		\Ensure \vari{acc} $\in \mathbb{R}$

		%\State $\vari{sample}_\vari{next} \gets 0$
		\State $\accum \gets 0$\label{alg:mon-sam-global1}
		\State $\numsamp \gets \ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$\label{alg:mon-sam-global2}
		\State $(\vari{\etree}_\vari{mod}, \vari{size}) \gets $ \onepass($\etree$)\label{alg:mon-sam-onepass}\Comment{$\onepass$ is ~\cref{alg:one-pass}}
		%\newline
		%\State $\vari{i} \gets 1$
		\For{$\vari{i} \in 1 \text{ to }\numsamp$}\label{alg:sampling-loop}\Comment{Perform the required number of samples}
			%\State $\bivec \gets [0]^{\abs{\block}}$\Comment{$\bivec$ is an array whose size is the number of blocks, used to check for cross-terms}\newline
			\State $(\vari{M}, \vari{sgn}_\vari{i}) \gets  $ \sampmon($\etree_\vari{mod}$)\label{alg:mon-sam-sample}\Comment{\sampmon \; is ~\cref{alg:sample}}
			%\For{$\vari{x}_\vari{\block,i}$ \text{ in } $\vari{M}$}
		%		\If{$\bivec[\block] = 1$}\label{alg:mon-sam-check}\Comment{If we have already had a variable from this block, $\rpoly$ drops the sample.}
		%		\newline
		%			\State $\vari{sample}_{\vari{next}} \gets 1$
		%			\State break
		%		\Else
		%			\State $\bivec[\block] = 1$
%				\State $\vari{sum} = 0$
%				\For{$\ell \in [\abs{\block}]$}
%					\State $\vari{sum} = \vari{sum} + \bivec[\block][\ell]$
%				\EndFor
%				\If{$\vari{sum} \geq 2$}
%					\State $\vari{sample}_{\vari{next}} \gets 1$
%					\State continue\Comment{Not sure for psuedo code the best way to state this, but this is analogous to C language continue statement.}
		%		\EndIf
		%	\EndFor
		%	\If{$\vari{sample}_{\vari{next}} = 1$}\label{alg:mon-sam-drop}
		%		\State $\vari{sample}_{\vari{next}} \gets 0$\label{alg:mon-sam-resamp}
		%	\Else
                        \If{$\vari{M}$ has at most one variable from each block}\label{alg:check-duplicate-block}
				\State $\vari{Y}_\vari{i} \gets \prod_{X_j\in\var\inparen{\vari{M}}}p_j$\label{alg:mon-sam-assign1}%\newline
				%\For{$\vari{x}_{\vari{j}}$ \text{ in } $\vari{M}$}%_{\vari{i}}$}
				%	\State $\vari{Y}_\vari{i} \gets \vari{Y}_\vari{i} \times \; \vari{\prob}_\vari{j}$\label{alg:mon-sam-product2} \Comment{$\vari{p}_\vari{j}$ is the assignment to $\vari{x}_\vari{j}$ from input $\vct{p}$}
				%\EndFor
				\State $\vari{Y}_\vari{i} \gets \vari{Y}_\vari{i} \times\; \vari{sgn}_\vari{i}$\label{alg:mon-sam-product}
			\State $\accum \gets \accum + \vari{Y}_\vari{i}$\Comment{Store the sum over all samples}\label{alg:mon-sam-add}
			%\State $\vari{i} \gets \vari{i} + 1$
			\EndIf
		\EndFor

		%\State $\gamma \gets $ $\algname{Estimate}$ $\gamma(\etree, \numsamp, \abs{\block})$
		\State  $\vari{acc} \gets \vari{acc} \times \frac{\vari{size}}{\numsamp}$\label{alg:mon-sam-global3}
		\State \Return \vari{acc}
	\end{algorithmic}
\end{algorithm}

%\begin{algorithm}[H]
%	\caption{$\algname{Estimate}$ $\gamma(\etree, \numsamp, \abs{\block})$}
%	\label{alg:est-gamma}
%	\begin{algorithmic}[1]
%		\Require \etree: Binary Expression Tree
%		\Require $\numsamp \in \mathbb{N}$
%		\Require $\abs{\block} \in \mathbb{N}$
%		\Ensure \vari{cTerms} $]in \mathbb{R}$
%
%		\State $\vari{cTerms} \gets 0$
%		\State $\vari{isCross} \gets 0$
%		\For{$\vari{i} \text{ in } 1 \text{ to } \numsamp$}
%			\State $\bivec \gets [0]^{\abs{\block}}$
%			\State $(\vari{M}, \vari{sgn}) \gets  $ \sampmon($\etree_\vari{mod}$)
%			\For{$\vari{x}_{\vari{b}, \vari{j}} \text{ in } \vari{M}$}
%				\If{$\bivec[b] = 1$}
%					\State $\vari{isCross} \gets 1$
%					\State Break
%				\Else
%					\State $\bivec[b] \gets 1$
%				\EndIf
%			\EndFor
%			\If{$\vari{isCross} = 1$}
%				\State $\vari{cTerms} \gets \vari{cTerms} + 1$
%				\State $\vari{isCross} \gets 0$
%			\EndIf
%		\EndFor
%		\State \Return $\frac{\vari{cTerms}}{\numsamp}$
%	\end{algorithmic}
%\end{algorithm}


\subsubsection{Correctness}

In order to prove~\Cref{lem:approx-alg}, we will need to argue the correctness of~\cref{alg:mon-sam}. Before we formally do that,
we first state the lemmas that summarize the relevant properties of $\onepass$ and $\sampmon$, the auxiliary algorithms on which~\cref{alg:mon-sam} relies.  Their proofs are given in~\Cref{sec:onepass} and~\Cref{sec:samplemonomial}, respectively.


\begin{Lemma}\label{lem:one-pass}
The $\onepass$ function completes in $O(\treesize(\etree))$ time.  After $\onepass$ returns, the following postconditions hold.  First, for each subtree $\vari{S}$ of $\etree$, we have that $\vari{S}.\vari{partial}$ is set to $\abs{\vari{S}}(1,\ldots, 1)$.  Second, when $\vari{S}.\type  = +$, for each $\vari{child}$ of $\vari{S}$, $\vari{child}.\vari{weight}$ is set to $\frac{\abs{\vari{S}_{\vari{child}}}(1,\ldots, 1)}{\abs{\vari{S}}(1,\ldots, 1)}$. % is correctly computed for each child of $\vari{S}.$
\end{Lemma}
In proving correctness of~\Cref{alg:mon-sam}, we will only use the following fact (which follows from the above lemma): $\etree_{\vari{mod}}.\vari{partial}=\abs{\etree}(1,\dots,1)$.
%\AH{I'm wondering if there is a better notation to use here.  I myself got confused by my own notation of $\etree_{\vari{mod}}$.  \emph{But}, we need to to be referencing the modified $\etree$ returned by $\onepass$ in the algorithm, so maybe this is the best we can do?}
%\AR{yeah, I think this is fine.}
%At the conclusion of $\onepass$, $\etree.\vari{partial}$ will hold the sum of all coefficients in $\expandtree{\abs{\etree}}$, i.e., $\sum\limits_{(\monom, \coef) \in \expandtree{\abs{\etree}}}\coef$.  $\etree.\vari{weight}$ will hold the weighted probability that $\etree$ is sampled from from its parent $+$ node.

\begin{Lemma}\label{lem:sample}
The function $\sampmon$ completes in $O(\log{k} \cdot k \cdot depth(\etree))$ time, where $k = \degree(\polyf(\abs{\etree}))$.  Upon completion, for every $(\monom, \coef)\in \expandtree{\etree}$, the pair $\left(\monom, sign(\coef)\right)$ is returned with probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$. %, $\sampmon$ returns the sampled term $\left(\monom, sign(\coef)\right)$ from $\expandtree{\abs{\etree}}$.
\end{Lemma}

Armed with the above two lemmas, we are ready to argue the following result:
\begin{Theorem}\label{lem:mon-samp}
%If the contracts for $\onepass$ and $\sampmon$ hold, then 
For any $\etree$ with $\degree(\polyf(\abs{\etree})) = k$, Algorithm~\ref{alg:mon-sam} outputs an estimate $\vari{acc}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ such that %$\expct\pbox{\empmean} = \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)\cdot(1 - \gamma)}{\abs{\etree}(1,\ldots, 1)}$.  %within an additive $\error \cdot \abs{\etree}(1,\ldots, 1)$ error with
\[P\left(\left|\vari{acc} - \rpoly(\prob_1,\ldots, \prob_\numvar)\right|> \error \cdot \abs{\etree}(1,\ldots, 1)\right) \leq \conf,\]
 in $O\left(\treesize(\etree)\right.$ $+$ $\left.\left(\frac{\log{\frac{1}{\conf}}}{\error^2} \cdot k \cdot\log{k} \cdot depth(\etree)\right)\right)$ time.
\end{Theorem}

Before proving~\Cref{lem:mon-samp}, we use it to argue our main result:
\begin{proof}[Proof of Theorem \ref{lem:approx-alg}]
%\begin{Corollary}\label{cor:adj-err}
Set $\mathcal{E}=\approxq(\etree, (p_1,\dots,p_\numvar),$ $\conf, \error)$, where
\[\error = \error' \cdot \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}.\]
By~\Cref{lem:mon-samp}, the additive error $\error\cdot\abs{\etree}(1,\ldots, 1)$ then equals $\error'\cdot\rpoly(\prob_1,\ldots, \prob_\numvar)$, which achieves the claimed accuracy bound on $\mathcal{E}$.
% achieves $1 \pm \epsilon$ multiplicative error bounds, in $O\left(\treesize(\etree) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)}{\error^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)(1 - \gamma)^2}\right)$.
%\end{Corollary}

%Since it is the case that we have $\error \cdot \abs{\etree}(1,\ldots, 1)$ additive error, one can set $\error = \error \cdot \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)\cdot (1 - \gamma)}{\abs{\etree}(1,\ldots, 1)}$, yielding a multiplicative error proportional to $\rpoly(\prob_1,\ldots, \prob_\numvar)$.  This only affects the runtime in the number of samples taken, changing the first factor of the second summand of the original runtime accordingly.

%The derivation over the number of samples is then
The claim on the runtime follows since
\begin{align*}
\frac 1{\error^2}\cdot \log\inparen{\frac 1\conf}=&\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2 \left(\frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}\right)^2}\\
= &\frac{\log{\frac{1}{\conf}}\cdot \abs{\etree}^2(1,\ldots, 1)}{\inparen{\error'}^2 \cdot \rpoly^2(\prob_1,\ldots, \prob_\numvar)},
\end{align*}
%and the runtime then follows, thus upholding ~\cref{lem:approx-alg}.
which completes the proof.
\end{proof}


We now return to the proof of~\Cref{lem:mon-samp}:
\begin{proof}[Proof of Theorem \ref{lem:mon-samp}]
%As previously noted, by lines ~\ref{alg:mon-sam-check} and ~\ref{alg:mon-sam-drop} the algorithm will resample when it encounters a sample with variables from the same block.  The probability of sampling such a monomial is $\gamma$.

%Now, consider $\expandtree{\etree}$ and let $(\monom, \coef)$ be an arbitrary tuple in $\expandtree{\etree}$.  For convenience, over an alphabet $\Sigma$ of size $\numvar$, define
%\begin{equation*}
%\evalmp: \left(\left\{\monom^a~|~\monom \in \Sigma^b, a \in \mathbb{N}, b \in [k]\right\}, [0, 1]^\numvar\right)\mapsto \mathbb{R},
%\end{equation*}
%a function that takes a monomial $\monom$ in $\left\{\monom^a ~|~ \monom \in \Sigma^b, a \in \mathbb{N}, b \in [k]\right\}$ and probability vector $\vct{p}$ (introduced in ~\cref{subsec:def-data}) as input and outputs the evaluation of $\monom$ over $\vct{p}$.  By ~\cref{lem:sample}, the sampling scheme samples $(\monom, \coef)$ in $\expandtree{\etree}$ with probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$.   Note that $\coef \cdot \evalmp(\monom, \vct{p})$ is the value of $(\monom, \coef)$ in $\expandtree{\etree}$ when all variables in $\monom$ are assigned their corresponding probabilities.

%Let $Vars(\monom) = \{X_{\block, i} \st X_{\block, i} \in \monom\}$.  Define the set of elements containing no cross-terms in $\expandtree{\etree}$ as $\expandtree{\etree}' = \{(\monom, \coef) \st \forall (\monom, \coef) \in \expandtree{\etree}, \forall X_{\block, i}, X_{\block', j} \in Vars(\monom), \block \neq \block'\}$.

%Note again that the sum of $\coef \cdot \evalmp(\monom, \vct{p})$ over $\expandtree{\etree}'$ is equivalently $\rpoly(\prob_1,\ldots, \prob_\numvar)$.

Consider now the random variables $\randvar_1,\dots,\randvar_\samplesize$, where each $\randvar_i$ is the value of $\vari{Y}_{\vari{i}}$ after~\Cref{alg:mon-sam-product} is executed. In particular, note that we have
\[\randvar_i= \onesymbol\inparen{\monom\mod{\mathcal{B}}\not\equiv 0}\cdot \vari{sgn}_{\vari{i}}\cdot \prod_{X_j\in \var\inparen{\monom}} p_j,\]
where $\monom$ is the monomial sampled in iteration $i$ and the indicator variable handles the check in~\Cref{alg:check-duplicate-block}.
Then for random variable $\randvar_i$, it is the case that

\[\expct\pbox{\randvar_i} = \sum\limits_{(\monom, \coef) \in \expandtree{\etree} }\frac{\onesymbol\inparen{\monom\mod{\mathcal{B}}\not\equiv 0}\cdot \coef\cdot\prod_{X_j\in \var\inparen{\monom}} p_j }{\abs{\etree}(1,\dots,1)} = \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)},\]
where in the first equality we use the fact that $\vari{sgn}_{\vari{i}}\cdot \abs{\coef}=\coef$ and the second equality follows from~\cref{eq:tilde-Q-bi} with $X_i$ substituted by $\prob_i$.
%\AH{I have always kind of 'tripped' when folks talk like this.  Isn't it more accurate to say that the last equality follows by the \emph{construction} of~\cref{eq:tilde-Q-bi}, and this construction is equivalent of $\rpoly(\prob_1,\ldots, \prob_\numvar)$?}
%\AR{Added that the $X_i$ are subtituted by $p_i$. But it seems like you are tripping on something else. I'm not sure what you mean by \emph{construction} of~\cref{eq:tilde-Q-bi}? \cref{eq:tilde-Q-bi} is an identity and we just use it here.}
% = \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)\cdot (1 - \gamma)}{\abs{\etree}(1,\ldots, 1)}.\]  

Let $\empmean = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i$.  It is also true that

\[\expct\pbox{\empmean}  %\expct\pbox{ \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i} 
= \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\expct\pbox{\randvar_i}
%&= \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\sum\limits_{(\monom, \coef) \in \expandtree{\etree}'}\frac{\coef \cdot \evalmp(\monom, \vct{p})}{\sum\limits_{(\monom, \coef) \in \expandtree{\etree}'}} 
= \frac{\rpoly(\prob_1,\ldots, \prob_\numvar)}{\abs{\etree}(1,\ldots, 1)}.\]

Hoeffding's inequality %can be used to compute an upper bound on the number of samples $\samplesize$ needed to establish the $(\error, \conf)$-bound.  The inequality 
states that if each $\randvar_i$ (all of which are independent) always lies in the interval $[a_i, b_i]$, then it is true that
\begin{equation*}
P\left(\left|\empmean - \expct\pbox{\empmean}\right| \geq \error\right) \leq 2\exp{\left(-\frac{2\samplesize^2\error^2}{\sum_{i = 1}^{\samplesize}(b_i -a_i)^2}\right)}.
\end{equation*}

%As implied above, Hoeffding is assuming the sum of random variables be divided by the number of variables.  Since $\rpoly(\prob_1,\ldots, \prob_\numvar)\cdot(1 - \gamma) = \expct\pbox{\empmean} \cdot \abs{\etree}(1,\ldots, 1)$, then our estimate is the sum of random samples multiplied by $\frac{\abs{\etree}(1,\ldots, 1)}{\samplesize \cdot (1 - \gamma)}$.  This computation is performed on ~\cref{alg:mon-sam-global3}.
%Also see that to properly estimate $\rpoly$, it is necessary to multiply by the number of monomials in $\rpoly$, i.e. $\abs{\etree}(1,\ldots, 1)$.  Therefore it is the case that $\frac{acc}{N}$ gives the estimate of one monomial, and multiplying by $\abs{\etree}(1,\ldots, 1)$ yields the estimate of $\rpoly(\prob_1,\ldots, \prob_\numvar)$.  This scaling is performed in line ~\ref{alg:mon-sam-global3}.

Line~\ref{alg:mon-sam-sample} shows that $\vari{sgn}_\vari{i}$ has a value in $\{-1, 1\}$.  Since this value is multiplied with $O(k)$ factors %at most $\degree(\polyf(\abs{\etree}))$ factors from $\vct{p}$ (\cref{alg:mon-sam-product2}) such that each 
$p_i\in [0, 1]$, the range for each $\randvar_i$ is $[-1, 1]$. % Bounding Hoeffding's results by $\conf$ ensures confidence no less than $1 - \conf$.  Then by upper bounding Hoeffding with $\frac{\conf}{2}$ (since we take an additional estimate of $\gamma$), it is the case that
Using Hoeffding's inequality, we then get:
\begin{equation*}
P\pbox{~\left| \empmean - \expct\pbox{\empmean} ~\right| \geq \error} \leq 2\exp{\left(-\frac{2\samplesize^2\error^2}{2^2 \samplesize}\right)} = 2\exp{\left(-\frac{\samplesize\error^2}{2 }\right)}\leq \conf,
\end{equation*}
where the last inequality follows from our choice of $\samplesize$ in~\Cref{alg:mon-sam-global2}.
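
As a sanity check on the last step (an illustrative calculation under hypothetical parameter values, not a restatement of the exact constant chosen in the algorithm), solving the final bound for $\samplesize$ gives
\begin{equation*}
2\exp{\left(-\frac{\samplesize\error^2}{2}\right)} \leq \conf \quad\Longleftrightarrow\quad \samplesize \geq \frac{2\ln\left(2/\conf\right)}{\error^2},
\end{equation*}
so any number of samples at least this large suffices.  For instance, for the hypothetical values $\error = 0.1$ and $\conf = 0.05$, the right-hand side is $2\ln(40)/0.01 \approx 737.8$, i.e., $738$ samples are enough.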
%\AH{What do you mean by the last inequality following from our choic of $\samplesize$?  do you mean that our choice of $\samplesize$ is governed by the value of $\conf$?}
%\AR{Added in refernce to relevant line number.}
%Solving for the number of samples $\samplesize$ we get
%\begin{align}
%&\frac{\conf}{2} \geq  2\exp{-\left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)}\label{eq:hoeff-1}\\
%&\frac{\conf}{2} \geq \exp{-\left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)}\label{eq:hoeff-2}\\
%&\frac{2}{\conf} \leq \exp{\left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)}\label{eq:hoeff-3}\\
%&\log{\frac{2}{\conf}} \leq \left(\frac{2\samplesize^2\error^2}{4\samplesize}\right)\label{eq:hoeff-4}\\
%&\log{\frac{2}{\conf}} \leq \frac{\samplesize\error^2}{2}\label{eq:hoeff-5}\\
%&\frac{2\log{\frac{4}{\conf}}}{\error^2} \leq \samplesize.\label{eq:hoeff-6}
%\end{align}

%By Hoeffding we obtain the number of samples necessary to achieve the claimed additive error bounds.

This concludes the proof of the first claim of~\Cref{lem:mon-samp}.

\paragraph{Run-time Analysis}
%For a $\bi$ instance, it is possible that cancellations can occur as seen in ~\cref{alg:mon-sam-drop}, and by ~\cref{alg:mon-sam-resamp} the algorithm will then re-sample.  This affects the overall runtime.  Let us denote by $\gamma$ the number of cancellations.

%Note that lines ~\ref{alg:mon-sam-global1}, ~\ref{alg:mon-sam-global2}, and ~\ref{alg:mon-sam-global3} are $O(1)$ global operations.  The call to $\onepass$ in line ~\ref{alg:mon-sam-onepass} by lemma ~\ref{lem:one-pass} is $O(\treesize(\etree))$ time.
%First, algorithm ~\ref{alg:mon-sam} calls \textsc{OnePass} which takes $O(|\etree|)$ time.
%Then for $\numsamp = \ceil{\frac{2 \log{\frac{4}{\conf}}}{\error^2}}$, the $O(1)$ assignment, product, and addition operations occur.  Over the same $\numsamp$ iterations, $\sampmon$ is called, with a runtime of $O(\log{k}\cdot k \cdot depth(\etree))$ by lemma ~\ref{lem:sample}.  Finally, over the same iterations, because $\degree(\polyf(\abs{\etree})) = k$, the assignment and product operations of line ~\ref{alg:mon-sam-product2} are called at most $k$ times.

%Thus we have $O(\treesize(\etree)) + O(\left(\frac{\log{\frac{1}{\conf}}}{\error^2}\right) \cdot \left(k + \log{k}\cdot k \cdot depth(\etree)\right) = O\left(\treesize(\etree) + \left(\left(\frac{\log{\frac{1}{\conf}}}{\error^2}\right) \cdot \left(k \cdot\log{k} \cdot depth(\etree)\right)\right)\right)$ overall running time.
The runtime of the algorithm is dominated by~\Cref{alg:mon-sam-onepass} (which by~\Cref{lem:one-pass} takes time $O(\treesize(\etree))$) and the $\samplesize$ iterations of the loop in~\Cref{alg:sampling-loop}. Each iteration's runtime is dominated by the call in~\Cref{alg:mon-sam-sample} (which by~\Cref{lem:sample} takes $O(\log{k} \cdot k \cdot depth(\etree))$ time) and by the check in~\Cref{alg:check-duplicate-block}, which takes $O(k\log{k})$ time: we sort the $O(k)$ variables by their block IDs and then scan the sorted list for a duplicate block ID. Summing these costs gives the desired overall runtime.
\end{proof}

\qed


\subsection{\onepass\ Algorithm}
\label{sec:onepass}

%\subsubsection{Description}
%Algorithm ~\ref{alg:one-pass} satisfies the requirements of lemma ~\ref{lem:one-pass}.

The evaluation of $\abs{\etree}(1,\ldots, 1)$ can be defined recursively, as follows (where $\etree_\lchild$ and $\etree_\rchild$ are the `left' and `right' children of $\etree$ if they exist):


\begin{align}
\label{eq:T-all-ones}
\abs{\etree}(1,\ldots, 1) = \begin{cases}
						\abs{\etree_\lchild}(1,\ldots, 1) \cdot \abs{\etree_\rchild}(1,\ldots, 1)	&\textbf{if }\etree.\type = \times\\
						\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)		&\textbf{if }\etree.\type = + \\
						 |\etree.\val|											&\textbf{if }\etree.\type = \tnum\\
						1													&\textbf{if }\etree.\type = \var.
					\end{cases}
\end{align}

%\begin{align*}
%&\eval{\etree ~|~ \etree.\type = +}_{\abs{\etree}} =&& \eval{\etree_\lchild}_{\abs{\etree}} + \eval{\etree_\rchild}_{\abs{\etree}}\\
%&\eval{\etree ~|~ \etree.\type = \times}_{\abs{\etree}} = && \eval{\etree_\lchild}_{\abs{\etree}} \cdot \eval{\etree_\rchild}_{\abs{\etree}}\\
%&\eval{\etree ~|~ \etree.\type = \tnum}_{\abs{\etree}} = && \etree.\val\\
%&\eval{\etree ~|~ \etree.\val = \var}_{\abs{\etree}} = && 1
%\end{align*}

%In the same fashion the weighted distribution can be described as above with the following modification for the case when $\etree.\type = +$:
It turns out that for the proof of~\Cref{lem:sample}, we need to argue that when $\etree.\type = +$, we indeed have
\begin{align}
\label{eq:T-weights}
%&\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1); &\textbf{if }\etree.\type = + \\
\etree_\lchild.\vari{weight} &\gets \frac{\abs{\etree_\lchild}(1,\ldots, 1)}{\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)};\\
\etree_\rchild.\vari{weight} &\gets \frac{\abs{\etree_\rchild}(1,\ldots, 1)}{\abs{\etree_\lchild}(1,\ldots, 1)+ \abs{\etree_\rchild}(1,\ldots, 1)}
\end{align}

%\begin{align*}
%&\eval{\etree~|~\etree.\type = +}_{\wght} =&&\eval{\etree_\lchild}_{\abs{\etree}} + \eval{\etree_\rchild}_{\abs{\etree}}; \etree_\lchild.\wght = \frac{\eval{\etree_\lchild}_{\abs{\etree}}}{\eval{\etree_\lchild}_{\abs{\etree}} + \eval{\etree_\rchild}_{\abs{\etree}}}; \etree_\rchild.\wght = \frac{\eval{\etree_\rchild}_{\abs{\etree}}}{\eval{\etree_\lchild}_{\abs{\etree}} + \eval{\etree_\rchild}_{\abs{\etree}}}
%\end{align*}
Algorithm~\ref{alg:one-pass} essentially implements the above definitions.


%\subsubsection{Psuedo Code}
%See algorithm ~\ref{alg:one-pass} for details.
\begin{algorithm}[h!]
	\caption{\onepass$(\etree)$}
	\label{alg:one-pass}
\begin{algorithmic}[1]
	\Require \etree: Binary Expression Tree
	\Ensure \etree: Binary Expression Tree
	\Ensure \vari{sum} $\in \mathbb{R}$
	\If{$\etree.\type = +$}\label{alg:one-pass-equality1}
		\State $\accum \gets 0$\label{alg:one-pass-plus-assign1}
		\For{$child$ in $\etree.\vari{children}$}\Comment{Sum up all children coefficients}
			\State $(child, \vari{s}) \gets \onepass(child)$
			\State $\accum \gets \accum + \vari{s}$\label{alg:one-pass-plus-add}
		\EndFor
		\State $\etree.\vari{partial} \gets \accum$\label{alg:one-pass-plus-assign2}
		\For{$child$ in $\etree.\vari{children}$}\Comment{Record distributions for each child}
			\State $child.\vari{weight} \gets \frac{child.\vari{partial}}{\etree.\vari{partial}}$\label{alg:one-pass-plus-prob}
		\EndFor
		%\State $\vari{sum} \gets \etree.\vari{partial}$\label{alg:one-pass-plus-assign3}
		\State \Return (\etree, \etree.\vari{partial})
	\ElsIf{$\etree.\type = \times$}\label{alg:one-pass-equality2}
		\State $\accum \gets 1$\label{alg:one-pass-times-assign1}
		\For{$child \text{ in } \etree.\vari{children}$}\Comment{Compute the product of all children coefficients}
			\State $(child, \vari{s}) \gets \onepass(child)$
			\State $\accum \gets \accum \times \vari{s}$\label{alg:one-pass-times-product}
		\EndFor
		\State $\etree.\vari{partial}\gets \accum$\label{alg:one-pass-times-assign2}
		%\State $\vari{sum} \gets \etree.\vari{partial}$\label{alg:one-pass-times-assign3}
		\State \Return (\etree, \etree.\vari{partial})
	\ElsIf{$\etree.\type = \tnum$}\Comment{Base case}\label{alg:one-pass-equality3}
		\State $\etree.\vari{partial} \gets |\etree.\val|$\label{alg:one-pass-leaf-assign1}\Comment{This step effectively converts $\etree$ into $\abs{\etree}$}
		\State \Return (\etree, \etree.\vari{partial})
	\Else\Comment{$\etree.\type = \var$}\label{alg:one-pass-equality4}
		\State $\etree.\vari{partial} \gets 1$\Comment{Stored so that a parent $+$ node can compute this leaf's weight}
		\State \Return (\etree, \etree.\vari{partial})
	\EndIf
\end{algorithmic}
\end{algorithm}

\begin{Example}\label{example:one-pass}
 Let $\etree$ encode the expression $(x_1 + x_2)(x_1 - x_2) + x_2^2$.  After one pass, \cref{alg:one-pass} would have computed the following weight distribution.  For the two children of the root $+$ node $\etree$, $\etree_\lchild.\wght = \frac{4}{5}$ and $\etree_\rchild.\wght = \frac{1}{5}$.  Similarly, letting $\stree$ denote the left subtree of $\etree_{\lchild}$, we have $\stree_\lchild.\wght = \stree_\rchild.\wght = \frac{1}{2}$.  This is depicted in~\Cref{fig:expr-tree-T-wght}. %Note that in this example, the sampling probabilities for the children of each inner $+$ node of $\stree$ are equal to one another because both parents have the same number of children, and, in each case, the children of each parent $+$ node share the same $|\coef_i|$.
\end{Example}
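
As a sanity check, the weights in~\cref{example:one-pass} can be traced by hand through the recursion in~\cref{eq:T-all-ones}.  Writing $\stree$ and $\stree'$ for the left and right children of $\etree_\lchild$ (as labeled in~\Cref{fig:expr-tree-T-wght}), the bottom-up evaluation proceeds as follows:
\begin{align*}
\abs{\stree}(1,\ldots, 1) &= 1 + 1 = 2,\\
\abs{\stree'}(1,\ldots, 1) &= 1 + |-1| \cdot 1 = 2,\\
\abs{\etree_\lchild}(1,\ldots, 1) &= 2 \cdot 2 = 4,\\
\abs{\etree_\rchild}(1,\ldots, 1) &= 1 \cdot 1 = 1,\\
\abs{\etree}(1,\ldots, 1) &= 4 + 1 = 5.
\end{align*}
Dividing the value of each child of a $+$ node by the value of its parent then gives the weights $\frac{4}{5}$ and $\frac{1}{5}$ at the root, and $\frac{1}{2}$ for each child of $\stree$ and of $\stree'$.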

\begin{figure}[h!]
	\begin{tikzpicture}[thick, every tree node/.style={default_node, thick, draw=black, black, circle, text width=0.3cm, font=\bfseries, minimum size=0.65cm}, every child/.style={black}, edge from parent/.style={draw, thick},
level 1/.style={sibling distance=0.95cm},
level 2/.style={sibling distance=0.7cm},
%level 2+/.style={sibling distance=0.625cm}
%level distance = 1.25cm,
%sibling distance = 1cm,
%every node/.append style = {anchor=center}
]

	\Tree [.\node(root){$\boldsymbol{+}$};
			\edge [wght_color] node[midway, auto= right, font=\bfseries, gray] {$\bsym{\frac{4}{5}}$}; [.\node[highlight_color](tl){$\boldsymbol{\times}$};
				[.\node(s){$\bsym{+}$};
					\edge[wght_color] node[pos=0.35, left, font=\bfseries, gray]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](sl){$\bsym{x_1}$}; ]
					\edge[wght_color] node[pos=0.35, right, font=\bfseries, gray]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](sr){$\bsym{x_2}$}; ]
					]
				[.\node(sp){$\bsym{+}$};
					\edge[wght_color] node[pos=0.35, left, font=\bfseries, gray]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](spl){$\bsym{x_1}$}; ]
					\edge[wght_color] node[pos=0.35, right, font=\bfseries, gray]{$\bsym{\frac{1}{2}}$}; [.\node[highlight_color](spr){$\bsym{\times}$};
						[.$\bsym{-1}$ ] [.$\bsym{x_2}$ ]
						]
					]
				]
			\edge [wght_color] node[midway, auto=left, font=\bfseries, gray] {$\bsym{\frac{1}{5}}$}; [.\node[highlight_color](tr){$\boldsymbol{\times}$};
				[.$\bsym{x_2}$
					\edge [draw=none]; [.\node[draw=none]{}; ]
					\edge [draw=none]; [.\node[draw=none]{}; ]
				]
				[.$\bsym{x_2}$ ] ]
	]
%	labels for plus node children, with arrows
	\node[left=2pt of sl, highlight_color, inner sep=0pt] (sl-label) {$\stree_\lchild$};
	\draw[highlight_color] (sl) -- (sl-label);
	\node[right=2pt of sr, highlight_color, inner sep=0pt] (sr-label) {$\stree_\rchild$};
	\draw[highlight_color] (sr) -- (sr-label);
	\node[below left=2pt of spl, inner sep=0pt, highlight_color](spl-label) {$\stree_\lchild'$};
	\draw[highlight_color] (spl) -- (spl-label);
	\node[right=2pt of spr, highlight_color, inner sep=0] (spr-label) {$\stree_\rchild'$};
	\draw[highlight_color] (spr) -- (spr-label);
	\node[above left=2pt of tl, inner sep=0pt, highlight_color] (tl-label) {$\etree_\lchild$};
	\draw[highlight_color] (tl) -- (tl-label);
	\node[above right=2pt of tr, highlight_color, inner sep=0pt] (tr-label) {$\etree_\rchild$};
	\node[above = 2pt of root, highlight_color, inner sep=0pt, font=\bfseries] (root-label) {$\etree$};
	\node[above = 2pt of s, highlight_color, inner sep=0pt, font=\bfseries] (s-label) {$\stree$};
	\node[above = 2pt of sp, highlight_color, inner sep=0pt, font=\bfseries] (sp-label) {$\stree'$};
	\draw[highlight_color] (tr) -- (tr-label);
%	\draw[<-|, highlight_color] (s) -- (s-label);
%	\draw[<-|, highlight_color] (sp) -- (sp-label);
%	\draw[<-|, highlight_color]  (root) -- (root-label);
%\node[above right=0.7cm of TR, highlight_color, inner sep=0pt, font=\bfseries] (tr-comment) {$\etree_\rchild$};
%		\draw[<-|, highlight_color] (TR) -- (tr-comment);
	\end{tikzpicture}


%	\begin{tikzpicture}[thick, level distance=1.2cm, level 1/.style={sibling distance= 5cm}, level 2/.style={sibling distance=3cm}, level 3/.style={sibling distance=1.5cm}, level 4/.style={sibling distance= 1cm}, every child/.style={black}]
%		\node[tree_node](root) {$\boldsymbol{+}$}
%			child[red]{node[tree_node](tl) {$\boldsymbol{\times}$}
%				child{node[tree_node] {$\boldsymbol{+}$}
%					child{node[tree_node]{$\boldsymbol{x_1}$}	}
%					child{node[tree_node] {$\boldsymbol{x_2}$}}
%					}
%				child{node[tree_node] {$\boldsymbol{+}$}
%					child{node[tree_node] {$\boldsymbol{x_1}$}}
%						%child[missing]{node[tree_node] {$\boldsymbol{1}$}}
%					child[red]{node[tree_node] {$\boldsymbol{\times}$}
%						child{node[tree_node] {$\boldsymbol{-1}$}}
%						child{node[tree_node] {$\boldsymbol{x_2}$}}
%						}
%					}
%			}
%			child{node[tree_node] {$\boldsymbol{\times}$} edge from parent [red]
%				child{node[tree_node] {$\boldsymbol{x_2}$}}
%				child{node[tree_node] {$\boldsymbol{x_2}$}}
%				};
%		\node[font=\bfseries, red] at (-2.8, -0.2) {$\etree_\lchild.\wght \boldsymbol{= \frac{4}{5} } $};
%	\end{tikzpicture}
	\caption{Weights computed by $\onepass$ in~\cref{example:one-pass}. 
%\AH{I fixed the labels; @atri, let me know if you would rather have the labels positioned in alternative locations.}
%\AR{Looks good-- thanks!}
}
	\label{fig:expr-tree-T-wght}
\end{figure}


We prove the correctness of Algorithm~\ref{alg:one-pass} by proving~\Cref{lem:one-pass}:

\begin{proof}[Proof of~\Cref{lem:one-pass}]
We prove the first part of~\Cref{lem:one-pass}, i.e., correctness, by structural induction over the depth $d$ of the binary tree $\etree$.

For the base case, $d = 0$, the root is a leaf and therefore, by Definition~\ref{def:express-tree}, must be a variable or a coefficient.  When it is a variable, \textsc{OnePass} returns $1$.  In this case $\polyf(\etree) = X_i = \polyf(\abs{\etree})$ for some $i$ in $[\numvar]$, which evaluates to $1$ at the all-ones input, so the returned value $\abs{\etree}(1,\ldots, 1) = 1$ is correct.  When the root is a coefficient, the absolute value of the coefficient is returned, which is indeed $\abs{\etree}(1,\ldots, 1)$.  This proves the base case.
%\AH{The inductive step assumes $k \geq 0$ rather than $k \geq 1$, correct?}
%\AR{yep!}
For the inductive hypothesis, assume that for $d \leq k$ for some $k \geq 0$,~\Cref{lem:one-pass} is true for~\Cref{alg:one-pass}.

We now prove that~\Cref{lem:one-pass} holds for depth $k + 1$.  Notice that $\etree$ has at most two children, $\etree_\lchild$ and $\etree_\rchild$, and that each child has depth $d \leq k$.  Then, by the inductive hypothesis,~\Cref{lem:one-pass} holds for each existing child, and we are left with two possibilities for $\etree$.  The first case is when $\etree$ is a $+$ node.  In this case,~\Cref{alg:one-pass} computes $\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)$ on line~\ref{alg:one-pass-plus-add}, which by~\cref{eq:T-all-ones} is $\abs{\etree}(1,\ldots, 1)$, and hence the claim holds in this case.  For the weight computation of the children of $+$, by lines~\ref{alg:one-pass-plus-add},~\ref{alg:one-pass-plus-assign2}, and~\ref{alg:one-pass-plus-prob},~\Cref{alg:one-pass} computes $\etree_i.\wght = \frac{\abs{\etree_i}(1,\ldots, 1)}{\abs{\etree_\lchild}(1,\ldots, 1) + \abs{\etree_\rchild}(1,\ldots, 1)}$ for each child $\etree_i$, which is indeed as claimed in~\cref{eq:T-weights}.  The second case is when $\etree.\type = \times$.  By the inductive hypothesis, both $\abs{\etree_\lchild}\polyinput{1}{1}$ and $\abs{\etree_\rchild}\polyinput{1}{1}$ have been correctly computed.  On line~\ref{alg:one-pass-times-product},~\Cref{alg:one-pass} then computes the product of the subtree partial values, $\abs{\etree_\lchild}(1,\ldots, 1) \cdot \abs{\etree_\rchild}(1,\ldots, 1)$, which by~\cref{eq:T-all-ones} is $\abs{\etree}(1,\ldots, 1)$.

%That $\onepass$ makes exactly one traversal of $\etree$ follows by noting for lines ~\ref{alg:one-pass-equality1} and ~\ref{alg:one-pass-equality2} are the checks for the non-base cases, where in each matching exactly one recursive call is made on each of $\etree.\vari{children}$.  For the base cases, lines ~\ref{alg:one-pass-equality3} and ~\ref{alg:one-pass-equality4} both return values without making any further recursive calls.  Since all nodes are covered by the cases, and the base cases cover only leaf nodes, it follows that algorithm ~\ref{alg:one-pass} then terminates after it visits every node exactly one time.

%To conclude, note that when $\etree.\type = +$, the compuatation of $\etree_\lchild.\wght$ and $\etree_\rchild.\wght$ are solely dependent on the correctness of $\abs{\etree}\polyinput{1}{1}$, $\abs{\etree_\lchild}\polyinput{1}{1}$, and $\abs{\etree_\rchild}\polyinput{1}{1}$, which have already been argued to be correct.

\paragraph{Run-time Analysis}
The runtime analysis for \textsc{OnePass} is straightforward.  First, each node is visited at most once.  Second, since every node has at most two children, only a constant number of operations is performed at each visited node, regardless of its type.  This yields an overall runtime of $O\left(\treesize(\etree)\right)$.

%Note that line ~\ref{alg:one-pass-equality1}, ~\ref{alg:one-pass-equality2}, and ~\ref{alg:one-pass-equality3} give a constant number of equality checks per node.  Then, for $+$ nodes, lines ~\ref{alg:one-pass-plus-add} and ~\ref{alg:one-pass-plus-prob} perform a constant number of arithmetic operations, while ~\ref{alg:one-pass-plus-assign1} ~\ref{alg:one-pass-plus-assign2}, and ~\ref{alg:one-pass-times-assign3} all have $O(1)$ assignments.  Similarly, when a $\times$ node is visited, lines \ref{alg:one-pass-times-assign1}, \ref{alg:one-pass-times-assign2}, and \ref{alg:one-pass-times-assign3} have $O(1)$ assignments, while line ~\ref{alg:one-pass-times-product} has $O(1)$ product operations per node.  For leaf nodes, ~\cref{alg:one-pass-leaf-assign1} and ~\cref{alg:one-pass-global-assign} are both $O(1)$ assignment.

%Thus, the algorithm visits each node of $\etree$ one time, with a constant number of operations for all of the $+$, $\times$, and leaf nodes, leading to a runtime of $O\left(\treesize(\etree)\right)$, and this completes the proof.
\end{proof}

\qed


\subsection{\sampmon\ Algorithm}
\label{sec:samplemonomial}

%Algorithm ~\ref{alg:sample} takes $\etree$ as input, samples an arbitrary $(\monom, \coef)$ from $\expandtree{\etree}$ with probabilities $\stree_\lchild.\wght$ and $\stree_\rchild.\wght$ for each subtree $\stree$ with $\stree.\type = +$, outputting the tuple $(\monom, \sign(\coef))$.  While one cannot compute $\expandtree{\etree}$ in time better than $O(N^k)$, the algorithm, similar to \textsc{OnePass}, uses a technique on $\etree$ which produces a sample from $\expandtree{\etree}$ without ever materializing $\expandtree{\etree}$.

One way to implement \sampmon\ would be to compute $\expandtree{\etree}$ explicitly and then sample from it. However, materializing $\expandtree{\etree}$ would be too time consuming.
%
Instead,~\Cref{alg:sample} selects a monomial from $\expandtree{\etree}$ by the following top-down traversal.  At a $+$ node, one child subtree is chosen according to the weights previously computed by \onepass.  At a $\times$ node, both children are visited.  All variable leaf nodes of the traversed subgraph are added to a set, and the product of the signs of all coefficient leaf nodes of the traversed subgraph is computed.  The algorithm returns the set of distinct variables of which the monomial is composed, together with the monomial's sign.

%\begin{Definition}[TreeSet]
%A TreeSet is a data structure whose elements form a set, each of which are stored in a binary tree.
%\end{Definition}

We assume a TreeSet data structure that maintains a set with logarithmic-time insertion and linear-time traversal of its elements.

\subsubsection{Pseudo Code}
See Algorithm~\ref{alg:sample} for the details of the $\sampmon$ algorithm.


\begin{algorithm}
	\caption{\sampmon(\etree)}
	\label{alg:sample}
	\begin{algorithmic}[1]
		\Require \etree: Binary Expression Tree
		\Ensure \vari{vars}: TreeSet
		\Ensure \vari{sgn} $\in \{-1, 1\}$
		\Comment{\Cref{alg:one-pass} should have been run before this one} % algorithm ~\ref{alg:sample}}
		\State $\vari{vars} \gets \emptyset$ \label{alg:sample-global1}
		\If{$\etree.\type = +$}\Comment{Sample at every $+$ node}
			\State $\etree_{\vari{samp}} \gets$ Sample from left subtree ($\etree_{\lchild}$) and right subtree ($\etree_{\rchild}$) w.p. $\etree_\lchild.\wght$ and $\etree_\rchild.\wght$. \label{alg:sample-plus-bsamp}
			\State $(\vari{v}, \vari{s}) \gets \sampmon(\etree_{\vari{samp}})$\label{alg:sample-plus-traversal}
%				\State $\vari{vars} \gets \vari{vars} \;\cup \;\{\vari{v}\}$\label{alg:sample-plus-union}
%				\State $\vari{sgn} \gets \vari{sgn} \times \vari{s}$\label{alg:sample-plus-product}
			\State $\Return ~(\vari{v}, \vari{s})$
		\ElsIf{$\etree.\type = \times$}\Comment{Multiply the sampled values of all subtree children}
			\State $\vari{sgn} \gets 1$\label{alg:sample-global2}
			\For {$child$ in $\etree.\vari{children}$}
				\State $(\vari{v}, \vari{s}) \gets \sampmon(child)$
				\State $\vari{vars} \gets \vari{vars} \cup \vari{v}$\label{alg:sample-times-union}\Comment{$\vari{v}$ is itself a set of variables}
				\State $\vari{sgn} \gets \vari{sgn} \times \vari{s}$\label{alg:sample-times-product}
			\EndFor
			\State $\Return ~(\vari{vars}, \vari{sgn})$
		\ElsIf{$\etree.\type = \tnum$}\Comment{The leaf is a coefficient}
			%\State $\vari{sgn} \gets \vari{sgn} \times sign(\etree.\val)$
			\State $\Return ~\left(\{\}, sign(\etree.\val)\right)$\label{alg:sample-num-return}
		\ElsIf{$\etree.\type = \var$}
			%\State $\vari{vars} \gets \vari{vars} \; \cup \; \{\;\etree.\val\;\}\label{alg:sample-var-union}$\Comment{Add the variable to the set}
			\State $\Return~\left(\{\etree.\val\}, 1\right)	$\label{alg:sample-var-return}
		\EndIf
	\end{algorithmic}
\end{algorithm}
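
To illustrate, consider again the tree of~\cref{example:one-pass}, and assume (as the notation $(\monom, \coef) \in \expandtree{\etree}$ suggests) that $\expandtree{\etree}$ lists the terms of the expansion without combining like terms.  Under this assumption $\expandtree{\etree}$ contains the four terms $x_1^2$, $-x_1x_2$, $x_2x_1$, and $-x_2^2$ coming from $\etree_\lchild$, and the term $x_2^2$ coming from $\etree_\rchild$, each with $|\coef| = 1$.  With the weights of~\Cref{fig:expr-tree-T-wght}, each term of $\etree_\lchild$ is reached with probability
\begin{equation*}
\frac{4}{5} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{5},
\end{equation*}
while the single term of $\etree_\rchild$ is reached with probability $\frac{1}{5} \cdot 1 = \frac{1}{5}$.  Both values agree with $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)} = \frac{1}{5}$.  For instance, choosing $\etree_\lchild$ at the root, then $\stree_\lchild$ and $\stree_\rchild'$, makes \sampmon\ return $(\{x_1, x_2\}, -1)$.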

We argue the correctness of Algorithm~\ref{alg:sample} by proving~\Cref{lem:sample}:

\begin{proof}[Proof of~\Cref{lem:sample}]
First, we need to show that $\sampmon$ indeed returns a monomial $\monom$,\footnote{Technically it returns $\var(\monom)$ but for less cumbersome notation we will refer to $\var(\monom)$ simply by $\monom$ in this proof.} such that $(\monom, \coef)$ is in $\expandtree{\etree}$, which we do by induction on the depth of $\etree$.

For the base case, let the depth $d$ of $\etree$ be $0$.  Then the root node is either a constant $\coef$, for which line~\ref{alg:sample-num-return} returns $\{~\}$, or we have that $\etree.\type = \var$ and $\etree.\val = x$, and line~\ref{alg:sample-var-return} returns $\{x\}$.  Both cases sample a monomial%satisfy ~\cref{def:monomial}
, and the base case is proven.

For the inductive hypothesis, assume that for $d \leq k$, for some $k \geq 0$, $\sampmon$ indeed returns a monomial.

For the inductive step, let us take a tree $\etree$ with $d = k + 1$.  Note that each child has depth $d \leq k$, and by the inductive hypothesis both of them return a valid monomial.  Then the root can be either a $+$ or $\times$ node.  For the case of a $+$ root node, line~\ref{alg:sample-plus-bsamp} of $\sampmon$ will choose one of the children of the root.  Since by the inductive hypothesis a monomial is returned from either child, and only one of these monomials is selected, we have for the case of a $+$ root node that a valid monomial is returned by $\sampmon$.  When the root is a $\times$ node, lines~\ref{alg:sample-times-union} and~\ref{alg:sample-times-product} multiply the monomials returned by the two children of the root, and it is trivial to see that %by definition ~\ref{def:monomial} 
the product of two monomials is also a monomial, which means that $\sampmon$ returns a valid monomial for the $\times$ root node, thus concluding the fact that $\sampmon$ indeed returns a monomial.

%Note that for any monomial sampled by algorithm ~\ref{alg:sample}, the nodes traversed form a subgraph of $\etree$ that is \textit{not} a subtree in the general case.  We thus seek to prove that the subgraph traversed produces the correct probability corresponding to the monomial sampled.

We will next prove by induction on the depth $d$ of $\etree$ that $\sampmon$ returns each $(\monom,\coef)$ in $\expandtree{\etree}$ with probability %`that is in accordance with the monomial sampled,
 $\frac{|\coef|}{\abs{\etree}\polyinput{1}{1}}$.

For the base case $d = 0$, by Definition~\ref{def:express-tree} we know that the root has to be either a coefficient or a variable.  In either case, the probability of the value returned is $1$ since there is only one value to sample from.  When the root is a variable $x$, the algorithm correctly returns $(\{x\}, 1 )$.  When the root is a coefficient $\coef$, \sampmon ~correctly returns $(\{~\}, sign(\coef))$.

For the inductive hypothesis, assume that for $d \leq k$, for some $k \geq 0$, $\sampmon$ indeed samples $\monom$ in $(\monom, \coef)$ in $\expandtree{\etree}$ with probability $\frac{|\coef|}{\abs{\etree}\polyinput{1}{1}}$.%bove is true.%lemma ~\ref{lem:sample} is true.

We now prove that the claim holds when $d = k + 1$.  The root of $\etree$ has up to two children $\etree_\lchild$ and $\etree_\rchild$.  Since $\etree_\lchild$ and $\etree_\rchild$ both have depth $d \leq k$, by the inductive hypothesis, $\sampmon$ will sample $\monom_\lchild$ in $(\monom_\lchild, \coef_\lchild)$ of $\expandtree{\etree_\lchild}$ and $\monom_\rchild$ in $(\monom_\rchild, \coef_\rchild)$ of $\expandtree{\etree_\rchild}$, from $\etree_\lchild$ and $\etree_\rchild$ with probability $\frac{|\coef_\lchild|}{\abs{\etree_\lchild}\polyinput{1}{1}}$ and $\frac{|\coef_\rchild|}{\abs{\etree_\rchild}\polyinput{1}{1}}$, respectively.

Then the root has to be either a $+$ or $\times$ node.

Consider the case when the root is $\times$.  Note that we are sampling a term from $\expandtree{\etree}$.  Consider $(\monom, \coef)$ in $\expandtree{\etree}$, where $\monom$ is the sampled monomial.  Then $\monom = \monom_\lchild \times \monom_\rchild$, where $\monom_\lchild$ comes from $\etree_\lchild$ and $\monom_\rchild$ from $\etree_\rchild$.  The probability that \sampmon$(\etree_{\lchild})$ returns $\monom_\lchild$ is $\frac{|\coef_{\monom_\lchild}|}{|\etree_\lchild|(1,\ldots, 1)}$, and the probability that \sampmon$(\etree_{\rchild})$ returns $\monom_\rchild$ is $\frac{|\coef_{\monom_\rchild}|}{\abs{\etree_\rchild}\polyinput{1}{1}}$.  Since $\monom_\lchild$ and $\monom_\rchild$ are sampled with independent randomness, the probability of sampling $\monom$ is $\frac{|\coef_{\monom_\lchild}| \cdot |\coef_{\monom_\rchild}|}{|\etree_\lchild|(1,\ldots, 1) \cdot |\etree_\rchild|(1,\ldots, 1)}$.  For $(\monom, \coef)$ in $\expandtree{\etree}$, it is indeed the case that $|\coef| = |\coef_{\monom_\lchild}| \cdot |\coef_{\monom_\rchild}|$ and that $\abs{\etree}(1,\ldots, 1) = |\etree_\lchild|(1,\ldots, 1) \cdot |\etree_\rchild|(1,\ldots, 1)$, and therefore $\monom$ is sampled with the correct probability $\frac{|\coef|}{\abs{\etree}(1,\ldots, 1)}$.

For the case when $\etree.\type = +$, \sampmon ~will sample monomial $\monom$ from one of its children.  By the inductive hypothesis we know that any $\monom_\lchild$ in $\expandtree{\etree_\lchild}$ and any $\monom_\rchild$ in $\expandtree{\etree_\rchild}$ will be sampled with the correct probabilities $\frac{|\coef_{\monom_\lchild}|}{|\etree_{\lchild}|(1,\ldots, 1)}$ and $\frac{|\coef_{\monom_\rchild}|}{|\etree_\rchild|(1,\ldots, 1)}$, respectively, where either $\monom_\lchild$ or $\monom_\rchild$ will equal $\monom$, depending on whether $\etree_\lchild$ or $\etree_\rchild$ is sampled.  Assume that $\monom$ is sampled from $\etree_\lchild$, and note that a symmetric argument holds for the case when $\monom$ is sampled from $\etree_\rchild$.  Notice also that the probability of choosing $\etree_\lchild$ from $\etree$ is $\frac{\abs{\etree_\lchild}\polyinput{1}{1}}{\abs{\etree_\lchild}\polyinput{1}{1} + \abs{\etree_\rchild}\polyinput{1}{1}}$ as computed by $\onepass$.  Then, since $\sampmon$ goes top-down, and each sampling choice is independent (which follows from the randomness in the root of $\etree$ being independent from the randomness used in its subtrees), the probability for $\monom$ to be sampled from $\etree$ is equal to the product of the probability that $\etree_\lchild$ is sampled from $\etree$ and the probability that $\monom$ is sampled in $\etree_\lchild$, that is,
\begin{align*}
&P(\sampmon(\etree) = \monom) = \\
&P(\sampmon(\etree_\lchild) = \monom) \cdot P(SampledChild(\etree) = \etree_\lchild)\\
&= \frac{|\coef_\monom|}{|\etree_\lchild|(1,\ldots, 1)} \cdot \frac{\abs{\etree_\lchild}(1,\ldots, 1)}{|\etree_\lchild|(1,\ldots, 1) + |\etree_\rchild|(1,\ldots, 1)}\\
&= \frac{|\coef_\monom|}{\abs{\etree}(1,\ldots, 1)},
\end{align*}
and we obtain the desired result.
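
As a sanity check of the sum case, consider the following toy instance, which is our own illustration and not part of the argument above.  Assume that $\etree = \etree_\lchild + \etree_\rchild$, where $\etree_\lchild$ represents $x + 2y$ and $\etree_\rchild$ represents $3x - y$, and that $\expandtree{\cdot}$ lists terms without combining like monomials.  Then $\abs{\etree_\lchild}(1,\ldots, 1) = 3$, $\abs{\etree_\rchild}(1,\ldots, 1) = 4$, and $\abs{\etree}(1,\ldots, 1) = 3 + 4 = 7$.  Under these assumptions, the term $(y, 2)$ coming from $\etree_\lchild$ is sampled with probability
\begin{align*}
\frac{|2|}{3} \cdot \frac{3}{3 + 4} = \frac{2}{7} = \frac{|2|}{\abs{\etree}(1,\ldots, 1)}.
\end{align*}
The terms $(x, 1)$, $(x, 3)$, and $(y, -1)$ are likewise sampled with probabilities $\frac{1}{7}$, $\frac{3}{7}$, and $\frac{1}{7}$, so the four probabilities sum to $1$.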


\paragraph{Run-time Analysis}
We now bound the number of recursive calls in $\sampmon$ by $O\left(k\cdot depth(\etree)\right)$. Note that a sampled monomial corresponds to a subgraph of $\etree$.  Take an arbitrary sample subgraph of expression tree $\etree$ and note that, since every monomial has degree at most $k$, the subgraph has $O(k)$ leaves, and the number of nodes in each level can only decrease as one goes from the leaves to the root. Since the subgraph has depth at most $depth(\etree)$ and each level has $O(k)$ nodes, the subgraph has $O(k\cdot depth(\etree))$ nodes in it. Since each node in the subgraph corresponds to a recursive call, we get the desired bound.
%of degree $k$ and pick an arbitrary level $i$.  Call the number of $\times$ nodes in this level $y_i$, and the total number of nodes $x_i$.  Given that both children of a $\times$ node are traversed in $\sampmon$ while only one child is traversed for a $+$ parent node, note that the number of nodes on level $i + 1$ in the general case is at most $y_i + x_i$, and the increase in the number of nodes from level $i$ to level $i + 1$ is upper bounded by $x_{i + 1} - x_i \leq y_i$.

%Now, we prove by induction on the depth $d$ of tree $\etree$ the following claim.
%\begin{Claim}\label{claim:num-nodes-level-i}
%The number of nodes in a sample subgraph of expression tree $\etree$ at arbitrary level $i$ is bounded by the count of $\times$ nodes in levels $[0, i - 1] + 1$.
%\end{Claim}

%\begin{proof}[Proof of Claim ~\ref{claim:num-nodes-level-i}]
%For the base case, $d = 0$, we have the following cases.  For both cases, when $\etree.\type = \tnum$ and when $\etree.\type = \var$, it is trivial to see that the number of nodes on level $0$ = 1, which satisfies the identity of ~\cref{claim:num-nodes-level-i}, i.e., the number of $\times$ nodes in previous levels $+ 1$ = 1, and the base case is upheld.

%Assume that for $d \leq k$ for $k \geq 0$ that ~\cref{claim:num-nodes-level-i} holds.

%The inductive step is to show that for arbitrary $\etree$ with depth = $d + 1 \leq k + 1$ the claim still holds.  Note that we have two possibilities for the value of $\etree$.  First, $\etree.\type = +$, and it is the case in ~\cref{alg:sample-plus-traversal} that only one of $\etree_\lchild$ or $\etree_\rchild$ are part of the subgraph traversed by $\sampmon$.  By inductive hypothesis, both subtrees satisfy the claim.  Since only one child is part of the subgraph, there is exactly one node at level 1, which, as in the base case analysis, satisfies ~\cref{claim:num-nodes-level-i}.  For the second case, $\etree.\type = \times$, $\sampmon$ traverses both children, and the number of nodes at level $1$ in the subgraph is then $2$, which satisfies ~\cref{claim:num-nodes-level-i} since the sum of $\times$ nodes in previous levels (level $0$) is $1$, and $1 + 1 = 2$, proving the claim.
%\end{proof}

%\qed

%By ~\cref{def:degree}, a sampled monomial will have $O(k)$ $\times$ nodes, and this along with ~\cref{claim:num-nodes-level-i} implies $O(k)$ nodes at $\leq$ $depth(\etree)$ levels of the $\sampmon$ subgraph, bounding the number of recursive calls to $O(k \cdot depth(\etree))$.

%Globally, lines ~\ref{alg:sample-global1} and ~\ref{alg:sample-global2} are $O(1)$ time.  For the $+$ node, line ~\ref{alg:sample-plus-bsamp} has $O(1)$ time by the fact that $\etree$ is binary.  Line ~\ref{alg:sample-plus-union} has $O(\log{k})$ time by nature of the TreeSet data structure and the fact that by definition any monomial sampled from $\expandtree{\etree}$ has degree $\leq k$ and hence at most $k$ distinct variables, which in turn implies that the TreeSet has $\leq k$ elements in it at any time.

%Finally, line ~\ref{alg:sample-times-product} is in $O(1)$ for a product and an assignment operation.  When a times node is visited, the same union, product, and assignment operations take place, and we again have $O(\log{k})$ runtime.  When a variable leaf node is traversed, the same union operation occurs with $O(\log{k})$ runtime, and a constant leaf node has the above mentioned product and assignment operations.  Thus for each node visited, we have $O(\log{k})$ runtime, and the final runtime for $\sampmon$ is $O(\log{k} \cdot k \cdot depth(\etree))$.

It is easy to check that, except for~\Cref{alg:sample-times-union}, all other lines take $O(1)$ time per recursive call. Thus, overall, all lines except for~\Cref{alg:sample-times-union} take $O(k\cdot depth(\etree))$ time. Now consider all executions of~\Cref{alg:sample-times-union} together. We note that at each level we will be adding a given set of variables to some set at most once: since the sum of the sizes of the sets at a given level is at most $k$, each level involves $O(k\log{k})$ time. Thus, overall, all executions of~\Cref{alg:sample-times-union} take $O(k\log{k}\cdot depth(\etree))$ time, as desired.
\end{proof}
\qed

\subsection{Experimental results}
\label{sec:experiments}

\input{experiments}

%\AR{Experimental stuff about BIDB should go in here}
%%%%%%%%%%%%%%%%%%%%%%%