While we would like to take advantage of the space efficiency gained in using a circuit \circuit instead an expression tree \etree, we do not know that such a method exists when computing a sample of the input polynomial representation.
The efficiency gains of circuits over trees is found in the capability of circuits to only require space for each \emph{distinct} term in the compressed representation. This saves space in such polynomials containing non-distinct terms multiplied or added to each other, e.g., $x^4$. However, to avoid biased sampling, it is imperative to sample from both inputs of a multiplication gate, independently, which is indeed the approach of \sampmon.
We first need to show that $\sampmon$ indeed returns a monomial $\monom$,\footnote{Technically it returns $\var(\monom)$ but for less cumbersome notation we will refer to $\var(\monom)$ simply by $\monom$ in this proof.} such that $(\monom, \coef)$ is in $\expansion{\circuit}$, which we do by induction on the depth of $\circuit$.
For the base case, let the depth $d$ of $\circuit$ be $0$. We have that the root node is either a constant $\coef$ for which by line ~\ref{alg:sample-num-return} we return $\{~\}$, or we have that $\circuit.\type=\var$ and $\circuit.\val= x$, and by line ~\ref{alg:sample-var-return} we return $\{x\}$. Both cases sample a monomial%satisfy ~\cref{def:monomial}
, and the base case is proven.
For the inductive hypothesis, assume that for $d \leq k$ for some $k \geq0$, that it is indeed the case that $\sampmon$ returns a monomial.
For the inductive step, let us take a circuit $\circuit$ with $d = k +1$. Note that each input has depth $d \leq k$, and by inductive hypothesis both of them return a valid monomial. Then the root can be either a $\circplus$ or $\circmult$ node. For the case of a $\circplus$ root node, line ~\ref{alg:sample-plus-bsamp} of $\sampmon$ will choose one of the inputs of the root. By inductive hypothesis it is the case that a monomial in \expansion(\circuit) is being returned from either input. Then it follows that for the case of $+$ root node a valid monomial is returned by $\sampmon$. When the root is a $\circmult$ node, line ~\ref{alg:sample-times-union}%and ~\ref{alg:sample-times-product} multiply
computes the set union of the monomials returned by the two inputs of the root, and it is trivial to see
%by definition ~\ref{def:monomial}
%the product of two monomials is also a monomial, and
by ~\Cref{def:expand-circuit} that \monom is a valid monomial in some $(\monom, \coef)\in\expansion{\circuit}$.
We will next prove by induction on the depth $d$ of $\circuit$ that the $(\monom,\coef)\in\expansion{\circuit}$ is the \monom returned by $\sampmon$ with a probability %`that is in accordance with the monomial sampled,
$\frac{|\coef|}{\abs{\circuit}\polyinput{1}{1}}$.
For the base case $d =0$, by definition ~\ref{def:express-tree} we know that the root has to be either a coefficient or a variable. For either case, the probability of the value returned is $1$ since there is only one value to sample from. When the root is a variable $x$ the algorithm correctly returns $(\{x\}, 1)$. When the root is a coefficient, \sampmon ~correctly returns $(\{~\}, sign(\coef_i))$.
For the inductive hypothesis, assume that for $d \leq k$ and $k \geq0$$\sampmon$ indeed samples $\monom$ in $(\monom, \coef)$ in $\expansion{\circuit}$ with probability $\frac{|\coef|}{\abs{\circuit}\polyinput{1}{1}}$.%bove is true.%lemma ~\ref{lem:sample} is true.
We prove now for $d = k +1$ the inductive step holds. It is the case that the root of $\circuit$ has up to two inputs $\circuit_\linput$ and $\circuit_\rinput$. Since $\circuit_\linput$ and $\circuit_\rinput$ are both depth $d \leq k$, by inductive hypothesis, $\sampmon$ will sample both monomials $\monom_\lchild$ in $(\monom_\lchild, \coef_\lchild)$ of $\expansion{\circuit_\linput}$ and $\monom_\rchild$ in $(\monom_\rchild, \coef_\rchild)$ of $\expansion{\circuit_\rinput}$, from $\circuit_\linput$ and $\circuit_\rinput$ with probability $\frac{|\coef_\lchild|}{\abs{\circuit_\linput}\polyinput{1}{1}}$ and $\frac{|\coef_\rchild|}{\abs{\circuit_\rinput}\polyinput{1}{1}}$.
The root has to be either a $\circplus$ or $\circmult$ node.
Consider the case when the root is $\circmult$. Note that we are sampling a term from $\expansion{\circuit}$. Consider $(\monom, \coef)$ in $\expansion{\circuit}$, where $\monom$ is the sampled monomial. Notice also that it is the case that $\monom=\monom_\lchild\circmult\monom_\rchild$, where $\monom_\lchild$ is coming from $\circuit_\linput$ and $\monom_\rchild$ from $\circuit_\rinput$. The probability that \sampmon$(\circuit_{\lchild})$ returns $\monom_\lchild$ is $\frac{|\coef_{\monom_\lchild}|}{|\circuit_\linput|(1,\ldots, 1)}$ and $\frac{|\coef_{\monom_\rchild}|}{\abs{\circuit_\rinput}\polyinput{1}{1}}$ for $\monom_\rchild$. Since both $\monom_\lchild$ and $\monom_\rchild$ are sampled with independent randomness, the final probability for sample $\monom$ is then $\frac{|\coef_{\monom_\lchild}| \cdot |\coef_{\monom_\rchild}|}{|\circuit_\linput|(1,\ldots, 1)\cdot |\circuit_\rinput|(1,\ldots, 1)}$. For $(\monom, \coef)$ in \expansion{\circuit}, it is indeed the case that $|\coef| = |\coef_{\monom_\lchild}| \cdot |\coef_{\monom_\rchild}|$ and that $\abs{\circuit}(1,\ldots, 1)= |\circuit_\linput|(1,\ldots, 1)\cdot |\circuit_\rinput|(1,\ldots, 1)$, and therefore $\monom$ is sampled with correct probability $\frac{|\coef|}{\abs{\circuit}(1,\ldots, 1)}$.
For the case when $\circuit.\val=\circplus$, \sampmon ~will sample monomial $\monom$ from one of its inputs. By inductive hypothesis we know that any $\monom_\lchild$ in $\expansion{\circuit_\linput}$ and any $\monom_\rchild$ in $\expansion{\circuit_\rinput}$ will both be sampled with correct probability $\frac{|\coef_{\monom_\lchild}|}{\circuit_{\lchild}(1,\ldots, 1)}$ and $\frac{|\coef_{\monom_\rchild}|}{|\circuit_\rinput|(1,\ldots, 1)}$, where either $\monom_\lchild$ or $\monom_\rchild$ will equal $\monom$, depending on whether $\circuit_\linput$ or $\circuit_\rinput$ is sampled. Assume that $\monom$ is sampled from $\circuit_\linput$, and note that a symmetric argument holds for the case when $\monom$ is sampled from $\circuit_\rinput$. Notice also that the probability of choosing $\circuit_\linput$ from $\circuit$ is $\frac{\abs{\circuit_\linput}\polyinput{1}{1}}{\abs{\circuit_\linput}\polyinput{1}{1}+\abs{\circuit_\rinput}\polyinput{1}{1}}$ as computed by $\onepass$. Then, since $\sampmon$ goes top-down, and each sampling choice is independent (which follows from the randomness in the root of $\circuit$ being independent from the randomness used in its subtrees), the probability for $\monom$ to be sampled from $\circuit$ is equal to the product of the probability that $\circuit_\linput$ is sampled from $\circuit$ and $\monom$ is sampled in $\circuit_\linput$, and
It is easy to check that except for lines~\ref{alg:sample-times-union} and~\ref{alg:sample-plus-bsamp}, all lines take $O(1)$ time. For \Cref{alg:sample-times-uinon}, consider an execution of~\Cref{alg:sample-times-union}. We note that we will be adding a given set of variables to some set at most once: since the sum of the sizes of the sets at a given level is at most $\degree(\circuit)$, each gate visited takes $O(\log{\degree(\circuit)})$. For \Cref{alg:sample-plus-bsamp}, note that we pick $\circuit_\linput$ with probability $\frac a{a+b}$ where $a=\circuit.\vari{Lweight}$ and $b=\circuit.\vari{Rweight}$. We can implement this step by picking a random number $r\in[a+b]$ and then checking if $r\le a$. It is easy to check that $a+b\le\abs{\circuit}(1,\dots,1)$. This means we need to add and compare $\log{\abs{\circuit}(1,\ldots, 1)}$-bit numbers, which can certainly be done in time $\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}$ (note that this is an over-estimate).
% we have $> O(1)$ time when $\abs{\circuit}(1,\ldots, 1) > \size(\circuit)$. when this is the case that for each sample, we have $\frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\log{\size(\circuit)}}$ operations, since we need to read in and then compare numbers of of $\log{{\abs{\circuit}(1,\ldots, 1)}}$ bits.
Denote \cost(\circuit) (\Cref{eq:cost-sampmon}) to be an upper bound of the number of nodes visited by \sampmon. Then the runtime is $O\left(\cost(\circuit)\cdot\log{\degree(\circuit)}\cdot\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}\right)$.
We now bound the number of recursive calls in $\sampmon$ by $O\left((\degree(\circuit)+1)\right.$$\left.\cdot\right.$$\left.\depth(\circuit)\right)$, which by the above will prove the claimed runtime.
Let \cost$(\cdot)$ be a function that models an upper bound on the number of gates that can be visited in the run of \sampmon. We define \cost$(\cdot)$ recursively as follows.
First note that the number of gates visited in \sampmon is $\leq\cost(\circuit)$. To show that \Cref{eq:cost-sampmon} upper bounds the number of nodes visited by \sampmon, note that when \sampmon visits a gate such that \circuit.\type$=\circmult$, line ~\ref{alg:sample-times-for-loop} visits each input of \circuit, as defined in (\ref{eq:cost-sampmon}). For the case when \circuit.\type$=\circplus$, line ~\ref{alg:sample-plus-bsamp} visits exactly one of the input gates, which may or may not be the subcircuit with the maximum number of gates traversed, which makes \cost$(\cdot)$ an upperbound. Finally, it is trivial to see that when \circuit.\type$\in\{\var, \tnum\}$, i.e., a source gate, that only one gate is visited.
Note that \Cref{eq:strict-upper-bound} implies the claimed runtime. We prove \Cref{eq:strict-upper-bound} for the number of gates traversed in \sampmon using induction over $\depth(\circuit)$. Recall that \reduce has imposed the invariant that all subcircuits \subcircuit in \circuit must have $\subcircuit.\degval\geq1$.
For the base case $\degree(\circuit)=\depth(\circuit)=0$, $\cost(\circuit)=1$, and it is trivial to see that the inequality $2\degree(\circuit)\cdot\depth(\circuit)+1\geq\cost(\circuit)$ holds.
For the inductive hypothesis, we assume the bound holds for a circuit where $\ell\geq\depth(\circuit)\geq1$.
Now consider the case when \sampmon has an arbitrary circuit \circuit input with $\depth(\circuit)=\ell+1$. By definition \circuit.\type$\in\{\circplus, \circmult\}$. Note that since $\depth(\circuit)\geq2$, \circuit must have inputs. Further we know that by the inductive hypothesis the inputs $\circuit_i$ for $i \in\{\linput, \rinput\}$ of the sink gate \circuit uphold the bound
It is also true that $\depth(\circuit_\linput)\leq\depth(\circuit)-1$ and $\depth(\circuit_\rinput)\leq\depth(\circuit)-1$.
If \circuit.\type$=\circplus$, then $\degree(\circuit)=\max\left(\degree(\circuit_\linput), \degree(\circuit_\rinput)\right)$. Otherwise \circuit.\type = $\circmult$ and $\degree(\circuit)=\degree(\circuit_\linput)+\degree(\circuit_\rinput)$. In either case it is true that $\depth(\circuit)=\max(\depth(\circuit_\linput), \depth(\circuit_\rinput))+1$.
Note that by the \emph{reduced} invariant of \reduce, a circuit \circuit with $\depth(\circuit)\geq1$ will always have at least one input with $\degree(\circuit_i)\geq1$. Thus, \Cref{eq:times-lhs-middle-step1} follows, and the inequality is upheld.
Now to justify (\ref{eq:times-rhs}) which holds for the following reasons. First, the RHS%\Cref{eq:times-rhs}
is the result of \Cref{eq:cost-sampmon} when $\circuit.\type=\circmult$. The LHS %\Cref{eq:times-middle}
is then produced by substituting the upperbound of (\ref{eq:ih-bound-cost}) for each $\cost(\circuit_i)$, trivially establishing the upper bound of (\ref{eq:times-rhs}). This proves \Cref{eq:strict-upper-bound} for the $\circmult$ case.
For the case when \circuit.\type$=\circplus$, substituting values yields
Since $\degree_{\max}\cdot\depth_{\max}\geq\degree(\circuit_i)\cdot\depth(\circuit_i),$ the following upper bound holds for the RHS of (\ref{eq:plus-middle}):
Substituting the upperbound (LHS) of (\ref{eq:plus-middle-expanded}) in for the RHS of (\ref{eq:plus-middle}) we obtain the following for (\ref{eq:plus-middle}):
As in the $\circmult$ case the \emph{reduced} invariant of \reduce implies that $\degree_{\max}\geq1$, and (\ref{eq:plus-upper-bound-final}) follows. This proves (\ref{eq:plus-middle}).
Similar to the case of $\circuit.\type=\circmult$, (\ref{eq:plus-rhs}) follows by equations $(\ref{eq:cost-sampmon})$ and $(\ref{eq:ih-bound-cost})$.
This proves (\ref{eq:strict-upper-bound}) for the $\circplus$ case, as desired. % and thus the claimed $O(k\log{k}\cdot \frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\size(\circuit)}\cdot\depth(\circuit))$ runtime for $k = \degree(\circuit)$ follows.