We \textit{assume} that the original call to \onepass is made on an input circuit \circuit in which the members \prt, \lwght, and \rwght have been initialized to Null across all gates.
For technical reasons, we require the invariant that every subcircuit \subcircuit corresponding to an internal gate of \circuit has $\degree\left(\subcircuit\right)\geq1$. \revision{\textbf{AARON:} This is now trivially satisfied by the new definition of $\deg(\circuit)$ so please update this part to remove the stuff on $\reduce$. --Atri} To ensure this, auxiliary algorithm~\ref{alg:reduce} (\reduce) is called to perform any necessary rewrites of \circuit: an equivalent circuit \circuit' is created and returned by iteratively combining non-variable leaf nodes bottom-up until a parent node is reached that has an input \subcircuit containing at least one leaf of type \var. In such a case it is easy to see that $\subcircuit\equiv\subcircuit'$ for each rewritten subcircuit, which implies $\circuit\equiv\circuit'$.
In $O(\size(\circuit))$ time, algorithm \reduce inspects the input circuit \circuit and outputs an equivalent version \circuit' of \circuit such that all subcircuits \subcircuit of \circuit' have $\degree(\subcircuit)\geq1$.
\end{Lemma}
\begin{proof}[Proof of \Cref{lem:reduce}]
~\paragraph{\reduce Correctness}
Note that for a source gate \gate, only when $\gate.\type=\var$ is it the case that $\degree(\gate)=1$, and otherwise $\degree(\gate)=0$. Lines~\ref{alg:reduce-add-deg} and~\ref{alg:reduce-no-deg} compute \gate.\degval.
We prove that \reduce returns an equivalent circuit \circuit' by induction over the iterations over \topord. For the base case, consider when we have one node. In such a case, no rewriting occurs, and \reduce returns \circuit. It is trivial to note that $\circuit\equiv\circuit$.
For the inductive hypothesis, we assume that for $k \geq0$ nodes in \topord, the modified circuit $\circuit_k' \equiv\circuit_k$, where $\circuit_k'$ denotes the circuit at the end of iteration $k$. Similarly, when discussing \Cref{alg:reduce} pseudocode, $\gate_{k}$ denotes the gate in position $k$ of \topord, and $\gate_{k_\linput}$ ($\gate_{k_\rinput}$) denotes the left (right) input of $\gate_{k}$.
We now prove for $k +1$ gates in \topord that $\circuit_{k +1}' \equiv\circuit_{k +1}$. Note that if the gate $\gate_{k +1}$ is a source node, then this is again the base case and we are finished. If $\gate_{k +1}$ is an internal node, then $\gate_{k +1}.\type$ must either be $\circmult$ or $\circplus$.
When $\gate_{k +1}.\type$ is $\circmult$, it is the case that either $\degree(\gate_{{k +1}_\linput})\geq1$ or $\gate_{{k +1}_\linput}.\type$ is $\tnum$, and likewise for $\gate_{{k +1}_\rinput}$. Of the four resulting possibilities, only one prompts a rewrite, namely when both inputs have $\degree(\gate_{{k +1}_i})=0$. In that case, $\gate_{k +1}.\val\gets\gate_{{k +1}_\linput}.\val\times\gate_{{k +1}_\rinput}.\val$, and the inputs are deleted. Since $\gate_{{k +1}_\linput}.\type=\gate_{{k +1}_\rinput}.\type=\tnum$, two constants are being multiplied: for the subcircuit $\subcircuit=(\times, \tnum_1, \tnum_2)$ and $\tnum' =\tnum_1\times\tnum_2$, we have $\polyf(\subcircuit)=\polyf(\tnum')$, which implies $\subcircuit\equiv\subcircuit'$ for the rewritten \subcircuit'.
An analogous argument applies when $\gate_{k +1}.\type$ is $\circplus$.\qed
\paragraph{\reduce Run-time Analysis}
The $O(\size(\circuit))$ bound follows from the single iterative pass over the \topord of \circuit, in which, as can be seen in lines~\ref{alg:reduce-var},~\ref{alg:reduce-num},~\ref{alg:reduce-mult}, and~\ref{alg:reduce-plus}, a constant number of operations is performed on each node.\qed
\end{proof}
\subsection{$\onepass$ Example}
\begin{Example}\label{example:one-pass}
Let $\etree$ encode the expression $(X_1+ X_2)(X_1- X_2)+ X_2^2$. After one pass, \cref{alg:one-pass-iter} would have computed the following weight distribution. For the two inputs of the root $+$ node $\etree$, $\etree.\lwght=\frac{4}{5}$ and $\etree.\rwght=\frac{1}{5}$. Similarly, if $\stree$ denotes the left subtree of $\etree_{\lchild}$, then $\stree.\lwght=\stree.\rwght=\frac{1}{2}$. This is depicted in~\Cref{fig:expr-tree-T-wght}.
\end{Example}
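As a sanity check on the weights in \Cref{example:one-pass}, the following worked computation evaluates \prt bottom-up. It is only a sketch under two assumptions that the example does not state: that $X_1 - X_2$ is encoded as $X_1 + (-1)\cdot X_2$, and that a constant gate contributes the absolute value of its constant.
\begin{align*}
\stree.\prt &= 1 + 1 = 2, & \stree.\lwght &= \stree.\rwght = \frac{1}{2},\\
(X_1 - X_2).\prt &= 1 + \abs{-1}\cdot 1 = 2, & \etree_{\lchild}.\prt &= 2 \cdot 2 = 4,\\
\etree_{\rchild}.\prt &= 1 \cdot 1 = 1, & \etree.\prt &= 4 + 1 = 5,\\
\etree.\lwght &= \frac{\etree_{\lchild}.\prt}{\etree.\prt} = \frac{4}{5}, & \etree.\rwght &= \frac{\etree_{\rchild}.\prt}{\etree.\prt} = \frac{1}{5}.
\end{align*}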
\begin{figure}[h!]
\begin{tikzpicture}[thick, every tree node/.style={default_node, thick, draw=black, black, circle, text width=0.3cm, font=\bfseries, minimum size=0.65cm}, every child/.style={black}, edge from parent/.style={draw, thick},
\subsection{Proof of~\Cref{lem:one-pass}}\label{sec:proof-one-pass}
\paragraph{\onepass Correctness}
We prove that \onepass correctly computes the \prt, \lwght, and \rwght values on \circuit by induction over the number of iterations of the loop in line~\ref{alg:one-pass-loop} over the topological order \topord of the input circuit \circuit. Here \topord follows the standard definition of a topological ordering over the DAG structure of \circuit.
For the base case, we have only one gate, which by definition is a source gate and must be either \var or \tnum. In this case, as per \Cref{eq:T-all-ones}, lines~\ref{alg:one-pass-var} and~\ref{alg:one-pass-num} correctly compute \circuit.\prt as $1$ and \circuit.\val respectively.
For the inductive hypothesis, assume that \onepass correctly computes \gate.\prt, \gate.\lwght, and \gate.\rwght for every gate \gate of \circuit visited in the first $k > 0$ iterations over \topord.
We now prove for $k +1$ iterations that \onepass correctly computes the \prt, \lwght, and \rwght values for each gate $\gate_\vari{i}$ in \circuit. %By the hypothesis the first $k$ gates (alternatively \textit{iterations}) have correctly computed values.
Note that $\gate_{k+1}$ must be last in the ordering of all gates $\gate_{i}$ for $i \in[k +1]$. Also note that for $\size(\circuit) > 1$, if $\gate_{k+1}$ is a leaf node, we are back to the base case. Otherwise $\gate_{k +1}$ is an internal node with two inputs, and either $\gate_{k+1}.\type=\circplus$ or $\gate_{k+1}.\type=\circmult$.
When $\gate_{k+1}.\type=\circplus$, then by line~\ref{alg:one-pass-plus}, $\gate_{k+1}.\prt=\gate_{{k+1}_\lchild}.\prt+\gate_{{k+1}_\rchild}.\prt$, a correct computation, as per \Cref{eq:T-all-ones}. Further, lines~\ref{alg:one-pass-lwght} and~\ref{alg:one-pass-rwght} compute $\gate_{{k+1}}.\lwght=\frac{\gate_{{k+1}_\lchild}.\prt}{\gate_{{k+1}}.\prt}$, and analogously for $\gate_{{k+1}}.\rwght$. Note that all values needed for each computation have been correctly computed by the inductive hypothesis.
When $\gate_{k+1}.\type=\circmult$, then line~\ref{alg:one-pass-mult} computes $\gate_{k+1}.\prt=\gate_{{k+1}_\lchild}.\prt\circmult\gate_{{k+1}_\rchild}.\prt$, which indeed is correct, as per \Cref{eq:T-all-ones}.
\paragraph{\onepass Run-time Analysis}
It is known that $\topord(\circuit)$ is computable in linear time. Next, each of the $\size(\circuit)$ iterations of the loop in~\Cref{alg:one-pass-loop} takes $O\left(\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}}\right)$ time. To see this, note that every number that the algorithm computes is at most $\abs{\circuit}(1,\ldots,1)$. Hence, by definition, each such operation takes $\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}}$ time, which proves the claimed runtime.
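Spelled out, the two contributions to the runtime combine as sketched below. The sketch assumes that the topological sort costs $O(\size(\circuit))$ and that each arithmetic operation costs at least constant time, so that the per-gate term dominates.
\begin{equation*}
O\left(\size(\circuit)\right) + \size(\circuit)\cdot O\left(\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}}\right) = O\left(\size(\circuit)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log{\size(\circuit)}}\right).
\end{equation*}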
%In general it is known that an arithmetic computation which requires $M$ bits takes $O(\frac{\log{M}}{\log{N}})$ time for an input size $N$. Since each of the arithmetic operations at a given gate has a bit size of $O(\log{\abs{\circuit}(1,\ldots, 1)})$, thus, we obtain the general runtime of $O\left(\size(\circuit)\cdot \frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\log{\size(\circuit)}}\right)$.
\paragraph{Sufficient condition for $\abs{\circuit}(1,\ldots, 1)$ to be size $O(N)$}
For our runtime results to be relevant, the sum of the coefficients computed by \onepass must indeed be of size $O(N)$, since words in the RAM model have $O(\log{N})$ bits, where $N$ is the size of the input. The size of the input here is $\size(\circuit)$. We show that this is indeed the case when $\size(\circuit_\linput)= N_\linput$ and $\size(\circuit_\rinput)= N_\rinput$ with $N_\linput+ N_\rinput\leq N$.
\begin{proof}%[Proof of $\abs{\circuit}(1,\ldots, 1)$ is size $O(N)$]
To prove this result, we start by proving that $\abs{\circuit}(1,\ldots, 1)\leq N^{2^k }$ for $\degree(\circuit) = k$.
For the base case, we have that $\depth(\circuit) =0$, and there can only be one node, which must contain a coefficient (or constant) of $1$. In this case, $\abs{\circuit}(1,\ldots, 1)=1$ and $\size(\circuit) =1$, so it is true that $\abs{\circuit}(1,\ldots, 1)=1\leq N^{2^k}=1^{2^0}=1$.
Assume that for some $\ell > 0$, every circuit \circuit with $\depth(\circuit)\leq\ell$ satisfies $\abs{\circuit}(1,\ldots, 1)\leq N^{2^k }$.% for $k \geq 1$ when \depth(C) $\geq 1$.
For the inductive step, we consider a circuit \circuit such that $\depth(\circuit)\leq\ell+1$. The sink can only be either a $\circmult$ or a $\circplus$ gate. Consider first the case when the sink node is $\circmult$. Let $k_\linput, k_\rinput$ denote $\degree(\circuit_\linput)$ and $\degree(\circuit_\rinput)$ respectively. Note that this case does not require the constraint on $N_\linput$ or $N_\rinput$.
We derive the upper bound of \cref{eq:sumcoeff-times-upper} by noting that the maximum value of the LHS occurs when both the base and exponent are maximized.
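One way to carry out the $\circmult$ step explicitly is sketched here. It rests on assumptions that go beyond the text: that $k = k_\linput + k_\rinput$ for a $\circmult$ sink, that $k_\linput, k_\rinput\geq1$ (the degree invariant discussed earlier), and that each input has size at most $N$. The authors' own derivation may differ.
\begin{align*}
\abs{\circuit}(1,\ldots, 1) &= \abs{\circuit_\linput}(1,\ldots, 1)\cdot\abs{\circuit_\rinput}(1,\ldots, 1)\\
&\leq N^{2^{k_\linput}}\cdot N^{2^{k_\rinput}} = N^{2^{k_\linput}+2^{k_\rinput}}\\
&\leq N^{2^{k_\linput+ k_\rinput}} = N^{2^{k}}.
\end{align*}
The last inequality uses $2^{a}+2^{b}\leq2^{a + b}$ for integers $a, b \geq1$, which is equivalent to $(2^{a}-1)(2^{b}-1)\geq1$.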
For the case when the sink node is a $\circplus$ node, then we have
Similar to the $\circmult$ case, \cref{eq:sumcoeff-plus-upper} upper bounds its LHS using the fact that the maximum base and exponent combination is always greater than or equal to the sum of lower base/exponent combinations. The final step holds given the constraint over the inputs.
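The $\circplus$ step can be sketched in the same way. The chain below assumes, beyond what is stated, that $k =\max(k_\linput, k_\rinput)$ for a $\circplus$ sink and that $N_\linput, N_\rinput\geq1$; it is an illustration rather than the authors' derivation.
\begin{align*}
\abs{\circuit}(1,\ldots, 1) &= \abs{\circuit_\linput}(1,\ldots, 1)+\abs{\circuit_\rinput}(1,\ldots, 1)\\
&\leq N_\linput^{2^{k_\linput}}+ N_\rinput^{2^{k_\rinput}}\leq N_\linput^{2^{k}}+ N_\rinput^{2^{k}}\\
&\leq\left(N_\linput+ N_\rinput\right)^{2^{k}}\leq N^{2^{k}}.
\end{align*}
The third inequality is the binomial bound $a^{m}+ b^{m}\leq(a + b)^{m}$ for $a, b \geq0$ and integer $m \geq1$, and the last one uses the constraint $N_\linput+ N_\rinput\leq N$.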
Since $\abs{\circuit}(1,\ldots, 1)\leq N^{2^k}$ for all circuits in which the inputs of every $\circplus$ gate share at most one gate (across their respective subcircuits), we have $\log{N^{2^k}}=2^k \cdot\log{N}$, which for fixed $k$ yields the desired $O(\log{N})$ bits and hence $O(1)$-time arithmetic operations.% for the given query class.