Please note that it is \textit{assumed} that the original call to \onepass consists of a call on an input circuit \circuit such that the values of members \prt, \lwght and \rwght have been initialized to Null across all gates.
%For technical reasons, we require the invariant that every subcircuit \subcircuit corresponding to an internal gate of \circuit has $\degree\left(\subcircuit\right) \geq 1$. \revision{\textbf{AARON:} This is now trivially satisfied by the new definition of $\deg(\circuit)$ so please update this part to remove the stuff on $\reduce$. --Atri} To ensure this, auxiliary algorithm~\ref{alg:reduce} (\reduce) is called to perform any rewrites to \circuit, where an equivalent circuit \circuit' is created and returned by iteratively combining non-variable leaf nodes bottom-up until a parent node is reached which has an input \subcircuit whose subcircuit contains at least one leaf of type \var. It is trivial to see in such a case that $\subcircuit \equiv \subcircuit'$, and this implies $\circuit \equiv \circuit'$.
%In $O(\size(\circuit))$, algorithm \reduce inspects input circuit \circuit and outputs an equivalent version \circuit' of \circuit such that all subcircuits \subcircuit of \circuit' have $\degree(\subcircuit) \geq 1$.
%\end{Lemma}
%
%\begin{proof}[Proof of \Cref{lem:reduce}]
%~\paragraph{\reduce correctness}
%Note that for a source gate \gate, only when $\gate.\type = \var$ is it the case that $\degree(\gate) = 1$, and otherwise $\degree(\gate) = 0$. Lines~\ref{alg:reduce-add-deg} and~\ref{alg:reduce-no-deg} compute \gate.\degval.
%
%We prove an equivalent circuit \circuit' by induction over the iteration of \topord. For the base case, consider when we have one node. In such a case, no rewriting occurs, and \reduce returns \circuit. It is trivial to note that $\circuit \equiv \circuit$.
%
%For the inductive hypothesis, we assume that for $k \geq 0$ nodes in \topord, the modified circuit $\circuit_k' \equiv \circuit_k$, where $\circuit_k'$ denotes the circuit at the end of iteration $k$. Similarly, when discussing \Cref{alg:reduce} pseudocode, $\gate_{k}$ denotes the gate in position $k$ of \topord, and $\gate_{k_\linput}$ ($\gate_{k_\rinput}$) denotes the left (right) input of $\gate_{k}$.
%
%We now prove for $k + 1$ gates in \topord that $\circuit_{k + 1}' \equiv \circuit_{k + 1}$. Note that if the gate $\gate_{k + 1}$ is a source node, then this is again the base case and we are finished. If $\gate_{k + 1}$ is an internal node, then $\gate_{k + 1}.\type$ must either be $\circmult$ or $\circplus$.
%
%When $\gate_{k + 1}$ is $\circmult$, then it is the case that either $\degree(\gate_{{k + 1}_\linput}) \geq 1$ or $\gate_{{k + 1}_\linput}.\type$ is $\tnum$ and likewise for $\gate_{{k + 1}_\rinput}$. There are then four possibilities, only one of which will prompt a rewrite, namely when we have that both inputs have $\degree(\gate_{{k + 1}_i}) = 0$. In such a case, $\gate_{k + 1}.\val \gets \gate_{{k + 1}_\linput}.\val \times \gate_{{k + 1}_\rinput}.\val$, and the inputs are deleted. Note that since $\gate_{{k + 1}_\linput}.\type = \gate_{{k + 1}_\rinput}.\type = \tnum$ that we have two constants being multiplied, and that for subcircuit $\subcircuit = (\times, \tnum_1, \tnum_2)$ and $\tnum' = \tnum_1 \times \tnum_2$, $\polyf(\subcircuit) = \polyf(\tnum')$ which implies that for the rewritten \subcircuit', $\subcircuit \equiv \subcircuit'$.
%
%A analogous argument applies when $\gate_{k + 1}.\type$ is $\circplus$.\qed
%
%\paragraph{\reduce Run-time Analysis}.
%$O(\size(\circuit))$ trivially follows by the single iterative pass over the \topord of \circuit, where, as can be seen in lines~\ref{alg:reduce-var},~\ref{alg:reduce-num},~\ref{alg:reduce-mult}, and~\ref{alg:reduce-plus} a constant number of operations are performed on each node.\qed
Let $\etree$ encode the expression $(X_1+ X_2)(X_1- X_2)+ X_2^2$. After one pass, \Cref{alg:one-pass-iter} would have computed the following weight distribution. For the two inputs of the root $+$ node $\etree$, $\etree.\lwght=\frac{4}{5}$ and $\etree.\rwght=\frac{1}{5}$. Similarly, let $\stree$ denote the left-subtree of $\etree_{\lchild}$, $\stree.\lwght=\stree.\rwght=\frac{1}{2}$. This is depicted in \Cref{fig:expr-tree-T-wght}.
% \begin{tikzpicture}[thick, every tree node/.style={default_node, thick, draw=black, black, circle, text width=0.3cm, font=\bfseries, minimum size=0.65cm}, every child/.style={black}, edge from parent/.style={draw, thick},
We prove the correct computation of \prt, \lwght, \rwght values on \circuit by induction over the number of iterations in the topological order \topord (line~\ref{alg:one-pass-loop}) of the input circuit \circuit. Note that \topord follows the standard definition of a topological ordering over the DAG structure of \circuit.
For the base case, we have only one gate, which by definition is a source gate and must be either \var or \tnum. In this case, as per \cref{eq:T-all-ones}, lines~\ref{alg:one-pass-var} and~\ref{alg:one-pass-num} correctly compute \circuit.\prt as $1$ and \circuit.\val respectively.
For the inductive hypothesis, assume that \onepass correctly computes \subcircuit.\prt, \subcircuit.\lwght, and \subcircuit.\rwght for all gates \gate in \circuit with $k \geq0$ iterations over \topord.
We now prove for $k +1$ iterations that \onepass correctly computes the \prt, \lwght, and \rwght values for each gate $\gate_\vari{i}$ in \circuit for $i \in[k +1]$.
Note that the $\gate_\vari{k +1}$ must be in the last ordering of all gates $\gate_\vari{i}$. Note that for \size(\circuit) > 1, if $\gate_{k+1}$ is a leaf node, we are back to the base case. Otherwise $\gate_{k +1}$ is an internal node $\gate_\vari{s}.\type=\circplus$ or $\gate_\vari{s}.\type=\circmult$, which both require binary input.
When $\gate_{k+1}.\type=\circplus$, then by line~\ref{alg:one-pass-plus}$\gate_{k+1}$.\prt$=\gate_{{k+1}_\lchild}$.\prt$+\gate_{{k+1}_\rchild}$.\prt, a correct computation, as per \cref{eq:T-all-ones}. Further, lines~\ref{alg:one-pass-lwght} and~\ref{alg:one-pass-rwght} compute $\gate_{{k+1}}.\lwght=\frac{\gate_{{k+1}_\lchild}.\prt}{\gate_{{k+1}}.\prt}$ and analogously for $\gate_{{k+1}}.\rwght$. Note that all values needed for each computation have been correctly computed by the inductive hypothesis.
When $\gate_{k+1}.\type=\circmult$, then line~\ref{alg:one-pass-mult} computes $\gate_{k+1}.\prt=\gate_{{k+1}_\lchild.\prt}\circmult\gate_{{k+1}_\rchild}.\prt$, which indeed by \cref{eq:T-all-ones} is correct.
It is known that $\topord(G)$ is computable in linear time. Next, each of the $\size(\circuit)$ iterations of the loop in \Cref{alg:one-pass-loop} take $O\left(\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}\right)$ time. It is easy to see that each of all the numbers which the algorithm computes is at most $\abs{\circuit}(1,\dots,1)$. Hence, by definition each such operation takes $\multc{\log\left(\abs{\circuit(1\ldots, 1)}\right)}{\log{\size(\circuit)}}$ time, which proves the claimed runtime.
%In general it is known that an arithmetic computation which requires $M$ bits takes $O(\frac{\log{M}}{\log{N}})$ time for an input size $N$. Since each of the arithmetic operations at a given gate has a bit size of $O(\log{\abs{\circuit}(1,\ldots, 1)})$, thus, we obtain the general runtime of $O\left(\size(\circuit)\cdot \frac{\log{\abs{\circuit}(1,\ldots, 1)}}{\log{\size(\circuit)}}\right)$.
For our runtime results to be relevant, it must be the case that the sum of the coefficients computed by \onepass is indeed size $O(N)$ since there are $O(\log{N})$ bits in the RAM model where $N$ is the size of the input. The size of the input here is \size(\circuit). We show that when \size$(\circuit_\linput)= N_\linput$, \size$(\circuit_\rinput)= N_\rinput$, where $N_\linput+ N_\rinput\leq N$, this is indeed the case.
\begin{proof}%[Proof of $\abs{\circuit}(1,\ldots, 1)$ is size $O(N)$]
To prove this result, we start by proving that $\abs{\circuit}(1,\ldots, 1)\leq N^{2^k }$ for \degree(\circuit) $= k$.
For the base case, we have that \depth(\circuit) $=0$, and there can only be one node which must contain a coefficient (or constant) of $1$. In this case, $\abs{\circuit}(1,\ldots, 1)=1$, and \size(\circuit) $=1$, and it is true that $\abs{\circuit}(1,\ldots, 1)=1\leq N^{2^k}=1^{2^0}=1$.
Assume for $\ell > 0$ an arbitrary circuit \circuit of $\depth(\circuit)\leq\ell$ that it is true that $\abs{\circuit}(1,\ldots, 1)\leq N^{2^k }$.% for $k \geq 1$ when \depth(C) $\geq 1$.
For the inductive step we consider a circuit \circuit such that $\depth(\circuit)\leq\ell+1$. The sink can only be either a $\circmult$ or $\circplus$ gate. Consider when sink node is $\circmult$. Let $k_\linput, k_\rinput$ denote \degree($\circuit_\linput$) and \degree($\circuit_\rinput$) respectively. Note that this case does not require the constraint on $N_\linput$ or $N_\rinput$.
We derive the upperbound of \Cref{eq:sumcoeff-times-upper} by noting that the maximum value of the LHS occurs when both the base and exponent are maximized.
Similar to the $\circmult$ case, \Cref{eq:sumcoeff-plus-upper} upperbounds its LHS by the fact that the maximum base and exponent combination is always greater than or equal to the sum of lower base/exponent combinations. The final equality is true given the constraint over the inputs.
Since $\abs{\circuit}(1,\ldots, 1)\leq N^{2^k}$ for all circuits such that all $\circplus$ gates share at most one gate with their sibling (across their respective subcircuits), then $\log{N^{2^k}}=2^k \cdot\log{N}$ which for fixed $k$ yields the desired $O(\log{N})$ bits for $O(1)$ arithmetic operations.% for the given query class.