Added new SampleMonomial proof with Cost function.

Aaron Huber 2021-02-23 10:45:58 -05:00
parent 2a3e7cc8a5
commit ff1431a195
3 changed files with 71 additions and 13 deletions

@ -410,7 +410,7 @@ $\sampmon$ is given in \Cref{alg:sample}, and a proof of its correctness (via \C
\State $\Return ~(\vari{v}, \vari{s})$
\ElsIf{$\circuit.\type = \times$}\Comment{Multiply the sampled values of all inputs}
\State $\vari{sgn} \gets 1$\label{alg:sample-global2}
\For {$input$ in $\circuit.\vari{input}$}
\For {$input$ in $\circuit.\vari{input}$}\label{alg:sample-times-for-loop}
\State $(\vari{v}, \vari{s}) \gets \sampmon(input)$
\State $\vari{vars} \gets \vari{vars} \cup \{\vari{v}\}$\label{alg:sample-times-union}
\State $\vari{sgn} \gets \vari{sgn} \times \vari{s}$\label{alg:sample-times-product}

@ -618,11 +618,11 @@ level 2/.style={sibling distance=0.7cm},
\ElsIf{\gate.\type $=$ \tnum}
\State \gate.\prt $\gets \abs{\gate.\val}$\label{alg:one-pass-num}
\ElsIf{\gate.\type $= \circmult$}
\State \gate.\prt $\gets \gate_\linput.\prt \times \gate_\rinput.\prt$
\State \gate.\prt $\gets \gate_\linput.\prt \times \gate_\rinput.\prt$\label{alg:one-pass-mult}
\Else %\Comment{\gate.\type $= \circplus$}
\State \gate.\prt $\gets \gate_\linput.\prt + \gate_\rinput.\prt$
\State \gate.\lwght $\gets \frac{\gate_\linput.\prt}{\gate.\prt}$
\State \gate.\rwght $\gets \frac{\gate_\rinput.\prt}{\gate.\prt}$
\State \gate.\prt $\gets \gate_\linput.\prt + \gate_\rinput.\prt$\label{alg:one-pass-plus}
\State \gate.\lwght $\gets \frac{\gate_\linput.\prt}{\gate.\prt}$\label{alg:one-pass-lwght}
\State \gate.\rwght $\gets \frac{\gate_\rinput.\prt}{\gate.\prt}$\label{alg:one-pass-rwght}
\EndIf
\State \vari{sum} $\gets \gate.\prt$
\EndFor
@ -635,10 +635,21 @@ level 2/.style={sibling distance=0.7cm},
\subsection{Proof of ~\Cref{lem:one-pass}}\label{sec:proof-one-pass}
\paragraph{\onepass Correctness}
We first note that all DAGs have a topological ordering. By definition, for a graph $G = (V, E)$, a topological ordering $\topord(G)$ orders the elements of V in correspondence to the directed edges of $E$, such that for each edge $(u, v)$ the gate $u$ will be ordered before gate $v$, since the edge goes from $u$ to $v$. Since the directed edges of circuit \circuit go from the children to the parent, it is then invariant in $\topord(G)$ that all the children $v_\linput$ and/or $v_\rinput$ appear before their parental internal gate $v \in V$ in $\topord(G)$. Since \onepass follows $\topord(\circuit)$ (~\Cref{alg:one-pass-loop}), it has to be the case that before an internal gate is visited, both of its subcircuits $\subcircuit_\linput$ and $\subcircuit_\rinput$ will have previously been visited, computing \prt, \lwght, and \rwght values in a bottom-up traversal. When the node visited by \onepass is a source node, it is trivial to see in ~\Cref{alg:one-pass-var} and ~\Cref{alg:one-pass-num} that such a child node's \prt value is correctly computed. This implies the correctness of the \prt annotations across all gates in both $\subcircuit_\linput$ and $\subcircuit_\rinput$ Recall that all internal gates are either $\circplus$ or $\circmult$. When an internal gate \subcircuit is visited in \topord(\circuit), if \subcircuit.\type is $\circplus$, \onepass will add the values of the two source nodes $\subcircuit_\linput.\prt + \subcircuit_\rinput.\prt$ to correctly annotate \subcircuit.\prt. Further \subcircuit.\lwght will be computed correctly as $\frac{\subcircuit_\linput.\prt}{\subcircuit.\prt}$ and analogously for \subcircuit.\rwght. If the parent gate \subcircuit visited is a $\circmult$, then \onepass will correctly annotate \subcircuit.\prt as $\subcircuit_\linput.\prt \times \subcircuit_\rinput.\prt$. 
As gates further in the order are subsequently visited, note that it is the case that all previous gates visited will always contain the correct \prt values and it follows that all subsequent gates visited in \topord(\circuit) will correctly compute their respective \prt values. Since \lwght and \rwght are computed dependent only on \prt values, it follows that all \lwght and \rwght values will then be computed correctly.
We prove the correct computation of the \prt, \lwght, and \rwght{} values on \circuit by induction over the topological order of the input circuit.
For the base case, we have only one gate, which by definition is a source gate and must be either \var or \tnum. In this case, lines~\ref{alg:one-pass-var} and~\ref{alg:one-pass-num} correctly compute \circuit.\prt as $1$ and $\abs{\circuit.\val}$, respectively.
For the inductive hypothesis, assume that the correctness of \onepass holds for a circuit \circuit with $k > 0$ gates.
We now prove for $k + 1$ gates that \onepass correctly computes the \prt, \lwght, and \rwght{} values for each gate \gate in \circuit. By the hypothesis, the first $k$ gates (alternatively, \textit{iterations}) have correctly computed values. Note that iteration $k + 1$ visits the last gate in the topological order, which must be the sink gate $\gate_\vari{s}$. It is also the case that $\gate_\vari{s}$ has at most two children. Finally, note that $\gate_\vari{s}$ is an internal gate, which implies that $\gate_\vari{s}.\type = \circplus$ or $\gate_\vari{s}.\type = \circmult$.
When $\gate_\vari{s}.\type = \circplus$, line~\ref{alg:one-pass-plus} computes $\gate_\vari{s}.\prt = \gate_{\vari{s}_\lchild}.\prt + \gate_{\vari{s}_\rchild}.\prt$, a correct computation. Further, lines~\ref{alg:one-pass-lwght} and~\ref{alg:one-pass-rwght} compute $\gate_{\vari{s}}.\lwght = \frac{\gate_{\vari{s}_\lchild}.\prt}{\gate_{\vari{s}}.\prt}$ and, analogously, $\gate_{\vari{s}}.\rwght$. Both are correct, since every value these computations depend on has been correctly computed by the I.H.
When $\gate_\vari{s}.\type = \circmult$, line~\ref{alg:one-pass-mult} computes $\gate_\vari{s}.\prt = \gate_{\vari{s}_\lchild}.\prt \times \gate_{\vari{s}_\rchild}.\prt$, which is correct, again because both input values are correct by the I.H.
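As a small illustration (the concrete circuit below is an assumption made for exposition and is not taken from the original argument), suppose the topological order is $\gate_1, \gate_2, \gate_3, \gate_4$, where $\gate_1$ is a \var gate, $\gate_2$ is a \tnum gate with value $-3$, $\gate_3 = \gate_1 \circplus \gate_2$, and $\gate_4 = \gate_3 \circmult \gate_1$. Following the update rules of the algorithm, \onepass would then compute
\begin{align*}
\gate_1.\prt &= 1, & \gate_2.\prt &= \abs{-3} = 3,\\
\gate_3.\prt &= 1 + 3 = 4, & \gate_3.\lwght &= \tfrac{1}{4}, \quad \gate_3.\rwght = \tfrac{3}{4},\\
\gate_4.\prt &= 4 \times 1 = 4.
\end{align*}
Each value depends only on values of gates that appear earlier in the order, which is the property the induction relies on.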
%We first note that all DAGs have a topological ordering. By definition, for a graph $G = (V, E)$, a topological ordering $\topord(G)$ orders the elements of V in correspondence to the directed edges of $E$, such that for each edge $(u, v)$ the gate $u$ will be ordered before gate $v$, since the edge goes from $u$ to $v$. Since the directed edges of circuit \circuit go from the children to the parent, it is then invariant in $\topord(G)$ that all the children $v_\linput$ and/or $v_\rinput$ appear before their parental internal gate $v \in V$ in $\topord(G)$. Since \onepass follows $\topord(\circuit)$ (~\Cref{alg:one-pass-loop}), it has to be the case that before an internal gate is visited, both of its subcircuits $\subcircuit_\linput$ and $\subcircuit_\rinput$ will have previously been visited, computing \prt, \lwght, and \rwght values in a bottom-up traversal. When the node visited by \onepass is a source node, it is trivial to see in ~\Cref{alg:one-pass-var} and ~\Cref{alg:one-pass-num} that such a child node's \prt value is correctly computed. This implies the correctness of the \prt annotations across all gates in both $\subcircuit_\linput$ and $\subcircuit_\rinput$. Recall that all internal gates are either $\circplus$ or $\circmult$. When an internal gate \subcircuit is visited in \topord(\circuit), if \subcircuit.\type is $\circplus$, \onepass will add the values of the two source nodes $\subcircuit_\linput.\prt + \subcircuit_\rinput.\prt$ to correctly annotate \subcircuit.\prt. Further \subcircuit.\lwght will be computed correctly as $\frac{\subcircuit_\linput.\prt}{\subcircuit.\prt}$ and analogously for \subcircuit.\rwght. If the parent gate \subcircuit visited is a $\circmult$, then \onepass will correctly annotate \subcircuit.\prt as $\subcircuit_\linput.\prt \times \subcircuit_\rinput.\prt$. 
As gates further in the order are subsequently visited, note that it is the case that all previous gates visited will always contain the correct \prt values and it follows that all subsequent gates visited in \topord(\circuit) will correctly compute their respective \prt values. Since \lwght and \rwght are computed dependent only on \prt values, it follows that all \lwght and \rwght values will then be computed correctly.
\paragraph{\onepass Runtime}
Each of the \numvar~ iterations of the loop in ~\Cref{alg:one-pass-loop} take $O(1)$ time, thus yielding a runtime of $O(\size(\circuit)$.
It is known that $\topord(G)$ is computable in linear time. Next, each of the \numvar~iterations of the loop in~\Cref{alg:one-pass-loop} takes $O(1)$ time, thus yielding a runtime of $O\left(\size(\circuit)\right)$.
}
@ -691,10 +702,54 @@ and we obtain the desired result.
\paragraph{Run-time Analysis}
We now bound the number of recursive calls in $\sampmon$ by $O\left(k\cdot \depth(\circuit)\right)$.
\revision{
Note that a sampled monomial corresponds to a subtree of the equivalent expression tree \etree of $\circuit$. Take an arbitrary sample subtree \stree of \etree and note that since every monomial has degree at most $k$, \stree has $O(k)$ leaves and the number of recursive calls in each layer as one goes from leaves to the root can only go down. Note that \depth(\etree) = \depth(\circuit) since any gate \gate in \circuit with more than one output is simply duplicated (\gate') in \etree, where \gate' is input into its additional parent. This can only increase the breadth of \etree and not the depth, since \gate is input at least one level up, and therefore \gate' would be at the same level or higher in \etree. (Note that the \depth(\circuit) is considered as the longest path from a sink node to a source node.) It follows that adding a child subtree \gate' as input to a parent one or more levels higher can only increase the breadth. Since \stree has depth at most $\depth(\etree) = \depth(\circuit)$ and that each level has $O(k)$ nodes, the subcircuit has $O(k\cdot \depth(\circuit))$ nodes in it.
It is important to note that since there are $O(k)$ recursive calls at any given level, the case of more than one recursive call to an arbitrary gate is accounted for in this bound, i.e., on any given level, there cannot be more than $\numvar$ calls on a particular gate such that the sum of all other calls on other gates is $m$ and $\numvar + m = k$. This yields the desired bound.
}
Let \cost be a function that models an upper bound on the number of gates that can be visited in the run of \sampmon. We define \cost recursively as follows.
\begin{equation*}
\cost(\circuit) =
\begin{cases}
1 + \cost(\circuit_\linput) + \cost(\circuit_\rinput) & \textbf{if } \circuit.\type = \circmult\\
1 + \max\left(\cost(\circuit_\linput), \cost(\circuit_\rinput)\right) & \textbf{if } \circuit.\type = \circplus\\
1 & \textbf{otherwise}
\end{cases}
\end{equation*}
First note that \cost is indeed a correct upper bound. When \sampmon visits a gate such that \circuit.\type $= \circmult$, line~\ref{alg:sample-times-for-loop} visits each input of \circuit. For the case when \circuit.\type $= \circplus$, line~\ref{alg:sample-plus-bsamp} visits exactly one of the input gates. Finally, when \circuit.\type $\in \{\var, \tnum\}$, i.e., \circuit is a source gate, only one gate is visited.
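To illustrate the definition (the circuit here is a hypothetical example chosen for exposition, not one from the original text), let \circuit be a $\circmult$ gate with inputs $\circuit_\linput = x \circplus y$ and $\circuit_\rinput = z$, where $x$, $y$, and $z$ are \var gates. Unfolding the recurrence gives
\begin{align*}
\cost(x) &= \cost(y) = \cost(z) = 1,\\
\cost(\circuit_\linput) &= 1 + \max\left(\cost(x), \cost(y)\right) = 2,\\
\cost(\circuit) &= 1 + \cost(\circuit_\linput) + \cost(\circuit_\rinput) = 1 + 2 + 1 = 4.
\end{align*}
This agrees with a direct count: a run of \sampmon on this circuit would visit the sink gate, the $\circplus$ gate, exactly one of $x$ or $y$, and $z$.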
We prove the upper bound $O\left(k \cdot \depth(\circuit)\right) \geq \cost(\circuit)$ on the number of gates traversed by \sampmon by induction over $d = \depth(\circuit)$, where $k = \degree(\circuit)$. %of \circuit, where $\degree(\circuit) = k$, and $\size(\circuit) = \ell$.
For the base case $\depth(\circuit) = 0$, $\cost(\circuit) = 1$, and it is trivial to see that $\degree(\circuit) = k = 1$, $\depth(\circuit) = 1$, which implies that $\cost(\circuit) \leq O(k \cdot d)$, and the bound trivially holds.
For the inductive hypothesis, we assume the bound holds for every circuit of depth $d \leq \ell$, for some $\ell \geq 0$.
Now consider an arbitrary circuit \circuit of depth $d = \ell + 1$ given as input to \sampmon. Each input of the sink gate of \circuit has depth at most $d - 1 = \ell$, so by the hypothesis the bound holds for both inputs. In particular, $\depth(\circuit_\linput) \leq d - 1$ and $\depth(\circuit_\rinput) \leq d - 1$, where at least one of the two inequalities holds with equality. Fix $\degree(\circuit_\linput) = k'$ and $\degree(\circuit_\rinput) = k''$. If \circuit.\type $= \circplus$, then $\degree(\circuit) = k = \max(k', k'')$. Otherwise, \circuit.\type $= \circmult$ and $\degree(\circuit) = k = k' + k''$.
If \circuit.\type $= \circmult$, we have already seen that \sampmon visits all inputs of \circuit, so $\cost(\circuit) = 1 + \cost(\circuit_\linput) + \cost(\circuit_\rinput)$. By the hypothesis, $O\left(k' \cdot (d - 1)\right) \geq \cost(\circuit_\linput)$ gates are traversed in $\circuit_\linput$ and $O\left(k'' \cdot (d - 1)\right) \geq \cost(\circuit_\rinput)$ gates in $\circuit_\rinput$, so for \circuit we have that $O\left((k' + k'')\cdot(d - 1)\right) + 1 = O\left(k \cdot(d - 1)\right) + 1$ gates are traversed. The additional $1$ accounts for the sink gate of \circuit and is at most $k$, so the bound $O(k\cdot d)\geq\cost(\circuit)$ holds.
If \circuit.\type $= \circplus$, then \sampmon samples exactly one of its inputs, and it follows that $\cost(\circuit) = 1 + \max\left(\cost(\circuit_\linput), \cost(\circuit_\rinput)\right)$. Suppose, without loss of generality, that $\max\left(\cost(\circuit_\linput), \cost(\circuit_\rinput)\right) = \cost(\circuit_\linput)$. Since $k \geq k'$ and the hypothesis gives the bound $O\left(k' \cdot (d - 1)\right)\geq \cost(\circuit_\linput)$ on $\circuit_\linput$, we obtain the bound $O\left(k \cdot d\right)\geq \cost(\circuit)$ on \circuit, where the additional $1$ for the sink gate of \circuit is again at most $k$.
}
%\revision{
%\begin{Definition}[Equivalent Expression Tree]\label{def:eet}
%Given an arbitrary circuit \circuit, there exists an equivalent expression tree \etree such that \etree is a copy of \circuit for all gates \gate where \gate has an outdegree $\leq 1$. When \gate has an outdegree greater than one, \gate is duplicated to be a subtree for each `extra' parent of \gate. We define this recursively as follows.
%\end{Definition}
%$\equivtree(\circuit) =
% \begin{cases}
% \emptyset & \textbf{if } \text{\circuit does not exist}\\
% \left(\circuit.\val, \equivtree(\circuit_\linput), \equivtree(\circuit_\rinput)\right) & \textbf{if } \text{\circuit is an iternal node}\\
% \circuit.\val & \textbf{otherwise}
%\end{cases}$
%
%\begin{Definition}[Expression Tree Subgraph]\label{def:etree-subgraph}
%Given an arbitrary circuit \circuit, the union of corresponding paths in $\equivtree(\circuit)$ traversed in \sampmon(\circuit) for all variables forming the output sampled in \sampmon constitutes the Expression Tree Subgraph.
%\end{Definition}
%
%For an arbitrary circuit \circuit with $\degree(k)$, it must be the case that the subgraph $\subgraph$ of $\equivtree(\circuit)$ has at most $k$ nodes at each level. This is true by the fact that for each internal node at least one or more children nodes will be visited by \sampmon all the way down to the leaves, and by definition, there are $O(k)$ leaf nodes in a polynomial with $\degree(k)$. Since it is also the case that $\equivtree(\circuit)$ has \depth(d), this then implies a bound of $O\left(k \cdot \depth(d)\right)$.
%
%We now prove that \depth(\circuit) = \depth($\equivtree(\circuit)$).
%}
%\revision{
%
% Note that a sampled monomial corresponds to a subtree of the equivalent expression tree \etree of $\circuit$. Take an arbitrary sample subtree \stree of \etree and note that since every monomial has degree at most $k$, \stree has $O(k)$ leaves and the number of recursive calls in each layer as one goes from leaves to the root can only go down. Note that \depth(\etree) = \depth(\circuit) since any gate \gate in \circuit with more than one output is simply duplicated (\gate') in \etree, where \gate' is input into its additional parent. This can only increase the breadth of \etree and not the depth, since \gate is input at least one level up, and therefore \gate' would be at the same level or higher in \etree. (Note that the \depth(\circuit) is considered as the longest path from a sink node to a source node.) It follows that adding a child subtree \gate' as input to a parent one or more levels higher can only increase the breadth. Since \stree has depth at most $\depth(\etree) = \depth(\circuit)$ and that each level has $O(k)$ nodes, the subcircuit has $O(k\cdot \depth(\circuit))$ nodes in it.
%It is important to note that since there are $O(k)$ recursive calls at any given level, the case of more than one recursive call to an arbitrary gate is accounted for in this bound, i.e., on any given level, there cannot be more than $\numvar$ calls on a particular gate such that the sum of all other calls on other gates is $m$ and $\numvar + m = k$. This yields the desired bound.
%}
%
It is easy to check that except for~\Cref{alg:sample-times-union}, all other lines take $O(1)$ time. Thus, overall, all lines except for~\Cref{alg:sample-times-union} take $O(k\cdot \depth(\circuit))$ time. Now consider all executions of~\Cref{alg:sample-times-union} together. We note that at each level we will be adding a given set of variables to some set at most once: since the sum of the sizes of the sets at a given level is at most $k$, each level involves $O(k\log{k})$ time. Thus, overall, all executions of~\Cref{alg:sample-times-union} take $O(k\log{k}\cdot \depth(\circuit))$ time, as desired.

@ -89,8 +89,8 @@
\newcommand{\wght}{\vari{weight}\xspace}
\newcommand{\vpartial}{\vari{partial}\xspace}
%types of T
\newcommand{\var}{\textsc{var}}
\newcommand{\tnum}{num}
\newcommand{\var}{\textsc{var}\xspace}
\newcommand{\tnum}{\textsc{num}\xspace}
%%%%%%%
\renewcommand{\algorithmicrequire}{\textbf{Input:}}
\renewcommand{\algorithmicensure}{\textbf{Output:}}
@ -101,13 +101,14 @@
\newcommand{\expandtree}[1]{\vari{E}(#1)}
\newcommand{\expansion}[1]{\vari{E}(#1)}
\newcommand{\elist}[1]{\vari{List}\pbox{#1}}
\newcommand{\equivtree}{\vari{EET}}
%expandtree tuple elements:
\newcommand{\monom}{\vari{v}}
\newcommand{\coef}{\vari{c}}
%----------------------------------
\newcommand{\abs}[1]{\left|#1\right|}
\newcommand{\func}[1]{\textsc{#1}}
\newcommand{\func}[1]{\textsc{#1}\xspace}
\newcommand{\polyf}{\func{poly}}
\newcommand{\evalmp}{\func{eval}}
\newcommand{\degree}{\func{deg}}
@ -296,6 +297,8 @@
\newcommand{\rwght}{\vari{Rweight}}
\newcommand{\prt}{\vari{partial}}
\newcommand{\type}{\vari{type}}
\newcommand{\subgraph}{\vari{S}_{\equivtree(\circuit)}}
\newcommand{\cost}{\func{Cost}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Sets