More changes in Approx Algo.

Aaron Huber 2020-08-14 12:03:26 -04:00
parent f10b65525b
commit 2026e06669


@ -3,34 +3,9 @@
\AH{I am attempting to rewrite this section mostly from scratch. This will involve taking 'baby' steps towards the goals we spoke of on Friday 080720 as well as throughout the following week on chat channel.}
\AH{\textbf{BEGIN}: Old stuff.}
\begin{Lemma}\label{lem:approx-alg}
For any query polynomial $\poly(X_1,\ldots, X_n)$, an approximation of $\rpoly(\prob_1,\ldots, \prob_n)$ can be computed in $O\left(|\poly|\cdot k \frac{\log\frac{1}{\conf}}{\error^2}\right)$, within $1 \pm \error$ multiplicative error with probability $\geq 1 - \conf$, where $k$ denotes the product width of $\poly$.
\end{Lemma}
\begin{proof}[Proof of Lemma \ref{lem:approx-alg}]
Let $c_i$ be the coefficient of the $i^{th}$ monomial and $\distinctvars_i$ be the number of distinct variables appearing in the $i^{th}$ monomial. Then $\coeffitem{i}$ is the value of the $i^{th}$ monomial term in $\rpoly(\prob_1,\ldots, \prob_n)$. Define $\coeffset$ to be the set $\{\coeffitem{1},\ldots, \coeffitem{\setsize}\}$. Assume a set of $\samplesize$ random variables $\vct{\randvar}$, where $\randvar_i \sim \unidist{\coeffset}$.
Given random variable $\randvar_i$, it is the case that $\expct\pbox{\randvar_i} = \sum_{j = 1}^{\setsize}\frac{\coeffitem{j}}{\setsize} = \ave{\coeffset}$. Let $\hoeffest = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i$. Then it is true that
\[\expct\pbox{\hoeffest} = \expct\pbox{ \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i} = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\expct\pbox{\randvar_i} = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\frac{1}{\setsize}\sum_{j = 1}^{\setsize}\coeffitem{j} = \ave{\coeffset}.\]
Denote $\hoeffestsum = \hoeffest \cdot \setsize$ and $\setsum = \ave{\coeffset} \cdot \setsize$.
Given the range $[a, b]$ for every $\randvar_i$ in $\vct{\randvar}$, by Hoeffding, it is the case that $Pr\pbox{| \hoeffestsum - \setsum | \geq \error\setsize} \leq 2\exp{-\frac{2\samplesize^2\setsize^2\error^2}{\sum_{i = 1}^{\samplesize}\left(b_i - a_i\right)^2}} \leq \conf$.
\AH{Needs to be rewritten using the 2nd Hoeffding.}
Solving for the number of samples $\samplesize$ we get
\begin{align}
&\conf \geq 2\exp{-\frac{2\samplesize^2\setsize^2\error^2}{\samplesize\left(b - a\right)^2}}\label{eq:hoeff-1}\\
&\frac{\conf}{2} \geq \exp{-\frac{2\samplesize^2\setsize^2\error^2}{\samplesize\left(b - a\right)^2}}\label{eq:hoeff-2}\\
&\frac{2}{\conf} \leq \exp{\frac{2\samplesize^2\setsize^2\error^2}{\samplesize\left(b - a\right)^2}}\label{eq:hoeff-3}\\
&\log{\frac{2}{\conf}} \leq \frac{2\samplesize^2\setsize^2\error^2}{\samplesize\left(b - a\right)^2}\label{eq:hoeff-4}\\
&\log{\frac{2}{\conf}} \leq \frac{2\samplesize\setsize^2\error^2}{\left(b - a\right)^2}\label{eq:hoeff-5}\\
&\frac{\log{\frac{2}{\conf}}\left(b - a\right)^2}{2\setsize^2\error^2} \leq \samplesize.\label{eq:hoeff-6}
\end{align}
Equation \cref{eq:hoeff-1} results from the fact that for all $\samplesize$ samples, the range of values over the random variable $\randvar_i$ is always the same. Equation \cref{eq:hoeff-2} is the result of dividing both sides by $2$. Equation \cref{eq:hoeff-3} follows from taking the reciprocal of both sides, and noting that such an operation flips the inequality sign. We then derive \cref{eq:hoeff-4} by taking the base $e$ log of both sides, and \cref{eq:hoeff-5} results from cancelling the common $\samplesize$ factor. We arrive at the final result of \cref{eq:hoeff-6} by multiplying both sides by the reciprocal of the RHS fraction without the $\samplesize$ factor.
\begin{proof}
Let us now show a sampling scheme which can run in $O\left(|\poly|\cdot k\right)$ per sample.
@ -52,12 +27,37 @@ Thus, it is the case, that we can approximate $\rpoly(\prob_1,\ldots, \prob_n)$
\qed
\AH{{\bf END:} Old Stuff}
Before proceeding to describe the approximation algorithm, let us introduce notation that will be of use in the following discussion. First, when we speak of $\smb$, we are speaking of a polynomial $\poly$ of the standard monomial basis, i.e., a polynomial whose monomials are not only in SOP form, but non-distinct monomials have been collapsed into one distinct monomial, with a correct corresponding coefficient.
Before describing the approximation algorithm, let us introduce notation that will be of use in the following discussion. First, when we speak of $\smb$, we mean a polynomial $\poly$ in the standard monomial basis, i.e., a polynomial that is not only in SOP form, but whose non-distinct monomials have been collapsed into one distinct monomial, with its corresponding coefficient accurately reflecting the number of monomials combined.
Let $\expresstree{\smb}$ be the set of all possible polynomial expressions equivalent to $\smb$. Call the input polynomial $\polytree$, and note that $\polytree \in \expresstree{\smb}$ and need not be in the standard monomial basis. Refer to the expanded SOP form of $\poly$ as $\expandtree$, which is the SOP form of $\poly$ in which all coefficients $c_i$ are in the set $\{-1, 1\}$, thus relaxing the distinct monomial requirement of the standard monomial basis. Denote by $\abstree$ the polynomial that results when all monomial coefficients of $\polytree$ are converted to positive coefficients and $\polytree$ itself is then converted to the standard monomial basis.
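As a small illustration of this notation (the polynomial below is a hypothetical example chosen for this sketch, not one taken from the analysis), suppose the input is $(X_1 + X_2)(X_1 + X_2) - X_3$. Then
\begin{align*}
\polytree &= (X_1 + X_2)(X_1 + X_2) - X_3,\\
\expandtree &= X_1^2 + X_1X_2 + X_1X_2 + X_2^2 - X_3,\\
\smb &= X_1^2 + 2X_1X_2 + X_2^2 - X_3,\\
\abstree &= X_1^2 + 2X_1X_2 + X_2^2 + X_3.
\end{align*}
Here $\expandtree$ has five terms, each with a coefficient in $\{-1, 1\}$, and $\abstree(1, 1, 1) = 1 + 2 + 1 + 1 = 5$ matches that count.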
\subsection{Monomial Sample Algorithm}
\begin{Lemma}\label{lem:approx-alg}
For any query polynomial $\poly(\vct{X})$, an approximation of $\rpoly(\prob_1,\ldots, \prob_n)$ can be computed in time $O\left(|\poly|\cdot k \frac{\log\frac{1}{\conf}}{\error^2}\right)$, within $1 \pm \error$ multiplicative error with probability $\geq 1 - \conf$, where $k$ denotes the product width of $\poly$.
\end{Lemma}
\begin{proof}[Proof of Lemma \ref{lem:approx-alg}]
Consider $\polytree$ in the standard monomial basis, let $c_i$ be the coefficient of the $i^{th}$ monomial, and let $\distinctvars_i$ be the number of distinct variables appearing in the $i^{th}$ monomial. Note that sampling the $i^{th}$ term of $\polytree$ with probability $\frac{|c_i|}{\abstree(1,\ldots, 1)}$ is equivalent to sampling uniformly over the terms of $\expandtree$. Now consider $\rpoly$ and note that $\coeffitem{i}$ is the value of the $i^{th}$ monomial term in $\rpoly(\prob_1,\ldots, \prob_n)$. Let $\setsize$ be the number of terms in $\expandtree$, so that $\setsize = \abstree(1,\ldots, 1)$, and let $\coeffset$ be the set $\{c'_1,\ldots, c'_{\setsize}\}$, where each $c'_i$ is in $\{-1, 1\}$.
Consider now a set of $\samplesize$ random variables $\vct{\randvar}$, where $\randvar_i \sim \unidist{\coeffset}$. Recall that we are estimating $\rpoly(\prob,\ldots, \prob)$. Then for each random variable $\randvar_i$, it is the case that $\expct\pbox{\randvar_i} = \sum_{j = 1}^{\setsize}\frac{c'_j \cdot \prob^{\distinctvars_j}}{\setsize} = \frac{\rpoly(\prob,\ldots, \prob)}{\abstree(1,\ldots, 1)}$, where $\distinctvars_j$ is the number of distinct variables in the $j^{th}$ term of $\expandtree$. Let $\hoeffest = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i$. It is also true that
\[\expct\pbox{\hoeffest} = \expct\pbox{ \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\randvar_i} = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\expct\pbox{\randvar_i} = \frac{1}{\samplesize}\sum_{i = 1}^{\samplesize}\frac{1}{\setsize}\sum_{j = 1}^{\setsize}c'_j \cdot \prob^{\distinctvars_j} = \frac{\rpoly(\prob,\ldots, \prob)}{\abstree(1,\ldots, 1)}.\]
Since every $\randvar_i$ in $\vct{\randvar}$ lies in the range $[-1, 1]$, Hoeffding's inequality gives $P\pbox{~\left| \hoeffest - \expct\pbox{\hoeffest} ~\right| \geq \error} \leq 2\exp{-\frac{2\samplesize^2\error^2}{2^2 \samplesize}}$, and we require this bound to be at most $\conf$.
Solving for the number of samples $\samplesize$ we get
\begin{align}
&\conf \geq 2\exp{-\frac{2\samplesize^2\error^2}{4\samplesize}}\label{eq:hoeff-1}\\
&\frac{\conf}{2} \geq \exp{-\frac{2\samplesize^2\error^2}{4\samplesize}}\label{eq:hoeff-2}\\
&\frac{2}{\conf} \leq \exp{\frac{2\samplesize^2\error^2}{4\samplesize}}\label{eq:hoeff-3}\\
&\log{\frac{2}{\conf}} \leq \frac{2\samplesize^2\error^2}{4\samplesize}\label{eq:hoeff-4}\\
&\log{\frac{2}{\conf}} \leq \frac{\samplesize\error^2}{2}\label{eq:hoeff-5}\\
&\frac{2\log{\frac{2}{\conf}}}{\error^2} \leq \samplesize.\label{eq:hoeff-6}
\end{align}
Equation \cref{eq:hoeff-1} results from computing the sum in the denominator of the exponent. Equation \cref{eq:hoeff-2} is the result of dividing both sides by $2$. Equation \cref{eq:hoeff-3} follows from taking the reciprocal of both sides, and noting that such an operation flips the inequality sign. We then derive \cref{eq:hoeff-4} by taking the base $e$ log of both sides, and \cref{eq:hoeff-5} results from reducing common factors. We arrive at the final result of \cref{eq:hoeff-6} by multiplying both sides by $\frac{2}{\error^2}$.
\end{proof}
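To get a feel for the bound in \cref{eq:hoeff-6}, consider a worked instance with illustrative values $\error = 0.1$ and $\conf = 0.05$ (these values are assumptions chosen for this example and do not come from the analysis above). With $\log$ taken base $e$, as in the derivation,
\[
\samplesize \geq \frac{2\log{\frac{2}{0.05}}}{0.1^2} = 200\log{40} \approx 737.8,
\]
so under these assumptions $\samplesize = 738$ samples suffice. Note that the bound depends only on $\error$ and $\conf$, and not on the number of terms in $\expandtree$.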
\subsubsection{Description}
\subsubsection{Pseudo Code}