Expectation of 4 Products with Independent Polarity

Aaron Huber 2019-09-16 09:32:20 -04:00
parent 9b9ca0ef32
commit 2e0190e4f3


@ -81,15 +81,15 @@ The even case can be reduced to the odd case by including the one's vector as an
&\sum_{\wVec \in \pw}\gVP{1}{\wVec}\gVP{2}{\wVec} + \gVP{1}{\wVec}\sum_{\substack{\wTwo \in \pw \st \\ \wTwo \neq \wVec}}\gVP{2}{\wTwo} + \\
&\qquad\gVP{2}{\wVec}\sum_{\substack{\wOne \in \pw \st\\\wOne \neq \wVec}}\gVP{1}{\wOne} + \sum_{\substack{\wOne \in \pw \st\\\wOne \neq \wVec}}\gVP{1}{\wOne}\gVP{2}{\wOne}.
\end{align*}
For $\est{3}$, multiplying an even number of sketches yields
\begin{align*}
&\expect{\sum_{j \in \sketchCols}\sCom{1}{j} \cdot \sCom{2}{j}}\\
=&\expect{\sum_{j \in \sketchCols}\left(\sum_{\substack{\wOne \in \pw \st\\\hashP{\wOne} = j}}\gVP{1}{\wOne}\polP{\wOne}\cdot \sum_{\substack{\wTwo \in \pw \st\\\hashP{\wTwo} = j}}\gVP{2}{\wTwo}\polP{\wTwo}\right)}\\
=&\mathbb{E}\big[\sum_{j \in \sketchCols}\sum_{\substack{\wOne, \wTwo \in \pw \st\\\hashP{\wOne} = j\\\wOne = \wTwo}}\gVP{1}{\wOne}\gVP{2}{\wOne}\polP{\wOne}\polP{\wOne} +\\
&\qquad \gVP{1}{\wOne}\polP{\wOne}\sum_{\substack{\wOne, \wTwo \in \pw \st\\\hashP{\wOne} = j\\\wOne \neq \wTwo}}\gVP{2}{\wTwo}\polP{\wTwo}\big]\\
=&\expect{\sum_{\wOne \in \pw}\gVP{1}{\wOne}\gVP{2}{\wOne}}\\
=&\sum_{\wOne \in \pw}\gVP{1}{\wOne}\gVP{2}{\wOne}
\end{align*}
For $\est{3}$, multiplying two sketches yields
\begin{align}
&\expect{\sum_{j \in \sketchCols}\sCom{1}{j} \cdot \sCom{2}{j}} \nonumber\\
=&\expect{\sum_{j \in \sketchCols}\left(\sum_{\substack{\wOne \in \pw \st\\\hashP{\wOne} = j}}\gVP{1}{\wOne}\polP{\wOne}\cdot \sum_{\substack{\wTwo \in \pw \st\\\hashP{\wTwo} = j}}\gVP{2}{\wTwo}\polP{\wTwo}\right)} \nonumber\\
=&\mathbb{E}\big[\sum_{j \in \sketchCols}\sum_{\substack{\wOne, \wTwo \in \pw \st\\\hashP{\wOne} = j\\\wOne = \wTwo}}\gVP{1}{\wOne}\gVP{2}{\wOne}\polP{\wOne}\polP{\wOne} + \nonumber\\
&\qquad \gVP{1}{\wOne}\polP{\wOne}\sum_{\substack{\wOne, \wTwo \in \pw \st\\\hashP{\wOne} = j\\\wOne \neq \wTwo}}\gVP{2}{\wTwo}\polP{\wTwo}\big] \nonumber\\
=&\expect{\sum_{\wOne \in \pw}\gVP{1}{\wOne}\gVP{2}{\wOne}} \nonumber \\
=&\sum_{\wOne \in \pw}\gVP{1}{\wOne}\gVP{2}{\wOne}\label{eq:two-sk-prod}
\end{align}
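The last two steps of the derivation above can be spelled out as follows. This is only a sketch, assuming (as is common for such sketches, though not stated in this excerpt) that $\pol$ maps each world to $\pm 1$ with equal probability and is at least pairwise independent:
\begin{align*}
\polP{\wOne}\polP{\wOne} &= 1,\\
\expect{\polP{\wOne}\polP{\wTwo}} &= \expect{\polP{\wOne}}\cdot\expect{\polP{\wTwo}} = 0 \quad \text{for } \wOne \neq \wTwo.
\end{align*}
Under these assumptions only the diagonal terms $\wOne = \wTwo$ survive, and summing over all buckets $j \in \sketchCols$ visits every world in $\pw$ exactly once.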
Reversing the pattern of $\est{2}$, the product of an odd number of sketches has an expectation of $0$, since every product in the expanded sum contains a factor whose expectation evaluates to $0$, as seen in the following,
\begin{align*}
&\expect{\sum_{\wVec \in \pw}\gVP{1}{\wVec}\polP{\wVec} \cdot \sum_{\wVecPrime \in \pw}\gVP{2}{\wVecPrime}\polP{\wVecPrime}\cdot\sum_{\wVec'' \in \pw}\gVP{3}{\wVec''}\polP{\wVec''}}\\
@ -117,8 +117,23 @@ The case for an odd number of sketches can likewise be reduced to the even case
\hashP{\wTwo} = j,\\
\wTwo \neq \wOne}}\gVP{2}{\wTwo}\gVP{3}{\wTwo}.
\end{align*}
We want an estimate whose expectation equals the ground truth. We therefore seek sketch products whose expectations equal the extraneous terms above, so that those terms can be cancelled out.
One potential workaround would be to store additional sketches with independent $\pol$ functions. For $\est{2}$, this would result in
One potential workaround would be to store additional sketches with independent $\pol$ functions. Assuming that the $\pol$ function of the $\mathcal{S}_1, \mathcal{S}_2$ pair is independent of that of the $\mathcal{S}_3, \mathcal{S}_4$ pair, linearity of expectation lets us move the sum over $j$ outside the expectation, and independence lets the expectation of the product factor, resulting in
\begin{align*}
&\expect{\sum_{j \in \sketchCols}\sCom{1}{j}\sCom{2}{j}\sCom{3}{j}\sCom{4}{j}}\\
%&= \expect{\sum_{j \in \sketchCols}\sCom{1}{j}\sCom{2}{j}}\expect{\sum_{j \in \sketchCols}\sCom{3}{j}\sCom{4}{j}}\\
&= \sum_{j \in \sketchCols}\expect{\sum_{\substack{\wOne, \wTwo,\\ \wThree, \wFour \in \pw \st\\\hashP{\wOne} =\hashP{\wTwo}\\=\hashP{\wThree} = \hashP{\wFour} = j}}\gVP{1}{\wOne}\polI{1}{\wOne}\gVP{2}{\wTwo}\polI{1}{\wTwo}\gVP{3}{\wThree}\polI{2}{\wThree}\gVP{4}{\wFour}\polI{2}{\wFour}}\\
&=\sum_{j \in \sketchCols}\expect{\sum_{\substack{\wOne, \wTwo \in \pw \st \\ \hashP{\wOne} = \hashP{\wTwo} = j}}\gVP{1}{\wOne}\polI{1}{\wOne}\gVP{2}{\wTwo}\polI{1}{\wTwo}}\\
&\qquad\cdot\expect{\sum_{\substack{\wThree, \wFour \in \pw \st \\ \hashP{\wThree} = \hashP{\wFour} = j}}\gVP{3}{\wThree}\polI{2}{\wThree}\gVP{4}{\wFour}\polI{2}{\wFour}}
\end{align*}
which reduces by \eqref{eq:two-sk-prod} to
\begin{equation*}
\sum_{\wOne \in \pw}\gVP{1}{\wOne}\gVP{2}{\wOne}\cdot \sum_{\wThree \in \pw}\gVP{3}{\wThree}\gVP{4}{\wThree}.
\end{equation*}
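To see which terms are meant, the product above can be split into its diagonal and off-diagonal parts. This is a sketch under two assumptions not stated explicitly in this excerpt: that each factor collapses onto a single world as in \eqref{eq:two-sk-prod}, and that the target quantity is $\sum_{\wVec \in \pw}\gVP{1}{\wVec}\gVP{2}{\wVec}\gVP{3}{\wVec}\gVP{4}{\wVec}$:
\begin{align*}
&\sum_{\wVec \in \pw}\gVP{1}{\wVec}\gVP{2}{\wVec}\cdot\sum_{\wVecPrime \in \pw}\gVP{3}{\wVecPrime}\gVP{4}{\wVecPrime}\\
=&\sum_{\wVec \in \pw}\gVP{1}{\wVec}\gVP{2}{\wVec}\gVP{3}{\wVec}\gVP{4}{\wVec} + \sum_{\wVec \in \pw}\gVP{1}{\wVec}\gVP{2}{\wVec}\sum_{\substack{\wVecPrime \in \pw \st\\\wVecPrime \neq \wVec}}\gVP{3}{\wVecPrime}\gVP{4}{\wVecPrime}.
\end{align*}
The first sum is the assumed target; the second collects the cross terms that would still need to be cancelled.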
The remaining extraneous terms can be found analogously.
\newline For $\est{2}$, storing sketches with independent $\pol$ functions would result in
\begin{align*}
&\expect{\sum_{\wVec \in \pw}\polI{1}{\wVec}\sum_{\substack{\wOne, \wTwo, \wThree \in \pw \st\\
\hashP{\wVec} = \hashP{\wOne} =\\ \hashP{\wTwo} = \hashP{\wThree}}}\gVP{1}{\wOne}\polI{1}{\wOne}\gVP{2}{\wTwo}\polI{2}{\wTwo}\gVP{3}{\wThree}\polI{2}{\wThree}}\\
@ -129,6 +144,7 @@ One potential work around would be to store additional sketches with independent
\hashP{\wTwo} = \hashP{\wVec}}}\gVP{2}{\wTwo}\gVP{3}{\wTwo}\polI{2}{\wTwo}^2\right)\big]\\
&= \sum_{\wVec \in \pw}\gVP{1}{\wVec}\sum_{\wTwo \in \pw}\gVP{2}{\wTwo}\gVP{3}{\wTwo}
\end{align*}
\startOld{Old Content}
For the case of multiplication, assuming independent variables, it is a known result that
\[