% -*- root: main.tex -*-
\section{Bounding the Estimates}
\newcommand{\bMu}{\epsilon\mu_{\sketchCols_{sum}}}
Consider an estimate of $\sketchCols$, denoted $\sketchCols_{est}$, together with the following definitions:
\begin{align*}
&\bMu \text{ is the expectation of the sum of estimates.}\\
&X = \sum_{i = 1}^{\sketchRows}X_i \\
&X_i \in \{0, 1\} \text{ are i.i.d. random variables},\ i \in \{1, \dots, \sketchRows\} \\
&X_i = \begin{cases}
0 &\text{if } \sketchCols_{est} > \bMu\\
1 &\text{if } \sketchCols_{est} \leq \bMu
\end{cases}\\
&\Pr[X_i = 1] \geq \frac{2}{3}\\
&\Pr[X_i = 0] \leq \frac{1}{3}\\
&\mu = \frac{2}{3}\sketchRows\\
&\epsilon = 0.5
\end{align*}
Chebyshev's inequality bounds the probability of a bad row estimate by $\frac{1}{3}$, so each $X_i$ equals 1 with probability at least $\frac{2}{3}$, and $\mu = \frac{2}{3}\sketchRows$ is a lower bound on the expectation of $X$. We set $\epsilon$ to the value whose product with $\mu$ is $\frac{1}{3}\sketchRows$, namely $\epsilon = 0.5$, and then derive a bound on $\sketchRows$.
Because we are concerned only with the lower tail of $X$, we can use the generic Chernoff bound for the lower tail,
\begin{equation*}
\Pr[X \leq (1 - \epsilon)\mu] \leq e^{-\frac{\epsilon^2}{2 + \epsilon}\mu}.
\end{equation*}
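As a quick check, assuming the values $\epsilon = 0.5$ and $\mu = \frac{2}{3}\sketchRows$ from the definitions above, the threshold in this bound evaluates to
\begin{equation*}
(1 - \epsilon)\mu = \frac{1}{2}\cdot\frac{2}{3}\sketchRows = \frac{1}{3}\sketchRows,
\end{equation*}
so the bound controls the probability that at most a third of the $\sketchRows$ indicators $X_i$ equal 1.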
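Substituting $\epsilon = \frac{1}{2}$ and $\mu = \frac{2}{3}\sketchRows$, the coefficient of $\sketchRows$ in the exponent is $\frac{(1/2)^2}{2 + 1/2}\cdot\frac{2}{3} = \frac{1}{10}\cdot\frac{2}{3} = \frac{1}{15}$. Requiring the failure probability to be at most $\delta$ and solving for $\sketchRows$,
\begin{align*}
\delta &\geq e^{-\frac{(\frac{1}{2})^2}{2 + \frac{1}{2}}\cdot\frac{2}{3}\sketchRows}\\
\delta &\geq e^{-\frac{1}{15}\sketchRows}\\
e^{\frac{1}{15}\sketchRows} &\geq \frac{1}{\delta}\\
\sketchRows &\geq 15\ln\left(\frac{1}{\delta}\right)
\end{align*}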
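As a rough illustration of how this bound scales, write it as $\sketchRows \geq c\ln\left(\frac{1}{\delta}\right)$, where $c$ is shorthand (introduced here only for illustration) for the constant in front of the logarithm. The failure probabilities below are illustrative choices, not values taken from the analysis:
\begin{align*}
\delta = 0.05 &: \quad \ln\left(\frac{1}{\delta}\right) = \ln 20 \approx 3.00\\
\delta = 0.01 &: \quad \ln\left(\frac{1}{\delta}\right) = \ln 100 \approx 4.61\\
\delta = 0.001 &: \quad \ln\left(\frac{1}{\delta}\right) = \ln 1000 \approx 6.91
\end{align*}
Tightening $\delta$ from $0.05$ to $0.001$ therefore increases the required number of rows by a factor of only about $2.3$, since the dependence on $\delta$ is logarithmic.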