Made some comments on Sec 2 till eq (13)

This commit is contained in:
Atri Rudra 2019-08-06 23:28:50 -04:00
parent f88374d63a
commit 9dd18b7846


@ -1,6 +1,9 @@
% -*- root: main.tex -*-
\section{Analysis}
\label{sec:analysis}
\AR{This is a notational nitpick, but I would prefer it if this section were written for a function $v: W\to K$ and not necessarily the special case of $v=v_t$. In particular, there is no notion of probability $p$. At some point, we'll have to revisit this, but I think it would be good to have the analysis in this section be for an arbitrary function $v$ and not the specific one from the TIDB. Note that this means that you should not have the first two equations in this section.}
We begin the analysis by showing that with high probability an estimate is approximately $\numWorldsP$, where $p$ is a tuple's probability measure for a given TIPDB. Note that
\begin{equation}
%\gVt{k\cdot}
@ -15,7 +18,7 @@ We start off by making the claim that the expectation of the estimate of a tuple
\begin{equation}
\expect{\sum_{\wVec \in \pw} \sketchJParam{\sketchHashParam{\wVec}} \cdot \sketchPolarParam{\wVec}} = \sum_{\wVec \in \pw}\kMapParam{\wVec}\label{eq:allWorlds-est}.
\end{equation}
To verify this claim, we argue that the expectation of the estimate of a tuple's appearance in a single world is its annotation, i.e.
To verify this claim, we argue that the expectation of the estimate of a tuple's appearance in a single world is its annotation,\AR{Again this claim should be for every $\mathbf{w}\in W$ and not related to whether $t$ appears in a world or not.} i.e.
\begin{equation}
\expect{\sketchJParam{\sketchHashParam{\wVec}}\cdot \sketchPolarParam{\wVec}} = \kMapParam{\wVec} \label{eq:single-est}.
\end{equation}
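As a short sketch of how \eqref{eq:single-est} would yield \eqref{eq:allWorlds-est}, using only linearity of expectation and no assumptions beyond the two displayed claims:
\begin{equation*}
\expect{\sum_{\wVec \in \pw} \sketchJParam{\sketchHashParam{\wVec}} \cdot \sketchPolarParam{\wVec}} = \sum_{\wVec \in \pw} \expect{\sketchJParam{\sketchHashParam{\wVec}} \cdot \sketchPolarParam{\wVec}} = \sum_{\wVec \in \pw} \kMapParam{\wVec}.
\end{equation*}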
@ -52,12 +55,14 @@ For a given $\wVec \in \pw$, substituting definitions we have
=&~\kMapParam{\wVec}\label{eq:step-five}
\end{align}
\end{subequations}
\AR{The numbering of the equations above is a bit off: you go from (4) to (3a) and so on. Also, for the case when $\mathbf{w}=\mathbf{w'}$ there is no need to sum over $\mathbf{w},\mathbf{w'}\in W$; it just makes things confusing. Just sum over $\mathbf{w}'\in W$.}
\begin{Justification}
\hfill
\begin{itemize}
\item \eq{\eqref{eq:step-one}} is a substitution of the definition of $\sketch$.
\item \eq{\eqref{eq:step-two}} uses the commutativity of addition to rearrange the sum.
\item \eq{\eqref{eq:step-three}} uses linearity of expectation to reduce the large expectation into smaller expectations.
\item \eq{\eqref{eq:step-two}} uses the commutativity of addition to rearrange the sum. \AR{Technically this is using associativity but this is a nitpick.}
\item \eq{\eqref{eq:step-three}} uses linearity of expectation to reduce the large expectation into smaller expectations. \AR{I would push the expectation further in so that they only deal with the $s_i$ terms.}
\item \eq{\eqref{eq:step-four}} follows from the second term of \eq{\eqref{eq:step-three}} evaluating to zero. This assumes pairwise independence of $\sketchPolar$.
\item \eq{\eqref{eq:step-five}} follows from the squaring of the $\sketchPolarParam{\wVec}$ term, which always evaluates to 1. Keep in mind that in the summation there is exactly one $\wVecPrime$ that equals $\wVec$.
\end{itemize}
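A brief sketch of why the cross terms vanish, under the assumption (standard for a count sketch, but not stated in the lines above) that each $\sketchPolarParam{\wVec}$ is uniform on $\{-1, +1\}$ and hence has mean zero. For $\wVec \neq \wVecPrime$, pairwise independence would give
\begin{equation*}
\expect{\sketchPolarParam{\wVec} \cdot \sketchPolarParam{\wVecPrime}} = \expect{\sketchPolarParam{\wVec}} \cdot \expect{\sketchPolarParam{\wVecPrime}} = 0 \cdot 0 = 0,
\end{equation*}
whereas for $\wVecPrime = \wVec$ the product is $\left(\sketchPolarParam{\wVec}\right)^2 = 1$ deterministically, so only that term survives.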
@ -157,6 +162,7 @@ Note that four-wise independence is assumed across all four random variables of
\end{equation}
we see that %it can be seen that for $\wOne, \wOneP \in \pw$ and $\wTwo, \wTwoP \in \pw'$, all four random variables in \eqref{eq:polar-product} take their values from $\pw$, although we have iteration over two separate sets $\pw$.
there are five possible sets of $\wVec$ variable combinations, namely for $a, b, c, d \in \{1, 1', 2, 2'\} \st a \neq b \neq c \neq d$:
\AR{This confused me a lot to start off with. I think it is better to use $a,b,c,d$ only in the definitions of $S_1$ to $S_5$ where it is needed. In particular, it is not the case in $S_1$ to $S_3$ that you look at all possible assignments of $a, b, c, d \in \{1, 1', 2, 2'\}$.}
\begin{align*}
&\distPattern{1}:&\forElems{\cOne}\\
&\distPattern{2}:&\forElems{\cTwo}\\
@ -164,7 +170,8 @@ there are five possible sets of $\wVec$ variable combinations, namely for $a, b,
&\distPattern{4}:&\forElems{\cFour}\\
&\distPattern{5}:&\forElems{\cFive}
\end{align*}
Note that each $\wVec$ is the preimage of the same $\sketchPolar$ function, meaning that equal worlds produce the same element in the image of $\sketchPolar$.
\AR{I think the definitions above need more work and/or there needs to be a justification for why $S_1$ to $S_5$ partition all the possibilities.}
Note that each $\wVec$ is the preimage of the same $\sketchPolar$ function, meaning that equal worlds produce the same element in the image of $\sketchPolar$. \AR{I am not sure what the sentence above is saying.}
We are interested in those particular cases whose expectation does not equal zero, since an expectation of zero will not add to the summation of \eqref{eq:var-sum-w}. In expectation we have that
\begin{align}
@ -280,6 +287,7 @@ Computing each term separately gives
\norm{\kMap{t}}\prob \cdot \frac{\norm{\kMap{t}}\prob - \frac{\norm{\kMap{t}}}{\numWorlds}}{\sketchCols}\label{eq:spaceTwo}.
\end{align}
%In both equations, the sum of $\kMapParam{\wVec}$ over all $\wVec \in \pw$ is $\numWorldsP$ since as noted in equation \eqref{eq:mu} we are summing the number of worlds a tuple $t$ appears in, and for a TIPDB, that is exactly 2 to the power of the number of tuples in the TIPDB (due to the independence of tuples) times tuple $t$'s probability.
\AR{The above two need more work. Let's discuss more in the Aug 7 meeting.}
In equation \eqref{eq:spaceOne}, the multiplicative factor is, in expectation, the number of worlds $\numWorlds$ divided evenly across the $\sketchCols$ buckets, minus the one tuple that $\wVecPrime$ cannot be. This factor multiplies the sum of squares over the $\numWorldsP$ worlds in which $t$ appears.