Changes according to Atri's suggestions: Section 1

Aaron Huber 2019-07-05 21:59:56 -04:00
parent 4fe9f53b04
commit 67888e23f7
2 changed files with 10 additions and 5 deletions

@ -101,7 +101,7 @@
\newcommand{\BG}[1]{\todo[inline]{\textbf{Boris says:$\,$} #1}}
\newcommand{\SF}[1]{\todo[inline]{\textbf{Su says:$\,$} #1}}
\newcommand{\OK}[1]{\todo[inline]{\textbf{Oliver says:$\,$} #1}}
\newcommand{\AH}[1]{\todo[inline]{\textbf{Aaron says:$\,$} #1}}
\newcommand{\AH}[1]{\todo[inline, backgroundcolor=blue]{\textbf{Aaron says:$\,$} #1}}
\newcommand{\AR}[1]{\todo[inline, color=green]{\textbf{Atri says:$\,$} #1}}
%\newcommand{\comment}[1]{}

@ -2,10 +2,12 @@
\section{Notation}
\label{sec:notation}
The following notation is used to reason about the sketching of world membership for a given tuple. We denote the set of all possible worlds as $\pw$. A given sketch $\sketch$ can be viewed as an $\sketchRows \times \sketchCols$ matrix, i.e. a matrix with $\sketchRows$ rows and $\sketchCols$ columns. Each row of $\sketch$ is an estimation of the of $\kDom$ frequency for the given tuple represented by $\sketch$ across all possible worlds. \AR{Nitpick: the claim in the last sentence is only true at initialization. If you add/mult the vector (via aggregates) then the claim is no longer true.}
The following notation is used to reason about the sketching of world membership for a given tuple. We denote the set of all possible worlds as $\pw$. A given sketch $\sketch$ can be viewed as an $\sketchRows \times \sketchCols$ matrix, i.e. a matrix with $\sketchRows$ rows and $\sketchCols$ columns. Each row of $\sketch$ is an estimation of the $\kDom$ frequency for a given tuple represented by $\sketch$ across all possible worlds. \AR{Nitpick: the claim in the last sentence is only true at initialization. If you add/mult the vector (via aggregates) then the claim is no longer true.}
\AH{I am not sure I understand what you mean when you say the claim is no longer true. Do you mean that it is no longer true until we prove bounds for multiplication? We can add sketches with the same epsilon-delta bounds, correct? Or do you mean that the tuple which $\sketch$ represents is a different tuple than the one we started with (after performing add/mult operations on it)?}
\AR{The notations $\sketchHash{i}$ and $\sketchPolar{i}$ in this section are messed up.}
To facilitate binning the $\kDom$ values for a given world $\wVec$, each row has two pairwise independent hash functions $\sketchHash{i}:\pw \to [B]$ and $\sketchPolar{i}:\pw \to \{-1,1\}$, where all functions are independent of one another. Finally, the function $\kMap{t}$ defined as $\kMap{t} : \{0, 1\}^\numTup \rightarrow \kDom$ is used to determine the tuple's $\kDom$ annotation for a given world.
\AH{Fixed.}
To facilitate binning the $\kDom$ values for a given world $\wVec$, each row has two pairwise independent hash functions $\sketchHash[i]:\pw \to [B]$ and $\sketchPolar[i]:\pw \to \{-1,1\}$, where all functions are independent of one another. Finally, the function $\kMap{t}$ defined as $\kMap{t} : \{0, 1\}^\numTup \rightarrow \kDom$ is used to determine the tuple's $\kDom$ annotation for a given world.
%\AR{I do not like this notation. I prefer vectors being typeset in bold, i.e. $\mathbf{w}$. $\wVec$ is good for writing on the board but it is more standard to bold vectors in linear algebra. Also the $\kDom$ values are not binned by $\sketchHash{i}$ but the actual $\wVec$s are.}
%\AH{Done.}
@ -15,9 +17,12 @@ To facilitate binning the $\kDom$ values for a given world $\wVec$, each row has
%\AR{While in general I'm a fan of using English to define things, one of the exceptions if when you are defining a function. It would be better to explicit state that $\sketchHash{i}:W\to [B]$ and $\sketchPolar{i}:W\to \{-1,1\}$. Of course for these definitions you need to define $W$ upfront.}
%\AH{Done}
When a world $\wVec$'s $\kDom$ value is updated, it's $\kDom$ value is first retrieved via $\kMap{t}$ and then multiplied by the output of the $i^{th}$ row's polarity function $\sketchPolar{i}$. The resulting computation is then added to the current value contained in the bin mapping. Formally:
$$\sketch[i][\sketchHash{i}(\wVec)] ~+=~ \sketchPolar{i}(\wVec) \times \kMap{t}(\wVec)$$
When a world $\wVec$'s $\kDom$ value is updated, its $\kDom$ value is first retrieved via $\kMap{t}$ and then multiplied by the output of the $i^{th}$ row's polarity function $\sketchPolar$. The resulting product is then added to the current value in the bin to which $\wVec$ is hashed. Formally:
$$\sketch[i][\sketchHashParam{\wVec}] ~+=~ \sketchPolarParam{\wVec} \times \kMapParam{\wVec}$$
\AR{It would also be good to state what the value in $\sketch[i][j]$ is after the initialization with the function $v_t$ is done.}
\AH{Done.}
After initialization is complete we have that
$$\sketch[i][j] = \sum_{\{\wVec \st \sketchHashParam{\wVec} = j\}}\kMapParam{\wVec} \sketchPolarParam{\wVec}.$$
In a Tuple Independent Database (TIDB), a database $\relation$ contains $\numTup$ tuples and has $\numWorlds$ possible worlds. The set of possible worlds $\pw$ is $\{0, 1\}^\numTup$, and a specific world is a vector $\wVec \in \{0, 1\}^\numTup$.
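
The update rule and the post-initialization identity in the Notation section can be checked on a toy instance. The following Python sketch is illustrative only and is not taken from the paper: the names `make_row_hashes`, `init_sketch`, and `annotation` are hypothetical, the annotation function stands in for $\kMap{t}$ with integer values, and the hash functions are drawn as fully random lookup tables (a stronger assumption than the pairwise independence the text requires) to keep the example short.

```python
import random
from itertools import product


def make_row_hashes(worlds, num_cols, rng):
    """Draw one bucket hash and one polarity hash for a single sketch row.

    Assumption: both are stored as fully random lookup tables over the
    worlds, which is stronger than pairwise independence.
    """
    bucket = {w: rng.randrange(num_cols) for w in worlds}
    polarity = {w: rng.choice((-1, 1)) for w in worlds}
    return bucket, polarity


def init_sketch(worlds, annotation, num_rows, num_cols, seed=0):
    """Apply the update rule once per world and per row.

    For row i and world w this performs
        sketch[i][h_i(w)] += s_i(w) * annotation(w)
    """
    rng = random.Random(seed)
    hashes = [make_row_hashes(worlds, num_cols, rng) for _ in range(num_rows)]
    sketch = [[0] * num_cols for _ in range(num_rows)]
    for w in worlds:
        for i, (bucket, polarity) in enumerate(hashes):
            sketch[i][bucket[w]] += polarity[w] * annotation(w)
    return sketch, hashes


# Toy instance: 3 tuples, so the possible worlds are {0, 1}^3.
worlds = list(product((0, 1), repeat=3))


def annotation(w):
    # Hypothetical annotation: an arbitrary integer value per world.
    return w[0] + 2 * w[2]


sketch, hashes = init_sketch(worlds, annotation, num_rows=4, num_cols=5)
```

After `init_sketch` returns, each cell `sketch[i][j]` should equal the signed sum of the annotations of all worlds that row `i` hashes to bin `j`, which is the identity stated after initialization.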