% -*- root: main.tex -*-
\section{Notation}
\label{sec:notation}
The following notation is used to reason about sketching the world membership of a given tuple. We denote the set of all possible worlds by $\pw$. A sketch $\sketch$ can be viewed as an $\sketchRows \times \sketchCols$ matrix, i.e., a matrix with $\sketchRows$ rows and $\sketchCols$ columns. Each row of $\sketch$ is an estimate of the $\kDom$ frequency, across all possible worlds, of the tuple represented by $\sketch$.
To facilitate binning the worlds $\wVec \in \pw$, each row $i$ has two pairwise independent hash functions $\sketchHash{i}:\pw \to [B]$ and $\sketchPolar{i}:\pw \to \{-1,1\}$, where all functions are independent of one another. Finally, the function $\kMap{t} : \{0, 1\}^\numTup \rightarrow \kDom$ determines the tuple's $\kDom$ annotation in a given world.
\AR{I do not like this notation. I prefer vectors being typeset in bold, i.e. $\mathbf{w}$. $\wVec$ is good for writing on the board but it is more standard to bold vectors in linear algebra. Also the $\kDom$ values are not binned by $\sketchHash{i}$ but the actual $\wVec$s are.}
\AH{Done.}
%for each $i, j \in \sketchRows \text{ s.t. } i \neq j, \sketchHash{i}$ is independent of $\sketchHash{j}$ and $\sketchPolar{i}$ is independent of $\sketchPolar{j}$. Thus each row can be viewed as an independent estimation.
\AR{While in general I'm a fan of using English to define things, one of the exceptions is when you are defining a function. It would be better to explicitly state that $\sketchHash{i}:W\to [B]$ and $\sketchPolar{i}:W\to \{-1,1\}$. Of course for these definitions you need to define $W$ upfront.}
\AH{Done}
When the $\kDom$ value of a world $\wVec$ is updated, that value is first retrieved via $\kMap{t}$ and then multiplied by the output of the $i^{th}$ row's polarity function $\sketchPolar{i}$. The resulting product is then added to the current value of the bin to which $\sketchHash{i}$ maps $\wVec$. Formally:
$$\sketch[\sketchHash{i}(\wVec)] ~+=~ \sketchPolar{i}(\wVec) \times \kMap{t}(\wVec)$$
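As a small worked illustration of the update rule above, consider a single row $i$ under purely hypothetical values: $\numTup = 2$, two bins labeled $1$ and $2$, integer annotations, and all bins initially $0$. None of these values come from the construction itself; they are assumptions chosen only to make the arithmetic concrete.
$$\begin{array}{c|ccc}
\wVec & \sketchHash{i}(\wVec) & \sketchPolar{i}(\wVec) & \kMap{t}(\wVec) \\ \hline
(0,0) & 1 & +1 & 0 \\
(0,1) & 2 & -1 & 1 \\
(1,0) & 1 & -1 & 1 \\
(1,1) & 2 & +1 & 2
\end{array}$$
Applying the update once for each world gives
$$\sketch[1] = (+1)(0) + (-1)(1) = -1, \qquad \sketch[2] = (-1)(1) + (+1)(2) = 1.$$
Worlds that collide in a bin contribute with the signs given by $\sketchPolar{i}$, so their annotations may partially cancel.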
In a Tuple Independent Database (TIDB), a database $\relation$ contains $\numTup$ tuples and has $\numWorlds$ possible worlds $\pw$. We take $\pw = \{0, 1\}^\numTup$, so that a specific world is a vector $\wVec \in \{0, 1\}^\numTup$.
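For concreteness, a minimal example under the assumption (standard for TIDBs, but not stated above) that the $j$th coordinate of $\wVec$ indicates whether the $j$th tuple is present. With $\numTup = 3$,
$$\pw = \{0,1\}^3, \qquad |\pw| = 2^3 = 8, \qquad \wVec = (1,0,1),$$
where this particular $\wVec$ denotes the world containing the first and third tuples but not the second.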
\AR{I'm fine with $\kMap{t}$ being defined as a function instead of a vector in $\kDom^W$ but I'm not sure if one would be easier than the other to write arguments. I guess we can re-consider this later as it is defined as a macro.}
\AH{I too am unsure of which way would be best to go on this. I think originally we had proposed to define $\wVec$ as a mapping to the tuple's $\kDom$ annotation.}