% -*- root: main.tex -*-
\section{Instantiation}
\label{sec:instantiation}
\subsection{TIDB}
Consider a TIDB with $\numTup$ tuples in which every tuple $t$ is present with probability $\prob = \frac{1}{2}$. Because a TIDB follows set semantics, each tuple appears at most once in a possible world, so the vector $\genV$ can be defined as a bit vector in $\{0, 1\}^\numTup$ whose value represents a possible world and each of whose indices identifies a specific tuple $t$. Under these semantics, with $w_t$ denoting the index mapped to tuple $t$'s identity, $\genV$ can alternatively be viewed as a function
\begin{equation*}
\genV = \begin{cases}
1, &w_t = 1\\
0, &\text{otherwise}
\end{cases}
\end{equation*}
where a value of $1$ indicates that the tuple is present in the world represented by the bit vector, and $0$ indicates that it is absent.
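As a small illustration, assume a hypothetical TIDB with $\numTup = 3$ tuples $t_1, t_2, t_3$, where the names and the index assignment $w_{t_i} = i$ are chosen only for this example. The possible world that contains $t_1$ and $t_3$ but not $t_2$ would then be encoded as
\begin{equation*}
\genV = (1, 0, 1),
\end{equation*}
and the $2^3 = 8$ bit vectors of length $3$ correspond one-to-one to the possible worlds of this TIDB.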
In this representation, a few properties of $\genV$ immediately stand out. First, the length of $\genV$ equals the number of tuples in the TIDB, $|\genV| = \numTup$. Second, every entry of $\genV$ is either $0$ or $1$ and therefore equals its own square, so the L1 norm of $\genV$ coincides with its squared L2 norm. Combined with the assumption $\prob = \frac{1}{2}$, both quantities equal $\frac{\numTup}{2}$ in expectation,
\begin{equation*}
|\genV| = \numTup \wedge \prob = \frac{1}{2} \Rightarrow \norm{\genV}_1 = \norm{\genV}_2^2.
\end{equation*}
By \eqref{eq:b-cauchy}, this yields a bucket size of
\begin{equation*}
\sketchCols \leq 4\sqrt{\frac{\numTup}{2}} \cdot 2^{(\numTup/2)}.
\end{equation*}
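As a sanity check on the quantities above, assume $\numTup = 8$ and a world in which exactly $\frac{\numTup}{2} = 4$ tuples are present. Both the value of $\numTup$ and the particular vector are illustrative assumptions, not taken from the analysis. For $\genV = (1, 0, 1, 1, 0, 0, 1, 0)$, summing the entries and summing their squares give the same result,
\begin{equation*}
\norm{\genV}_1 = 1 + 1 + 1 + 1 = 4 = 1^2 + 1^2 + 1^2 + 1^2 = \norm{\genV}_2^2,
\end{equation*}
and substituting $\numTup = 8$ into the bound gives
\begin{equation*}
\sketchCols \leq 4\sqrt{\frac{8}{2}} \cdot 2^{(8/2)} = 4 \cdot 2 \cdot 16 = 128.
\end{equation*}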