paper-BagRelationalPDBsAreHard/binarybidb.tex

74 lines
6.3 KiB
TeX
Raw Normal View History

%root: main.tex
2020-06-26 17:27:52 -04:00
%!TEX root=./main.tex
2020-12-16 17:25:37 -05:00
\section{Background and Notation}\label{sec:background}
2022-02-07 12:09:43 -05:00
\subsection{Polynomial Definition and Terminology}
2022-02-21 17:13:01 -05:00
Given an index set $S$ over variables $X_\tup$ for $\tup\in S$, a (general) polynomial $\genpoly$ over $\inparen{X_\tup}_{\tup \in S}$ with individual degree $\hideg <\infty$
is formally defined as:
\begin{align}
2022-02-07 12:09:43 -05:00
\label{eq:sop-form}
\genpoly\inparen{\inparen{X_\tup}_{\tup\in S}}=\sum_{\vct{d}\in\{0,\ldots,\hideg\}^{S}} c_{\vct{d}}\cdot \prod_{\tup\in S}X_\tup^{d_\tup}&&\text{ where } c_{\vct{d}}\in \semN.
\end{align}
2022-02-07 12:09:43 -05:00
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Definition}[Standard Monomial Basis]\label{def:smb}
The term $\prod_{\tup\in S} X_\tup^{d_\tup}$ in \Cref{eq:sop-form} is a {\em monomial}. A polynomial $\genpoly\inparen{\vct{X}}$ is in standard monomial basis (\abbrSMB) when we keep only the terms with $c_{\vct{d}}\ne 0$ from \Cref{eq:sop-form}.
2022-02-07 12:09:43 -05:00
\end{Definition}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Unless othewise noted, we consider all polynomials to be in \abbrSMB representation.
2022-02-21 17:13:01 -05:00
When it is unclear, we use $\smbOf{\genpoly}~\inparen{\smbOf{\poly}}$ to denote the \abbrSMB form of a polynomial (lineage polynomial) $\genpoly~\inparen{\poly}$.
2022-02-07 12:09:43 -05:00
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2022-05-17 09:29:45 -04:00
We call a polynomial $\poly\inparen{\vct{X}}$ a \emph{\abbrCTIDB-lineage polynomial} (or simply lineage polynomial), if it is clear from context that there exists an $\raPlus$ query $\query$, \abbrCTIDB $\pdb$, and result tuple $\tup$ such that $\poly\inparen{\vct{X}} = \apolyqdt\inparen{\vct{X}}.$
2022-02-07 12:09:43 -05:00
2022-02-21 17:13:01 -05:00
\subsection{\abbrOneBIDB}\label{subsec:one-bidb}
2020-12-19 12:59:27 -05:00
\label{subsec:tidbs-and-bidbs}
2022-02-08 16:39:14 -05:00
\noindent A block independent database \abbrBIDB $\pdb'$ models a set of worlds each of which consists of a subset of the possible tuples $\tupset'$, where $\tupset'$ is partitioned into $\numblock$ blocks $\block_i$ and all $\block_i$ are independent random events. $\pdb'$ further constrains that all $\tup\in\block_i$ for all $i\in\pbox{\numblock}$ of $\tupset'$ be disjoint events. We refer to any monomial that includes $X_\tup X_{\tup'}$ for $\tup\neq\tup'\in\block_i$ as a \emph{cancellation}. We define next a specific construction of \abbrBIDB that is useful for our work.
2022-02-21 17:13:01 -05:00
\begin{Definition}[\abbrOneBIDB]\label{def:one-bidb}
2022-02-21 18:39:06 -05:00
Define a \emph{\abbrOneBIDB} to be the pair $\pdb' = \inparen{\bigtimes_{\tup\in\tupset'}\inset{0, \bound_\tup}, \bpd'},$ where $\tupset'$ is the set of possible tuples such that each $\tup \in \tupset'$ has a multiplicity domain of $\inset{0, \bound_\tup}$, with $\bound_\tup\in\mathbb{N}$. $\tupset'$ is partitioned into $\numblock$ independent blocks $\block_i,$ for $i\in\pbox{\numblock}$, of disjoint tuples. $\bpd'$ is characterized by the vector $\inparen{\prob_\tup}_{\tup\in\tupset'}$ where for every block $\block_i$, $\sum_{\tup \in \block_i}\prob_\tup \leq 1$. Given $W\in\onebidbworlds{\tupset'}$ and for $i\in\pbox{\numblock}$, let $\prob_i(W) = \begin{cases}
2022-02-21 17:13:01 -05:00
1 - \sum_{\tup\in\block_i}\prob_\tup & \text{if }W_\tup = 0\text{ for all }\tup\in\block_i\\
2022-05-18 17:51:12 -04:00
0 & \text{if there exists } \tup \neq \tup'\in\block_i; W_\tup, W_{\tup'}\neq 0\\
2022-02-21 18:39:06 -05:00
\prob_\tup & W_\tup \ne 0 \text{ for the unique } t\in B_i.\\
2022-02-21 17:13:01 -05:00
\end{cases}$
2022-02-21 18:39:06 -05:00
\noindent$\bpd'$ is the probability distribution across all worlds such that, given $W\in\bigtimes_{\tup \in \tupset'}\inset{0,\bound_\tup}$, $\probOf\pbox{\worldvec = W} = \prod_{i\in\pbox{\numblock}}\prob_{i}(W)$.
2022-02-21 17:13:01 -05:00
\footnote{
We slightly abuse notation here, denoting a world vector as $W$ rather than $\worldvec$ to distinguish between the random variable and the world instance. When there is no ambiguity, we will denote a world vector as $\worldvec$.}
\end{Definition}
2022-02-21 17:13:01 -05:00
2022-05-19 14:45:58 -04:00
Lineage polynomials for arbitrary deterministic $\gentupset'$ are constructed in a manner analogous to $1$-\abbrTIDB\xplural (see \Cref{fig:nxDBSemantics}), differing only in the base case.
In a $1$-\abbrTIDB, each tuple contributes a multiplicity of 0 or 1, and $\polyqdt{\rel}{\gentupset}{\tup} = X_\tup$.
In a $c$-\abbrTIDB, each expanded tuple $t_j$ contributes its corresponding multiplicity: $\polyqdt{\rel}{\gentupset}{\tup_j} = j\cdot X_\tup$. These semantics are fully detailed in \Cref{fig:lin-poly-bidb}.
2022-02-08 12:51:15 -05:00
2022-03-06 22:08:00 -05:00
\begin{Proposition}[\abbrCTIDB reduction]\label{prop:ctidb-reduct}
Given \abbrCTIDB $\pdb =$\newline $\inparen{\worlds, \bpd}$, let $\pdb' = \inparen{\onebidbworlds{\tupset'}, \bpd'}$ be the \emph{\abbrOneBIDB} obtained in the following manner: for each $\tup\in\tupset$, create block $\block_\tup = \inset{\intuple{\tup, j}_{j\in\pbox{\bound}}}$ of disjoint tuples, for all $j\in\pbox{\bound}$ where $\bound_{\tup_j} = j$ for each $\tup_j$ in $\tupset'$.
The probability distribution $\bpd'$ is the characterized by the vector $\vct{p} = \inparen{\inparen{\prob_{\tup, j}}_{\tup\in\tupset, j\in\pbox{\bound}}}$.
2022-05-18 17:51:12 -04:00
Then, $\mathcal{P}$ and $\mathcal{P}'$ are equivalent.
\end{Proposition}
We now define the reduced polynomial $\rpoly'$ of a \abbrOneBIDB.
2022-03-06 22:08:00 -05:00
\begin{Definition}[$\rpoly'$]\label{def:reduced-poly-one-bidb}
2022-05-18 17:51:12 -04:00
Given a polynomial $\poly'\inparen{\vct{X}}$ generated from a \abbrOneBIDB, let $\rpoly'\inparen{\vct{X}}$ denote the reduced $\poly'\inparen{\vct{X}}$ derived as follows: i) compute $\smbOf{\poly'\inparen{\vct{X}}}$ eliminating monomials with cross terms $X_{\tup}X_{\tup'}$ for $\tup\neq \tup' \in \block_i$ and ii) reduce all \emph{variable} exponents $e > 1$ to $1$.
\end{Definition}
2022-03-06 22:08:00 -05:00
Then given $\worldvec\in\inset{0,1}^{\tupset'}$ over the reduced \abbrOneBIDB of~\Cref{prop:ctidb-reduct}, the disjoint requirement and the semantics for constructing the lineage polynomial over a \abbrOneBIDB, $\poly'\inparen{\worldvec}$ is of the same structure as the reformulated polynomial $\refpoly{}\inparen{\worldvec}$ of step i) from~\Cref{def:reduced-poly}, which then implies that $\rpoly'$ is the reduced polynomial that results from step ii) of both~\Cref{def:reduced-poly} and~\Cref{def:reduced-poly-one-bidb}, and further that~\Cref{lem:tidb-reduce-poly} immediately follows for \abbrOneBIDB polynomials.
2022-04-20 09:45:11 -04:00
\begin{Lemma}\label{lem:bin-bidb-phi-eq-redphi}
Given any \emph{\abbrOneBIDB} $\pdb'$, $\raPlus$ query $\query$, and lineage polynomial
2022-04-26 09:02:09 -04:00
$\poly'\inparen{\vct{X}}=\poly'\pbox{\query,\tupset',\tup}\inparen{\vct{X}}$, it holds that \newline$
\expct_{\vct{W} \sim \pdassign'}\pbox{\poly'\inparen{\vct{W}}} = \rpoly'\inparen{\probAllTup}.
$
\end{Lemma}
2021-09-07 10:33:13 -04:00
2022-02-07 12:09:43 -05:00
2020-12-13 15:51:55 -05:00
2021-04-08 22:51:36 -04:00
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: