Changes to proof Lem 4.8

master
Aaron Huber 2022-03-06 22:08:00 -05:00
parent 6d17225b33
commit 9ff8aa220c
4 changed files with 38 additions and 38 deletions

@ -2,8 +2,8 @@
%!TEX root=./main.tex
\section{$1 \pm \epsilon$ Approximation Algorithm}\label{sec:algo}
In \Cref{sec:hard}, we showed that \Cref{prob:bag-pdb-poly-expected} cannot be solved in $\bigO{\qruntime{\optquery{\query},\tupset,\bound}}$ runtime. In light of this, we desire to produce and approximation algorithm that runs in time $\bigO{\qruntime{\optquery{\query},\tupset,\bound}}$. We do this by showing the result via circuits,
such that our approximation algorithm for this problem runs in $\bigO{\abs{\circuit}}$ for a very broad class of circuits, (thus affirming~\Cref{prob:intro-stmt}); see the discussion after \Cref{lem:val-ub} for more).
In \Cref{sec:hard}, we showed that \Cref{prob:bag-pdb-poly-expected} cannot be solved in $\bigO{\qruntime{\optquery{\query},\tupset,\bound}}$ runtime. In light of this, we desire to produce an approximation algorithm that runs in time $\bigO{\qruntime{\optquery{\query},\tupset,\bound}}$. We do this by showing the result via circuits,
such that our approximation algorithm for this problem runs in $\bigO{\abs{\circuit}}$ for a very broad class of circuits (thus affirming~\Cref{prob:intro-stmt}); see the discussion after \Cref{lem:val-ub} for more.
The following approximation algorithm applies to bag query semantics over both
\abbrCTIDB lineage polynomials and general \abbrBIDB lineage polynomials in practice, where for the latter we note that a $1$-\abbrTIDB is equivalently a \abbrBIDB (blocks are size $1$). Our experimental results (see~\Cref{app:subsec:experiment}) which use queries from the PDBench benchmark~\cite{pdbench} show a low $\gamma$ (see~\Cref{def:param-gamma}) supporting the notion that our bounds hold for general \abbrBIDB in practice.
@ -14,7 +14,7 @@ Corresponding proofs and pseudocode for all formal statements and algorithms
\subsection{Preliminaries and some more notation}
We now introduce definitions and notation related to circuits and polynomials that we will need to state our upper bound results. First we introduce the expansion $\expansion{\circuit}$ of circuit $\circuit$ which % encodes the reduced polynomial for $\polyf\inparen{\circuit}$ and is the basis
is used in our auxiliary algorithm~\Cref{alg:sample} for sampling monomials when computing the approximation. % (part of our approximation algorithm).
is used in our auxiliary algorithm \sampmon for sampling monomials when computing the approximation. % (part of our approximation algorithm).
\begin{Definition}[$\expansion{\circuit}$]\label{def:expand-circuit}
For a circuit $\circuit$, we define $\expansion{\circuit}$ as a list of tuples $(\monom, \coef)$, where $\monom$ is a set of variables and $\coef \in \domN$.
@ -78,7 +78,7 @@ Our approximation algorithm (\approxq pseudo code in \Cref{sec:proof-lem-approx-
is based on the following observation.
% The algorithm (\approxq detailed in \Cref{alg:mon-sam}) to prove \Cref{lem:approx-alg} follows from the following observation.
Given a lineage polynomial $\poly(\vct{X})=\polyf(\circuit)$ for circuit \circuit over
\abbrOneBIDB (recall that all \abbrCTIDB can be reduced to \abbrOneBIDB by~\Cref{def:ctidb-reduct}), we have: % can exactly represent $\rpoly(\vct{X})$ as follows:
\abbrOneBIDB (recall that all \abbrCTIDB can be reduced to \abbrOneBIDB by~\Cref{prop:ctidb-reduct}), we have: % can exactly represent $\rpoly(\vct{X})$ as follows:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -138,31 +138,36 @@ In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the abo
%\end{Corollary}
The restriction on $\gamma$ is satisfied by any
$1$-\abbrTIDB (where $\gamma=0$ in the equivalent $1$-\abbrBIDB of~\Cref{def:ctidb-reduct})
$1$-\abbrTIDB (where $\gamma=0$ in the equivalent $1$-\abbrBIDB of~\Cref{prop:ctidb-reduct})
as well as for all three queries of the PDBench \abbrBIDB benchmark (see \Cref{app:subsec:experiment} for experimental results). Further, we can also argue the following result:
\secrev{
\begin{Lemma}
\label{lem:c-TIDB-gamma}
Given \emph{\abbrOneBIDB} computed from the reduction of~\Cref{def:ctidb-reduct}, $\gamma\inparen{\circuit}\leq 1 - \inparen{c + 1}^{-\inparen{k-1}}$.
\label{lem:ctidb-gamma}
Given $\raPlus$ query $\query$ and \abbrCTIDB $\pdb$, let \circuit be the circuit computed by $\query\inparen{\tupset}$. Then, for the reduced \abbrOneBIDB $\pdb'$, there exists an equivalent circuit \circuit' obtained from $\query\inparen{\tupset'}$, such that $\gamma\inparen{\circuit'}\leq 1 - \inparen{\bound + 1}^{-\inparen{k-1}}$ with $\size\inparen{\circuit'} \leq \size\inparen{\circuit} + \numvar\cdot\inparen{2^{\ceil{\log{2\bound}}+ 1} - 1}$ and $\depth\inparen{\circuit'} = \depth\inparen{\circuit} + \ceil{\log{2\bound}}$.
\end{Lemma}
\begin{proof}[Proof of~\Cref{lem:c-TIDB-gamma}]
Let $\pdb' = \inparen{\onebidbworlds{\tupset'}, \pdb'}$ be the reduced \abbrOneBIDB and $\pdb = \inparen{\worlds, \pdb}$ the original \abbrCTIDB.
}
By~\Cref{def:ctidb-reduct}, $\pdb'$ is a \abbrOneBIDB.
By~\Cref{def:one-bidb}, a block $\block_\tup$ of $\pdb'$ has the property that $\sum_{\tup\in\tupset, j\in\pbox{\bound}}\prob_{\tup, j}\leq 1$. Then, if we consider the case of strict inequality, we have an extra possible outcome in block $\block_\tup$, the outcome when no tuple is present in a possible world. Let us denote this as $\tup_0$. Then there are at most $c + 1$ disjoint tuples in $\block_\tup$. We argue later that the case when $\tup_0$ is a possibility produces the worst case $\gamma$.
\secrev{
\begin{proof}[Proof of~\Cref{lem:ctidb-gamma}]
%Let $\pdb' = \inparen{\onebidbworlds{\tupset'}, \pdb'}$ be the reduced \abbrOneBIDB and $\pdb = \inparen{\worlds, \pdb}$ the original \abbrCTIDB.
The circuit \circuit' is built from \circuit in the following manner. For each input gate $\gate_i$ with $\gate_i.\val = X_\tup$, replace $\gate_i$ with the circuit \subcircuit for $\sum_{j = 1}^\bound j\cdot X_{\tup, j}$. We argue that \circuit' is a valid circuit by the following facts. Let $\pdb = \inparen{\worlds, \pdb}$ be the original \abbrCTIDB \circuit was generated from. Then, by~\Cref{prop:ctidb-reduct} there exists a reduced $\pdb' = \inparen{\onebidbworlds{\tupset'}, \pdb'}$, from which the conversion from \circuit to \circuit' follows. Both $\polyf\inparen{\circuit}$ and $\polyf\inparen{\circuit'}$ have the same expected multiplicity since (by~\Cref{prop:ctidb-reduct}) the distributions $\bpd$ and $\bpd'$ are equivalent and each $j\cdot\worldvec'_{\tup, j} = \worldvec_\tup$ for $\worldvec'\in\inset{0, 1}^{\bound\numvar}$ and $\worldvec\in\worlds$. Finally, note that the above conversion implies the claimed size and depth bounds of the lemma.
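As an illustrative sanity check of the size and depth bounds (a sketch that assumes \subcircuit is laid out as a balanced binary tree whose leaves are the constants $j$ and the variables $X_{\tup, j}$, with logarithms taken base $2$; this layout is our assumption and is not dictated by the construction), take $\bound = 2$. Then \subcircuit computes $1\cdot X_{\tup, 1} + 2\cdot X_{\tup, 2}$ using $2\bound = 4$ leaves, $\bound = 2$ multiplication gates, and $\bound - 1 = 1$ addition gate:
\begin{align*}
\size\inparen{\subcircuit} &= 2\bound + \bound + \inparen{\bound - 1} = 4\bound - 1 = 7 = 2^{\ceil{\log{2\bound}} + 1} - 1, &
\depth\inparen{\subcircuit} &= \ceil{\log{2\bound}} = 2.
\end{align*}
Under the same assumption, $\bound = 3$ gives $4\bound - 1 = 11 \leq 2^{\ceil{\log{6}} + 1} - 1 = 15$, so the size bound need not be tight.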
Let $\poly'\inparen{\vct{X}}$ be an aribitrary polynomial produced by $\query\inparen{\pdb'}$ with $\vct{X} = \inparen{X_{\tup, j}}_{\tup\in\tupset', j\in\pbox{0, \bound}}$ the set of variables in $\pdb'$. Let $m$ be an arbitrary monomial in $\poly'\inparen{\vct{X}}$ and $v_m$ be the set of variables appearing in $m$. We define a cross term to be any monomial $m$ such that there exists $j\neq j'\in\pbox{0, \bound}$ such that $X_{\tup, j}, X_{\tup, j'}\in v_m$.
The semantics of~\Cref{fig:lin-poly-bidb-redux} show that a new monomial product can only be generated by the $\join$ operator of $\raPlus$ queries. Further, a cross term may only be produced specifically when the join is a self join. The highest number of terms that can be produced by a self join of $\block_\tup$ is $\inparen{\bound + 1}^k$, the case for when all tuples join and $\sum_{\tup\in\tupset, j\in\pbox{\bound}}\prob_{\tup, \bound} < 1$ as noted above. For monomials $m\in\inset{\bigtimes_{i\in\pbox{k}, j\in\pbox{0, \bound}} X_{\tup, j_i}}$, there exist \emph{exaclty} $\inparen{\bound + 1}$ \emph{non}-cross terms, specifically $X_{\tup, j}^k$ for $j\in\pbox{0, \bound}$. Then there are exactly $\inparen{\bound + 1}^k - \inparen{\bound + 1}$ cross terms (cancellations). This implies that $\gamma\inparen{\circuit} = 1 - \frac{\inparen{\bound + 1}}{\inparen{\bound + 1}^k}$ for this case.
We now show that the case above is indeed the worst case. First, given a self join, it is always the case that $X_{\tup, j}^k$ will be in the output since all tuples join with themselves. Then, the most number of cancellations occurs when we have that all $X_{\tup, j}$ joins with all $X_{\tup, j'}$ for $j\neq j' \in \pbox{0, c}$. Finally, it is the case that $\bound^k - \bound \leq \inparen{\bound + 1}^k - \inparen{\bound + 1} = \sum_{i = 1}^k\binom{k}{i}c^i - \inparen{\bound - 1}$ for $\bound, k \in \mathbb{N}$, which implies that the worst case is when we have the `extra' tuple $\tup_0$ and all tuples joining, which is exactly the case above, producing the greatest $\gamma\inparen{\circuit}$ ratio.
Since the size of any block $\block$ is $\bound + 1$, it follows that $\gamma\inparen{\circuit}$ ratio for block $\block_\tup$ is the same when taken across all blocks of $\query\inparen{\pdb'}$, since the number of blocks $\numvar$ cancels out of the ratio calculations.%stays the same the number of blocks $\numvar$ results for one block $\block_\tup$ hold for the entire $\tupset'$, where the number of monomials is $\numvar\inparen{\bound + 1}^k$ and the number of non-cross terms is $\numvar\inparen{\bound + 1}$. Thus the multiplicative factor $\numvar$ (number of blocks) cancels out.% then the total number of monomials is the number of blocks $\numvar$ times $\bound + 1$ or $\numvar\cdot\inparen{\bound + 1}$.
Consider the list of expanded monomials $\expansion{\circuit}$ for \abbrCTIDB circuit \circuit. Let \monom be an arbitrary monomial with $\ell$ variables. Then $\monom = X_{\tup, 1}^{d_1},\ldots,X_{\tup, \ell}^{d_\ell}$ yields the set of monomials $\inparen{j_1\cdot X_{\tup, 1}^{d_1},\ldots, j_\ell\cdot X_{\tup, \ell}^{d_\ell}}_{j_1,\ldots, j_\ell \in \pbox{0, \bound}}$ in $\expansion{\circuit'}$. Observe that cancellations can only occur within each $X_{\tup, i}^{d_i}$, which implies that, considering only $X_{\tup, i}^{d_i}$, $\gamma \leq 1 - \inparen{\bound + 1}^{-\inparen{k - 1}}$, since among the $\inparen{\bound + 1}^k$ elements of $\inset{\bigtimes_{i\in\pbox{k}, j_i\in\pbox{0, \bound}}X_{j_i}}$ there are \emph{exactly} $\bound+1$ surviving elements with $j_1=\cdots=j_k$, i.e., $X_j^k$ for each $j\in\pbox{0, \bound}$. The remaining $\inparen{\bound + 1}^k - \inparen{\bound + 1}$ cross terms cancel.
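To make the count concrete (an illustrative instance only, with the values $\bound = 2$ and $k = 2$ chosen here for exposition), the indices $j_1, j_2\in\pbox{0, 2}$ give $\inparen{\bound + 1}^k = 9$ products $X_{j_1}X_{j_2}$, of which only $X_0^2$, $X_1^2$, and $X_2^2$ survive:
\begin{align*}
\gamma \leq \frac{\inparen{\bound + 1}^k - \inparen{\bound + 1}}{\inparen{\bound + 1}^k} = \frac{9 - 3}{9} = \frac{2}{3} = 1 - \inparen{\bound + 1}^{-\inparen{k - 1}}.
\end{align*}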
%By~\Cref{def:ctidb-reduct}, $\pdb'$ is a \abbrOneBIDB.
%By~\Cref{def:one-bidb}, a block $\block_\tup$ of $\pdb'$ has the property that $\sum_{\tup\in\tupset, j\in\pbox{\bound}}\prob_{\tup, j}\leq 1$. Then, if we consider the case of strict inequality, we have an extra possible outcome in block $\block_\tup$, the outcome when no tuple is present in a possible world. Let us denote this as $\tup_0$. Then there are at most $c + 1$ disjoint tuples in $\block_\tup$. We argue later that the case when $\tup_0$ is a possibility produces the worst case $\gamma$.
%
%Let $\poly'\inparen{\vct{X}}$ be an arbitrary polynomial produced by $\query\inparen{\pdb'}$ with $\vct{X} = \inparen{X_{\tup, j}}_{\tup\in\tupset', j\in\pbox{0, \bound}}$ the set of variables in $\pdb'$. Let $m$ be an arbitrary monomial in $\poly'\inparen{\vct{X}}$ and $v_m$ be the set of variables appearing in $m$. We define a cross term to be any monomial $m$ such that there exists $j\neq j'\in\pbox{0, \bound}$ such that $X_{\tup, j}, X_{\tup, j'}\in v_m$.
%
%The semantics of~\Cref{fig:lin-poly-bidb-redux} show that a new monomial product can only be generated by the $\join$ operator of $\raPlus$ queries. Further, a cross term may only be produced specifically when the join is a self join. The highest number of terms that can be produced by a self join of $\block_\tup$ is $\inparen{\bound + 1}^k$, the case for when all tuples join and $\sum_{\tup\in\tupset, j\in\pbox{\bound}}\prob_{\tup, \bound} < 1$ as noted above. For monomials $m\in\inset{\bigtimes_{i\in\pbox{k}, j\in\pbox{0, \bound}} X_{\tup, j_i}}$, there exist \emph{exactly} $\inparen{\bound + 1}$ \emph{non}-cross terms, specifically $X_{\tup, j}^k$ for $j\in\pbox{0, \bound}$. Then there are exactly $\inparen{\bound + 1}^k - \inparen{\bound + 1}$ cross terms (cancellations). This implies that $\gamma\inparen{\circuit} = 1 - \frac{\inparen{\bound + 1}}{\inparen{\bound + 1}^k}$ for this case.
%
%We now show that the case above is indeed the worst case. First, given a self join, it is always the case that $X_{\tup, j}^k$ will be in the output since all tuples join with themselves. Then, the most number of cancellations occurs when we have that all $X_{\tup, j}$ joins with all $X_{\tup, j'}$ for $j\neq j' \in \pbox{0, c}$. Finally, it is the case that $\bound^k - \bound \leq \inparen{\bound + 1}^k - \inparen{\bound + 1} = \sum_{i = 1}^k\binom{k}{i}c^i - \inparen{\bound - 1}$ for $\bound, k \in \mathbb{N}$, which implies that the worst case is when we have the `extra' tuple $\tup_0$ and all tuples joining, which is exactly the case above, producing the greatest $\gamma\inparen{\circuit}$ ratio.
%
%Since the size of any block $\block$ is $\bound + 1$, it follows that $\gamma\inparen{\circuit}$ ratio for block $\block_\tup$ is the same when taken across all blocks of $\query\inparen{\pdb'}$, since the number of blocks $\numvar$ cancels out of the ratio calculations.%stays the same the number of blocks $\numvar$ results for one block $\block_\tup$ hold for the entire $\tupset'$, where the number of monomials is $\numvar\inparen{\bound + 1}^k$ and the number of non-cross terms is $\numvar\inparen{\bound + 1}$. Thus the multiplicative factor $\numvar$ (number of blocks) cancels out.% then the total number of monomials is the number of blocks $\numvar$ times $\bound + 1$ or $\numvar\cdot\inparen{\bound + 1}$.
\end{proof}
\qed
}
We briefly connect the runtime in \Cref{eq:approx-algo-runtime} to the algorithm outline earlier (where we ignore the dependence on $\multc{\cdot}{\cdot}$, which is needed to handle the cost of arithmetic operations over integers). The $\size(\circuit)$ comes from the time take to run \onepass once (\onepass essentially computes $\abs{\circuit}(1,\ldots, 1)$ using the natural circuit evaluation algorithm on $\circuit$). We make $\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}$ many calls to \sampmon (each of which essentially traces $O(k)$ random sink to source paths in $\circuit$ all of which by definition have length at most $\depth(\circuit)$).
We briefly connect the runtime in \Cref{eq:approx-algo-runtime} to the algorithm outline earlier (where we ignore the dependence on $\multc{\cdot}{\cdot}$, which is needed to handle the cost of arithmetic operations over integers). The $\size(\circuit)$ comes from the time taken to run \onepass once (\onepass essentially computes $\abs{\circuit}(1,\ldots, 1)$ using the natural circuit evaluation algorithm on $\circuit$). We make $\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}$ many calls to \sampmon (each of which essentially traces $O(k)$ random sink to source paths in $\circuit$ all of which by definition have length at most $\depth(\circuit)$).
Finally, we address the $\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}$ term in the runtime. %In \Cref{susec:proof-val-up}, we show the following:
\begin{Lemma}
@ -181,7 +186,7 @@ Note that the above implies that with the assumption $\prob_0>0$ and $\gamma<1$
%\AH{Is it standard to assume that in the asymptotic notation above, $\error$ and $\delta$ are constant? Otherwise this does not uphold~\Cref{prob:intro-stmt}.}
Finally, note that by \Cref{prop:circuit-depth} and \Cref{lem:circ-model-runtime} for any $\raPlus$ query $\query$, there exists a circuit $\circuit^*$ for $\apolyqdt$ such that $\depth(\circuit^*)\le O_{|Q|}(\log{n})$ and $\size(\circuit)\le O_k\inparen{\qruntime{\query, \dbbase}}$. Using this along with \Cref{lem:val-ub}, \Cref{cor:approx-algo-const-p} and the fact that $n\le \qruntime{\query, \dbbase}$, we have the following corollary:
Finally, note that by \Cref{prop:circuit-depth} and \Cref{lem:circ-model-runtime} for any $\raPlus$ query $\query$, there exists a circuit $\circuit^*$ for $\apolyqdt$ such that $\depth(\circuit^*)\le O_{|Q|}(\log{n})$ and $\size(\circuit^*)\le O_k\inparen{\qruntime{\query, \tupset, \bound}}$. Using this along with \Cref{lem:val-ub}, \Cref{cor:approx-algo-const-p} and the fact that $n\le \qruntime{\query, \tupset, \bound}$, we have the following corollary:
\begin{Corollary}
\label{cor:approx-algo-punchline}
Let $\query$ be an $\raPlus$ query and $\pdb$ be a \emph{\abbrOneBIDB} for which $p_0>0$ and $\gamma<1$ (with $p_0,\gamma$ as in \Cref{cor:approx-algo-const-p}) are absolute constants. Let $\poly(\vct{X})=\apolyqdt$ for any result tuple $\tup$ with $\deg(\poly)=k$. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time $O_{k,|Q|,\error',\conf}\inparen{\qruntime{\optquery{\query}, \tupset, \bound}}$ (given $\query,\tupset$ and $p_i$ for each $i\in [n]$ that defines $\pd$).
@ -189,7 +194,7 @@ Let $\query$ be an $\raPlus$ query and $\pdb$ be a \emph{\abbrOneBIDB} with $p_0
% $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$. % for the case when $\circuit$ satisfies the specific conditions in \Cref{lem:val-ub}.
\end{Corollary}
Next, we note that the above result along with \Cref{lem:c-TIDB-gamma}
Next, we note that the above result along with \Cref{lem:ctidb-gamma}
answers \Cref{prob:big-o-joint-steps} in the affirmative as follows:
\begin{Corollary}
\label{cor:approx-algo-punchline-ctidb}
@ -199,7 +204,7 @@ Let $\query$ be an $\raPlus$ query and $\pdb$ be a \abbrCTIDB with $p_0>0$ (wher
\end{Corollary}
\secrev{
\begin{proof}[Proof of~\Cref{cor:approx-algo-punchline-ctidb}]
The proof follows by~\Cref{def:ctidb-reduct},~\Cref{lem:c-TIDB-gamma}, and~\Cref{cor:approx-algo-punchline}.
The proof follows by~\Cref{prop:ctidb-reduct},~\Cref{lem:ctidb-gamma}, and~\Cref{cor:approx-algo-punchline}.
\end{proof}
\qed
}

@ -77,7 +77,7 @@ We are now ready to present our main hardness result.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Theorem}\label{thm:mult-p-hard-result}
Let $\prob_0,\ldots,\prob_{2k}$ be $2k + 1$ distinct values in $(0, 1]$. Then computing $\rpoly_G^\kElem(\prob_i,\dots,\prob_i)$ (over all $i\in [2k+1]$ for arbitrary $G=(\vset,\edgeSet)$
Let $\prob_0,\ldots,\prob_{2k}$ be $2k + 1$ distinct values in $(0, 1]$. Then computing $\rpoly_G^\kElem(\prob_i,\dots,\prob_i)$ (over all $i\in [2k+1]$) for arbitrary $G=(\vset,\edgeSet)$
%and any $(2k+1)$ distinct values $\prob_i$ ($0\le i \le 2k$)
needs time $\bigOmega{\kmatchtime}$, assuming $\kmatchtime\ge \omega\inparen{\abs{\edgeSet}}$.
\end{Theorem}

@ -59,7 +59,7 @@ In this example, $\abs{\block_\tup} = \bound$ for all $\tup$.
We now make a meaningful connection between possible world semantics and world assignments on the lineage polynomial.
\AH{Not sure if I should change~\Cref{prop:expection-of-polynom} to \abbrCTIDB. The proof is for general \abbrBPDB, which is a stronger statement, but our work focuses mostly on \abbrCTIDB. Do I stick with \abbrCTIDB and modify the proof? Why?}
\begin{Proposition}[Expectation of polynomials]\label{prop:expection-of-polynom}
Given a \abbrBPDB $\pdb = (\Omega,\bpd)$, $\raPlus$ query $\query$, and lineage polynomial $\apolyqdt$ for arbitrary result tuple $\tup$, %$\semNX$-\abbrPDB $\pxdb = (\idb_{\semNX}',\pd')$ where $\rmod(\pxdb) = \pdb$,
we have (denoting $\randDB$ as the random variable over $\Omega$):

@ -67,9 +67,9 @@ Define a \emph{\abbrOneBIDB} to be the pair $\pdb' = \inparen{\bigtimes_{\tup\in
We slightly abuse notation here, denoting a world vector as $W$ rather than $\worldvec$ to distinguish between the random variable and the world instance. When there is no ambiguity, we will denote a world vector as $\worldvec$.}% $\worldvec\in\prod_{\tup\in\tupset'}\inset{0,\bound_\tup},\tup,~\tup'\in\block_i~:~\probOf\pbox{\worldvec_\tup, \worldvec_\tup'>0} = 0$.
\end{Definition}
We now present a reduction that is useful in deriving our results:
\Cref{fig:lin-poly-bidb} shows the lineage construction of $\poly'\inparen{\vct{X}}$ given $\raPlus$ query $\query$ for arbitrary deterministic $\gentupset'$. Note that the semantics differ from~\Cref{fig:nxDBSemantics} only in the base case.
\begin{Proposition}[\abbrCTIDB reduction]\label{def:ctidb-reduct}
\begin{Proposition}[\abbrCTIDB reduction]\label{prop:ctidb-reduct}
Given \abbrCTIDB $\pdb = \inparen{\worlds, \bpd}$, let $\pdb' = \inparen{\onebidbworlds{\tupset'}, \bpd'}$ be the \emph{\abbrOneBIDB} obtained in the following manner: for each $\tup\in\tupset$, create block $\block_\tup = \inset{\intup{\tup, j}_{j\in\pbox{\bound}}}$ of disjoint tuples, for all $j\in\pbox{\bound}$.% such that $X_{\tup, j}\in\inset{0,1}$.
The probability distribution $\bpd'$ is characterized by the vector $\vct{p} = \inparen{\inparen{\prob_{\tup, j}}_{\tup\in\tupset, j\in\pbox{\bound}}}$. % for $\tup\in\tupset$ with multiplicity $j$.
Then, the distributions $\mathcal{P}$ and $\mathcal{P}'$ are equivalent.
@ -78,8 +78,8 @@ Given \abbrCTIDB $\pdb = \inparen{\worlds, \bpd}$, let $\pdb' = \inparen{\onebid
%$\tup_j\geq1\implies \tup_{j'} = 0$.$\forall j, j' \in \pbox{\bound},\forall \tup\in\tupset, \tup_j\geq 1\implies \tup_{j'} = 0$ for any block $\block_\tup$.
\end{Proposition}
For $\poly\inparen{\vct{X}}$ generated from \abbrCTIDB $\pdb$, each $X_\tup\in\pbox{\bound}$, while, given $\poly'\inparen{\vct{X}}$ produced from the reduced \abbrOneBIDB $\pdb'$, each $X_{\tup, j}\in\inset{0, 1}$. %As previously noted, unlike $X_{\tup}\in\inset{0,\ldots,\bound}$ for $X_{\tup}\in\vars{\pdb}$, $X_{\tup, j}\in\inset{0,1}$ for $X_{\tup, j}\in\vars{\pdb'}$.
Hence, in the setting of \abbrOneBIDB, we have the following semantics for generating lineage polynomials in $\raPlus$ queries shown in~\Cref{fig:lin-poly-bidb-redux}. Note that the semantics for lineage polynomial construction only changes for the base case.
%For $\poly\inparen{\vct{X}}$ generated from \abbrCTIDB $\pdb$, each $X_\tup\in\pbox{\bound}$, while, given $\poly'\inparen{\vct{X}}$ produced from the reduced \abbrOneBIDB $\pdb'$, each $X_{\tup, j}\in\inset{0, 1}$. %As previously noted, unlike $X_{\tup}\in\inset{0,\ldots,\bound}$ for $X_{\tup}\in\vars{\pdb}$, $X_{\tup, j}\in\inset{0,1}$ for $X_{\tup, j}\in\vars{\pdb'}$.
%Hence, in the setting of \abbrOneBIDB, we have the following semantics for generating lineage polynomials in $\raPlus$ queries shown in~\Cref{fig:lin-poly-bidb}. Note that the semantics for lineage polynomial construction only changes for the base case.
We now define the reduced polynomial $\rpoly'$ of a \abbrOneBIDB.
\begin{figure}[t!]
@ -100,18 +100,13 @@ We now define the reduced polynomial $\rpoly'$ of a \abbrOneBIDB.
\end{align*}\\[-10mm]
\end{minipage}}
\caption{Construction of the lineage (polynomial) for an $\raPlus$ query $\query$ over $\gentupset'$.}
\label{fig:lin-poly-bidb-redux}
\label{fig:lin-poly-bidb}
\end{figure}
\AH{I feel disatisfied by the following items:\newline
i) The use of $X\in\vct{X}$ in~\Cref{fig:lin-poly-bidb-redux} is assuming a \emph{type}! This is an abuse of our designating $X\in\vct{X}$ to be typless/abstract.\newline
ii) I am not sure whether to use $\tupset'$ or $\gentupset'$ in~\Cref{fig:lin-poly-bidb-redux}. I am inclined to use $\gentupset'$ to denote arbitrary reduced \abbrOneBIDB.
\newline
iii) The typeset of $\gentupset'$ is a bit odd, with the apostrophe being higher; however, placing the apostrophe underneath the overline is too low.
}
\begin{Definition}[$\rpoly'$]\label{def:reduced-poly-redux}
Given a polynomial $\poly'\inparen{\vct{X}}$ generated from a \abbrOneBIDB produced from the reduction of~\Cref{def:ctidb-reduct} and let $\rpoly'\inparen{\vct{X}}$ denote the reduced form of $\poly'\inparen{\vct{X}}$ computed as follows: i) compute $\smbOf{\poly'\inparen{\vct{X}}}$, ii) reduce all \emph{variable} exponents $e > 1$ to $1$.
\begin{Definition}[$\rpoly'$]\label{def:reduced-poly-one-bidb}
Given a polynomial $\poly'\inparen{\vct{X}}$ generated from a \abbrOneBIDB, let $\rpoly'\inparen{\vct{X}}$ denote the reduced form of $\poly'\inparen{\vct{X}}$ derived as follows: i) compute $\smbOf{\poly'\inparen{\vct{X}}}$, eliminating all monomials with cross terms $X_{\tup}X_{\tup'}$ for $\tup\neq \tup' \in \block_i$, and ii) reduce all \emph{variable} exponents $e > 1$ to $1$.
\end{Definition}
Then given $\worldvec\in\inset{0,1}^{\tupset'}$, the disjoint requirement and the semantics for constructing the lineage polynomial over a \abbrOneBIDB, $\poly'\inparen{\worldvec}$ is of the same structure as the reformulated polynomial $\refpoly{}\inparen{\worldvec}$ of step i) from~\Cref{def:reduced-poly}, which then implies that $\rpoly'$ is the reduced polynomial that results from step ii) of~\Cref{def:reduced-poly}, and further that~\Cref{lem:tidb-reduce-poly} immediately follows for \abbrOneBIDB polynomials.
Then given $\worldvec\in\inset{0,1}^{\tupset'}$ over the reduced \abbrOneBIDB of~\Cref{prop:ctidb-reduct}, the disjoint requirement and the semantics for constructing the lineage polynomial over a \abbrOneBIDB, $\poly'\inparen{\worldvec}$ is of the same structure as the reformulated polynomial $\refpoly{}\inparen{\worldvec}$ of step i) from~\Cref{def:reduced-poly}, which then implies that $\rpoly'$ is the reduced polynomial that results from step ii) of both~\Cref{def:reduced-poly} and~\Cref{def:reduced-poly-one-bidb}, and further that~\Cref{lem:tidb-reduce-poly} immediately follows for \abbrOneBIDB polynomials.
\begin{Lemma}
Given any %\abbrCTIDB $\pdb$, its reduced counterpart
\emph{\abbrOneBIDB} $\pdb'$, $\raPlus$ query $\query$, and lineage polynomial