Changes to S.2.
parent
99cf4a38e1
commit
e48bbb27c6
|
@ -26,13 +26,14 @@ When it is unclear, we use $\smbOf{\poly}$ to denote the \abbrSMB form of a poly
|
|||
The degree of polynomial $\poly(\vct{X})$ is the largest $\sum_{i=1}^n d_i$ such that $c_{(d_1,\dots,d_n)}\ne 0$. % maximum sum of exponents, over all monomials in $\smbOf{\poly(\vct{X})}$.
|
||||
\end{Definition}
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
||||
\AH{The example needs to change to avoid ambiguity between different definitions of \emph{degree}. Our definition is the one we want, since we can have two different tuples join without any tuple joining with itself, and the degree would then be $2$, rather than $1$, for example.}
|
||||
The degree of the polynomial $X^2+2XY+Y^2$ is $2$.
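To check this against the definition (a worked reading, taking $\vct{X} = (X, Y)$ for illustration), the nonzero coefficients are
\[
c_{(2,0)} = 1, \qquad c_{(1,1)} = 2, \qquad c_{(0,2)} = 1,
\]
and every exponent tuple $(d_1, d_2)$ with a nonzero coefficient satisfies $d_1 + d_2 = 2$, so the largest such sum is $2$.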
|
||||
Product terms in lineage arise only from join operations (\Cref{fig:nxDBSemantics}), so intuitively, the degree of a lineage polynomial is analogous to the largest number of joins in any clause of the UCQ query that created it.
|
||||
In this paper we consider only finite degree polynomials.
|
||||
We call a polynomial $\poly\inparen{\vct{X}}$ a \emph{\bi-lineage polynomial} (resp., \emph{\ti-lineage polynomial}, or simply lineage polynomial), if there exists a \AH{Which formalism? UCQ?}$\raPlus$ query $\query$, \bi $\pxdb$ (\ti $\pxdb$, or $\semNX$-PDB $\pxdb$), and tuple $\tup$ such that $\poly\inparen{\vct{X}} = \query(\pxdb)(\tup)$.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\AH{There was reviewer disgruntlement with how we discussed modding and all in the next few definitions. Need to revisit these comments and adjust accordingly.}
|
||||
\begin{Definition}[Modding with a set]\label{def:mod-set}
|
||||
Let $S$ be a {\em set} of polynomials over $\vct{X}$. Then $\poly(\vct{X})\mod{S}$ is the polynomial obtained by taking the mod of $\poly(\vct{X})$ over {\em all} polynomials in $S$ (order does not matter).
|
||||
\end{Definition}
|
||||
|
@ -158,6 +159,7 @@ to the variables $\vct{X}$. Intuitively, \Cref{lem:exp-poly-rpoly} states that w
|
|||
\begin{Corollary}\label{cor:expct-sop}
|
||||
If $\poly$ is a \bi-lineage polynomial already in \abbrSMB, then the expectation of $\poly$, i.e., $\expct\pbox{\poly} = \rpoly\left(\prob_1,\ldots, \prob_\numvar\right)$, can be computed in $\bigO{\size\inparen{\poly}}$ time, where $\size\inparen{\poly}$ (\Cref{def:size}) is proportional to the total number of multiplication/addition operators in $\poly$.
|
||||
\end{Corollary}
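As a small worked instance (assuming, for illustration only, a \ti with two independent $\{0,1\}$-valued variables $X_1, X_2$, and reading $\rpoly$ as $\poly$ with every exponent greater than $1$ reduced to $1$), take $\poly = X_1^2 + X_1X_2$. Since $X_1^2 = X_1$ for a $\{0,1\}$-valued variable, linearity of expectation and independence give
\[
\expct\pbox{\poly} = \expct\pbox{X_1} + \expct\pbox{X_1}\cdot\expct\pbox{X_2} = \prob_1 + \prob_1\prob_2,
\]
which is exactly $\rpoly\left(\prob_1, \prob_2\right)$ for $\rpoly = X_1 + X_1X_2$, obtained in a single pass over the monomials of $\poly$.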
|
||||
\AH{Maybe state in terms of $\abs{\circuit}$.}
|
||||
%\AH{What if $\poly$ is not in \abbrSMB form?}
|
||||
|
||||
|
||||
|
|
|
@ -11,9 +11,9 @@ We represent query polynomials via {\em arithmetic circuits}~\cite{arith-complex
|
|||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\begin{Definition}[Circuit]\label{def:circuit}
|
||||
A circuit $\circuit$ is a Directed Acyclic Graph (DAG) whose source gates (in degree of $0$) consist of elements in either $\domN$ or $\vct{X}$. The internal gates and (the single) sink gate of $\circuit$ (corresponding to the result tuple $t$) have binary input and are either sum ($\circplus$) or product ($\circmult$) gates.
|
||||
A circuit $\circuit$ is a Directed Acyclic Graph (DAG) whose source gates (in-degree $0$) consist of elements in either $\domN$ or $\vct{X}$. The internal gates have binary input and are either sum ($\circplus$) or product ($\circmult$) gates.
|
||||
%
|
||||
Each node in a circuit $\circuit$ has the following members: \type, \val, \vpartial, \vari{input}, \degval and \vari{Lweight}, \vari{Rweight}, where \type is the type of value stored in the gate (one of $\{\circplus, \circmult, \var, \tnum\}$), \val is the value stored (a constant or variable), and \vari{input} is the list of the gate's inputs. We use $\circuit_\linput$ to denote the left input and $\circuit_\rinput$ the right input of the sink of circuit $\circuit$.
|
||||
Each internal node in a circuit $\circuit$ has the following members: \type, \vpartial, \vari{input}, \degval, \vari{Lweight}, and \vari{Rweight}, where \type is the type of value stored in the gate (one of $\{\circplus, \circmult, \var, \tnum\}$) and \vari{input} is the list of the gate's inputs. The source gates have an additional member \val, which holds the value stored (constant or variable). We use $\circuit_\linput$ to denote the left input and $\circuit_\rinput$ the right input of a sink of circuit $\circuit$.
|
||||
%The member \degval holds the degree of \circuit.
|
||||
When the underlying DAG is a tree (with edges pointing towards the root), we will refer to the structure as an expression tree \etree. Note that in such a case, the root of \etree is analogous to the sink of \circuit.
|
||||
\end{Definition}
|
||||
|
@ -22,7 +22,7 @@ When the underlying DAG is a tree (with edges pointing towards the root), we wil
|
|||
|
||||
As stated in \Cref{def:circuit}, every internal node has at most two in-edges, is labeled as an addition or a multiplication node, and has no limit on its outdegree.
|
||||
Note that if we limit the outdegree to one, then we get expression trees.
|
||||
We ignore the fields \vari{partial}, \vari{Lweight}, and \vari{Rweight} until \Cref{sec:algo}.\AH{We omit degree here too, which \emph{I think} is used only in appendix proofs.}
|
||||
We ignore the fields \vari{partial}, \degval, \vari{Lweight}, and \vari{Rweight} until \Cref{sec:algo}.\AH{We omit degree here too, which \emph{I think} is used only in appendix proofs.}
|
||||
|
||||
|
||||
\begin{Example}
|
||||
|
|
|
@ -5,7 +5,7 @@
|
|||
|
||||
\subsection{Probabilistic Databases}
|
||||
|
||||
We focus primarily on set-\abbrPDB inputs in this section, but as noted in \cref{sec:intro-rewrite-070921}, this is not limiting.
|
||||
The setting used in this section is primarily that of a bag-\abbrPDB query with set-\abbrPDB inputs. Recall that, as noted in \cref{sec:intro-rewrite-070921}, this is not limiting.
|
||||
|
||||
An \textit{incomplete database} $\idb$ is a set of deterministic databases $\db$ called possible worlds.
|
||||
Denote the schema of $\db$ as $\sch(\db)$. A \textit{probabilistic database} $\pdb$ is a pair $(\idb, \pd)$ where $\idb$ is an incomplete database and $\pd$ is a probability distribution over $\idb$. Queries over probabilistic databases are evaluated using the so-called possible world semantics. Under the possible world semantics, the result of a query $\query$ over an incomplete database $\idb$ is the set of query answers produced by evaluating $\query$ over each possible world: $\query(\idb) = \comprehension{\query(\db)}{\db \in \idb}$.
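As a minimal illustration (the two worlds $\db_1$ and $\db_2$ are hypothetical and serve only to instantiate the definition), let $\idb = \{\db_1, \db_2\}$. Then
\[
\query(\idb) = \{\query(\db_1),\ \query(\db_2)\},
\]
that is, the query is evaluated once per possible world and the answers are collected into a set.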
|
||||
|
@ -47,7 +47,7 @@ A \emph{\ti} is a \bi where each block contains exactly one tuple.
|
|||
\Cref{subsec:supp-mat-ti-bi-def} explains \tis and \bis in greater detail.
|
||||
%
|
||||
In a \bi (and by extension a \ti) $\pxdb$, tuples are partitioned into $\ell$ blocks $\block_1, \ldots, \block_\ell$ where tuple $t_{i,j} \in \block_i$ is associated with a probability $\prob_{\tup_{i,j}} = \probOf[X_{i,j} = 1]$, and is annotated with a unique variable $X_{i,j}$.\footnote{
|
||||
Although only a single independent, $[\abs{\block_i}+1]$-valued variable is customarily used per block, we decompose it into $\abs{\block_i}$ correlated $\{0,1\}$-valued variables per block that can be used directly in polynomials (without an indicator function). For $t_j \in b_i$, the event $(X_{i,j} = 1)$ corresponds to the event $(X_i = j)$ in the customary annotation scheme.
|
||||
Although only a single independent, $[\abs{\block_i}+1]$-valued variable is customarily used per block, we decompose it into $\abs{\block_i}$ correlated $\{0,1\}$-valued variables per block that can be used directly in polynomials (without an indicator function). For $t_{i,j} \in \block_i$, the event $(X_{i,j} = 1)$ corresponds to the event $(X_i = j)$ in the customary annotation scheme.
|
||||
}
|
||||
Because blocks are independent and tuples from the same block are disjoint, the probabilities $\prob_{\tup_{i,j}}$ and the blocks induce the probability distribution $\pd$ of $\pxdb$.
|
||||
We will write a \bi-lineage polynomial $\poly(\vct{X})$ for a \bi with $\ell$ blocks as
|
||||
|
|