Ported some defs from S4 to S2; capitalized variables.

master
Aaron Huber 2020-12-18 11:39:38 -05:00
parent 634d6a0048
commit a8c399325e
3 changed files with 33 additions and 38 deletions

View File

@ -9,40 +9,22 @@ In~\cref{sec:hard}, we showed that computing the expected multiplicity of a comp
First, let us introduce some useful definitions and notation related to polynomials and their representations. For illustrative purposes in the definitions below, we will use the following {\em bivariate} polynomial:
\begin{equation}
\label{eq:poly-eg}
\poly(x,y) = 2x^2 + 3xy - 2y^2.
\poly(X, Y) = 2X^2 + 3XY - 2Y^2.
\end{equation}
\AR{The definition from this and my next comments are "new"-- they might be better off in the prelims section and moved to later in this section. Am keeping all of them in one place for easy lookup for now.}
\begin{Definition}[Variables in a monomial]\label{def:vars}
Given a monomial $v$, we use $\var(v)$ to denote the set of variables in $v$.
\end{Definition}
For example the monomial $3xy$ in the polynomial in~\cref{eq:poly-eg} has $\var(3xy)=\inset{x,y}$.
For example the monomial $XY$ has $\var(XY)=\inset{X,Y}$.
%\AH{@atri, just a heads up. I had defined \emph{monomial} as the variables exclusively, i.e., without the coefficient. However, it appears that @lordpretzel removed that definition, but I just wanted to mention this so that \emph{hopefully} we are consistent in our language as to what we mean by the term monomial.}
%\AR{I think its OK: it not a big difference and I don't think the readers will get confused-- if we are worried we can always add a disclaimer saying we might include the coefficient or nont dependning on the context}.
\begin{definition}[Modding with a set]\label{def:mod-set}
Let $S$ be a {\em set} of polynomials over $\vct{X}$. Then $\poly(\vct{X})\mod{S}$ is the polynomial obtained by taking the mod of $\poly(\vct{X})$ over {\em all} polynomials in $S$ (the order does not matter).
\end{definition}
For example when $S_0=\inset{x^2-x,y^2-y}$, taking the polynomial in~\cref{eq:poly-eg} mod $S_0$, we get $2x+3xy-2y$.
\begin{Definition}\label{def:mod-set-polys}
Given the set of BIDB variables $\inset{X_{b,i}}$, define
\[\mathcal{B}=\inset{X_{b,i}\cdot X_{b,j}|\text{ for every block } b \text{and } i\ne j},\]
\[\mathcal{T}=\inset{X_{b,i}^2-X_{b,i}|\text{ for every block } b \text{and } i}.\]
\end{Definition}
%\begin{Definition}[Expression Tree]\label{def:express-tree}
%An expression tree $\etree$ is a binary %an ADT logically viewed as an n-ary
%tree, whose internal nodes are from the set $\{+, \times\}$, with leaf nodes being either from the set $\mathbb{R}$ $(\tnum)$ or from the set of monomials $(\var)$. The members of $\etree$ are \type, \val, \vari{partial}, \vari{children}, and \vari{weight}, where \type is the type of value stored in the node $\etree$ (i.e. one of $\{+, \times, \var, \tnum\}$, \val is the value stored, and \vari{children} is the list of $\etree$'s children where $\etree_\lchild$ is the left child and $\etree_\rchild$ the right child. Remaining fields hold values whose semantics we will fix later. When $\etree$ is used as input of ~\Cref{alg:mon-sam} and ~\Cref{alg:one-pass}, the values of \vari{partial} and \vari{weight} will not be set. %SEMANTICS FOR \etree: \vari{partial} is the sum of $\etree$'s coefficients , n, and \vari{weight} is the probability of $\etree$ being sampled.
%\end{Definition}
\AR{Something to check/square out: we have been using both $X_{b,j}$ and $X_1,\dots,X_n$ for vars in BIDB-- I think this is OK as long as we explicitly talk about these two notations and how we might switch between them. Or we decide not to...}
\AR{Some of these definitions have been pulled to the prelims section. Another pass is needed to sync up these occurrences. Leaving them in for now.}
\begin{Definition}[Expression Tree]\label{def:express-tree}
An expression tree $\etree$ is a binary %an ADT logically viewed as an n-ary
tree, whose internal nodes are from the set $\{+, \times\}$, with leaf nodes being either from the set $\mathbb{R}$ $(\tnum)$ or from the set of monomials $(\var)$. The members of $\etree$ are \type, \val, \vari{partial}, \vari{children}, and \vari{weight}, where \type is the type of value stored in the node $\etree$ (i.e. one of $\{+, \times, \var, \tnum\}$, \val is the value stored, and \vari{children} is the list of $\etree$'s children where $\etree_\lchild$ is the left child and $\etree_\rchild$ the right child. Remaining fields hold values whose semantics we will fix later. When $\etree$ is used as input of ~\Cref{alg:mon-sam} and ~\Cref{alg:one-pass}, the values of \vari{partial} and \vari{weight} will not be set. %SEMANTICS FOR \etree: \vari{partial} is the sum of $\etree$'s coefficients , n, and \vari{weight} is the probability of $\etree$ being sampled.
\end{Definition}
Note that $\etree$ need not encode an expression in the standard monomial basis. For instance, $\etree$ could represent a compressed form of the polynomial in~\cref{eq:poly-eg}, such as $(x + 2y)(2x - y)$.
%Note that $\etree$ need not encode an expression in the standard monomial basis. For instance, $\etree$ could represent a compressed form of the polynomial in~\cref{eq:poly-eg}, such as $(x + 2y)(2x - y)$.
\begin{Definition}[$\polyf(\cdot)$]\label{def:poly-func}
Denote $\polyf(\etree)$ to be the function that takes as input expression tree $\etree$ and outputs its corresponding polynomial. $poly(\cdot)$ is recursively defined on $\etree$ as follows, where $\etree_\lchild$ and $\etree_\rchild$ denote the left and right child of $\etree$ respectively.
@ -66,10 +48,10 @@ Denote $\polyf(\etree)$ to be the function that takes as input expression tree $
Note that addition and multiplication above follow the standard interpretation over polynomials.
%Specifically, when adding two monomials whose variables and respective exponents agree, the coefficients corresponding to the monomials are added and their sum is multiplied to the monomial. Multiplication here is denoted by concatenation of the monomial and coefficient. When two monomials are multiplied, the product of each corresponding coefficient is computed, and the variables in each monomial are multiplied, i.e., the exponents of like variables are added. Again we notate this by the direct product of coefficient product and all disitinct variables in the two monomials, with newly computed exponents.
\begin{Definition}[Expression Tree Set]\label{def:express-tree-set}$\etreeset{\smb}$ is the set of all possible expression trees $\etree$, such that $poly(\etree) = \poly(\vct{X})$.
\end{Definition}
For the polynomial in~\cref{eq:poly-eg}, $\etreeset{\smb}$ would include the following (represented as their corresponding expression trees): $2x^2 + 3xy - 2y^2, (x + 2y)(2x - y), x(2x - y) + 2y(2x - y), 2x(x + 2y) - y(x + 2y)$. Note that \cref{def:express-tree-set} implies that for any expression tree $\etree$, we have $\etree \in \etreeset{poly(\etree)}$.
%\begin{Definition}[Expression Tree Set]\label{def:express-tree-set}$\etreeset{\smb}$ is the set of all possible expression trees $\etree$, such that $poly(\etree) = \poly(\vct{X})$.
%\end{Definition}
%
%For the polynomial in~\cref{eq:poly-eg}, $\etreeset{\smb}$ would include the following (represented as their corresponding expression trees): $2x^2 + 3xy - 2y^2, (x + 2y)(2x - y), x(2x - y) + 2y(2x - y), 2x(x + 2y) - y(x + 2y)$. Note that \cref{def:express-tree-set} implies that for any expression tree $\etree$, we have $\etree \in \etreeset{poly(\etree)}$.
\begin{Definition}[Expanded T]\label{def:expand-tree}
@ -97,7 +79,7 @@ $\expandtree{\etree}$ is the (pure) sum of products expansion of $\etree$, which
\begin{Example}\label{example:expr-tree-T}
Consider the factorized representation $(x + 2y)(2x - y)$ of the polynomial in~\cref{eq:poly-eg}. Its expression tree $\etree$ is illustrated in Figure ~\ref{fig:expr-tree-T}. The pure expansion of the product is $2x^2 - xy + 4xy - 2y^2$ and the $\expandtree{\etree}$ is $[(2, x^2), (-1, xy), (4, xy), (-2, y^2)]$.
Consider the factorized representation $(X+ 2Y)(2X - Y)$ of the polynomial in~\cref{eq:poly-eg}. Its expression tree $\etree$ is illustrated in Figure ~\ref{fig:expr-tree-T}. The pure expansion of the product is $2X^2 - XY + 4XY - 2Y^2$ and the $\expandtree{\etree}$ is $[(2, X^2), (-1, XY), (4, XY), (-2, Y^2)]$.
\end{Example}
@ -144,7 +126,7 @@ For any expression tree $\etree$, the corresponding
{\em positive tree}, denoted $\abs{\etree}$ obtained from $\etree$ as follows. For each leaf node $\ell$ of $\etree$ where $\ell.\type$ is $\tnum$, update $\ell.\vari{value}$ to $|\ell.\vari{value}|$. %value $\coef$ of each coefficient leaf node in $\etree$ is set to %$\coef_i$ in $\etree$ is exchanged with its absolute value$|\coef|$.
\end{Definition}
Using the same factorization from ~\cref{example:expr-tree-T}, $poly(\abs{\etree}) = (x + 2y)(2x + y) = 2x^2 +xy +4xy + 2y^2 = 2x^2 + 5xy + 2y^2$. Note that this \textit{is not} the same as the polynomial from~\cref{eq:poly-eg}.
Using the same factorization from ~\cref{example:expr-tree-T}, $poly(\abs{\etree}) = (X + 2Y)(2X + Y) = 2X^2 +XY +4XY + 2Y^2 = 2X^2 + 5XY + 2Y^2$. Note that this \textit{is not} the same as the polynomial from~\cref{eq:poly-eg}.
\begin{Definition}[Evaluation]\label{def:exp-poly-eval}
Given an expression tree $\etree$ and $\vct{v} \in \mathbb{R}^\numvar$, $\etree(\vct{v}) = poly(\etree)(\vct{v})$.

View File

@ -65,13 +65,24 @@ $\poly(\vct{X})$ = $\poly(X_{\block_1, 1},\ldots, X_{\block_1, \abs{\block_1}},$
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{definition}[Modding with a set]\label{def:mod-set}
Let $S$ be a {\em set} of polynomials over $\vct{X}$. Then $\poly(\vct{X})\mod{S}$ is the polynomial obtained by taking the mod of $\poly(\vct{X})$ over {\em all} polynomials in $S$ (the order does not matter).
\end{definition}
For example when $S_0=\inset{X^2-X, Y^2-Y}$, taking the polynomial in~\cref{eq:poly-eg} mod $S_0$, we get $2X+3XY-2Y$.
\begin{Definition}\label{def:mod-set-polys}
Given the set of BIDB variables $\inset{X_{b,i}}$, define
\[\mathcal{B}=\inset{X_{b,i}\cdot X_{b,j}|\text{ for every block } b \text{and } i\ne j},\]
\[\mathcal{T}=\inset{X_{b,i}^2-X_{b,i}|\text{ for every block } b \text{and } i}.\]
\end{Definition}
\begin{Definition}[Reduced \bi Polynomials]\label{def:reduced-bi-poly}
Let $\poly(\vct{X})$ be a \bi-lineage polynomial.
The reduced form $\rpoly(\vct{X})$ of $\poly(\vct{X})$ is defined as
\begin{equation*}
\rpoly(\vct{X}) = \smbOf{\poly(\vct{X})} \mod X_i^2 - X_i \mod X_{\block_s, t}X_{\block_s, u}
\rpoly(\vct{X}) = \smbOf{\poly(\vct{X})} \mod \mathcal{T} \mod \mathcal{B}%X_i^2 - X_i \mod X_{\block_s, t}X_{\block_s, u}
\end{equation*}
for all $i$ in $[\numvar]$ and for all $s$ in $\ell$, such that for all $t, u$ in $[\abs{\block_s}]$, $t \neq u$.
%for all $i$ in $[\numvar]$ and for all $s$ in $\ell$, such that for all $t, u$ in $[\abs{\block_s}]$, $t \neq u$.
\end{Definition}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -125,7 +136,7 @@ Note the following fact:
\]
\end{Proposition}
The proof is in~\Cref{sec:proofs-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -147,7 +158,7 @@ Note that in the preceding lemma, we have assigned $\vct{p}$
%(introduced in \Cref{subsec:def-data})
to the variables $\vct{X}$. Intuitively, \Cref{lem:exp-poly-rpoly} states that when we replace each variable $X_i$ with its probability $\prob_i$ in the reduced form of a \bi-lineage polynomial and evaluate the resulting expression in $\mathbb{R}$, then the result is the expectation of the polynomial.
The proof for~\Cref{lem:exp-poly-rpoly} can be seen in~\Cref{sec:proofs-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -161,7 +172,7 @@ If $\poly$ is a \bi-lineage polynomial, then the expectation of $\poly$, i.e., $
\end{Corollary}
\AH{What if $\poly$ is not in \abbrSMB form?}
We defer the proof for~\Cref{cor:expct-sop} in~\Cref{sec:proofs-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

View File

@ -51,6 +51,8 @@ We use $\evald{\cdot}{\db}$ to denote the result of evaluating query $\query$ ov
Let $\semNX$ denote the set of polynomials over variables $\vct{X}$ with natural number co-efficients and exponents.
Consider now the semiring $(\semNX, +, \cdot, 0, 1)$ whose domain is $\semNX$ and addition and multiplication are standard addition and multiplication of polynomials. We will utilize $\semNX$-databases $\db$ paired with probability distributions to represent $\semN$-PDBs.\BG{Need more motivation?} To justify the use of $\semNX$-databases, we need to show that we can encode any $\semN$-PDB in this way and that the query semantics over this representation coincides with query semantics over $\semN$-PDB. For that it will be opportune to define representation systems for $\semN$-PDBs.\BG{cite}
Before we proceed, unless otherwise mentioned, all subsequent proofs for~\Cref{sec:background} can be found in~\Cref{sec:proofs-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Definition}[Representation System]\label{def:representation-syste}
A representation system for $\semN$-PDBs is a tuple $(\reprs, \rmod)$ where $\reprs$ is a set of representations and $\rmod$ associates with each $\repr \in \reprs$ an $\semN$-PDB $\pdb$. We say that a representation system is \emph{closed} under a class of queries $\qClass$ if for any query $\query \in \qClass$ we have:
@ -91,7 +93,7 @@ Importantly, as the following proposition shows, any finite $\semN$-PDB can be e
$\semNX$-PDBs are a complete representation system for $\semN$-PDBs that is closed under $\raPlus$ queries.
\end{Proposition}
The proof can be seen in~\Cref{sec:proofs-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -110,7 +112,7 @@ Since $\semNX$-PDBs $\pxdb$ are a complete representation system for $\semN$-PDB
\[ \expct_{\idb \sim \pd}[\query(\db)(t)] = \expct_{\vct{X} \sim \pd'}\pbox{\poly(\vct{X})} \]
\end{Proposition}
The proof can be seen in~\Cref{sec:proofs-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%