Oliver's notes

2020-06-26 17:27:52 -04:00 · 0d58ec08b7
parent 6e8a7e8027
commit 0d58ec08b7
3 changed files with 49 additions and 5 deletions
--- a/main.tex
+++ b/main.tex
@ -5,7 +5,7 @@

 \usepackage{comment}
 \usepackage{amsmath}
-\usepackage{amssymb}
+% \usepackage{amssymb}
 %\let\proof\relax
 %

--- a/poly-form.tex
+++ b/poly-form.tex
@ -1,9 +1,17 @@
 %root: main.tex
+%!TEX root = ./main.tex
+
 \section{Polynomial Formulation}

 Further, define $\rpoly(X_1,\ldots, X_\numTup)$ as the reduced version of $\poly(X_1,\ldots, X_\numTup)$, of the form
 \[\rpoly(X_1,\ldots, X_\numTup) = \poly(X_1,\ldots, X_\numTup) \mod \wbit_1^2-\wbit\cdots\mod \wbit_\numTup^2 - \wbit_\numTup.\]  

+\OK{Shouldn't it be $\wbit_1^2 - \wbit_1$ (missing a subscript)?
+This definition of $\rpoly$ may be inappropriately concise, as it doesn't (I think?) get across the
+``expanded SoP form'' constraint.  
+Also, one way to establish a preliminary intuition for $\rpoly$ might be to associate it with idempotent multiplication operations --- it's the simplest sum-of-products representation of $\poly$ that is equivalent under an idempotent $\otimes$.
+}
+
 Intuitively, $\rpoly(\textbf{X})$ is the expanded sum of products form of $\poly(\textbf{X})$ such that if any $X_j$ term  has an exponent $e > 1$, it is reduced to $1$, i.e. $X_j^e\mapsto X_j$ for any $e > 1$.  The usefulness of this reduction will be seen shortly.
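
 As a small illustration (the polynomial here is an assumed example, not one drawn from a particular query), take $\numTup = 2$ and $\poly(X_1, X_2) = (X_1 + X_2)^2$.  Expanding into sum of products form first and then reducing exponents gives
 \begin{align*}
 \poly(X_1, X_2) &= X_1^2 + 2X_1X_2 + X_2^2,\\
 \rpoly(X_1, X_2) &= X_1 + 2X_1X_2 + X_2.
 \end{align*}
 The two agree on every point of $\{0, 1\}^2$; for instance, at $(1, 1)$ both evaluate to $4$.  Dropping the exponent without expanding would instead yield $X_1 + X_2$, which evaluates to $2$ at $(1, 1)$, so the expansion step appears to be essential.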

 \begin{Lemma}\label{lem:pre-poly-rpoly}
@ -19,6 +27,8 @@ First, note the following fact:
 For all $b \in \{0, 1\}$ and all $e \geq 1$, $b^e = b$.\qed
 \end{proof}

+\OK{Might help to emphasize the Sum of Products constraint.}
+

 \begin{Lemma}\label{lem:exp-poly-rpoly}
 The expectation of a possible world in $\poly$ is equal to $\rpoly(\prob_1,\ldots, \prob_\numTup)$.
@ -44,7 +54,17 @@ Then for expectation we have
 &= \rpoly(\prob_1,\ldots, \prob_\numTup)\label{p1-s5}
 \end{align}

-In steps \cref{p1-s1} and \cref{p1-s2}, by linearity of expectation, the expecation can be pushed all the way inside of the product.  In \cref{p1-s3}, note that $w_i \in \{0, 1\}$ which further implies that for any exponent $e \geq 1$, $w_i^e = w_i$.  Next, by definition of TIDB, in \cref{p1-s4} the expectation of a tuple across all possible worlds is indeed its probability.  Finally, observe \cref{p1-s5} by construction in \cref{lem:pre-poly-rpoly}, that $\rpoly(\prob_1,\ldots, \prob_\numTup)$ is exactly the product of probabilities of each variable in each monomial across the entire sum.
+In steps \cref{p1-s1} and \cref{p1-s2}, by linearity of expectation, the expectation can be pushed all the way inside of the product.  In \cref{p1-s3}, note that $w_i \in \{0, 1\}$ which further implies that for any exponent $e \geq 1$, $w_i^e = w_i$.  Next, by definition of TIDB, in \cref{p1-s4} the expectation of a tuple across all possible worlds is indeed its probability.  
+
+\OK{
+	You don't need to tie this to TI-DBs if you define the variables ($X_i$) to be independent.  
+	Annotations 
+	Boolean expressions over uncorrelated boolean variables are sufficient to model TI-, BI-, and
+	PC-Tables.  This should still hold for arithmetic over the naturals.
+}
+
+
+Finally, observe \cref{p1-s5} by construction in \cref{lem:pre-poly-rpoly}, that $\rpoly(\prob_1,\ldots, \prob_\numTup)$ is exactly the product of probabilities of each variable in each monomial across the entire sum.

 \qed
 \end{proof}
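
 As a sanity check of \cref{lem:exp-poly-rpoly}, consider the assumed example $\poly(X_1, X_2) = (X_1 + X_2)^2$ with $\numTup = 2$, for which $\rpoly(X_1, X_2) = X_1 + 2X_1X_2 + X_2$, and suppose $w_1$ and $w_2$ are independent, with $w_i = 1$ occurring with probability $\prob_i$.  Summing over the four possible worlds,
 \begin{align*}
 \expct_{\wVec}\pbox{\poly(\wVec)} &= 0\cdot(1 - \prob_1)(1 - \prob_2) + 1\cdot\prob_1(1 - \prob_2) + 1\cdot(1 - \prob_1)\prob_2 + 4\cdot\prob_1\prob_2\\
 &= \prob_1 + \prob_2 + 2\prob_1\prob_2\\
 &= \rpoly(\prob_1, \prob_2).
 \end{align*}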
@ -77,6 +97,11 @@ First, let us do a warm-up by computing $\rpoly(\wElem_1,\dots, \wElem_\numTup)$

 \AH{We need to make a decision on subgraph notation, and number of occurrences notation.  Waiting to hear back from Oliver before making a decision.}

+\OK{
+	I'm not sure what I can add.  The existing notation is fine (for now).  I would suggest adding
+	a definition table.
+}
+
 \begin{Claim}
 We can compute $\rpoly_2$ in $O(m)$ time.
 \end{Claim}
--- a/ra-to-poly.tex
+++ b/ra-to-poly.tex
@ -1,4 +1,5 @@
 %root: main.tex
+%!TEX root=./main.tex

 \section{Query translation into polynomials}
 \AH{This section will involve the set of queries (RA+) that we are interested in, the probabilistic/incomplete models we address, and the outer aggregate functions we perform over the output \textit{annotation}
@ -7,13 +8,28 @@
 3) How queries translate into polynomials
 }

-Given tables $\rel, \reli$, an arbitrary query $\query(\rel)$ over the positive relational operators (SPJU), abusing notation slightly denote the query polynomial as $\poly(X_1,\ldots, X_\numTup)$.  To be clear,  $\poly(X_1,\ldots, X_\numTup)$ is a polynomial whose variables represent the tuple annotations of an arbitrary query.The annotation for arbitrary tuple $\tup$ can be viewed as an element of the image of $\rel$, where relation $\rel$ can be thought of as a function with preimage of all tuples in $\rel$, such that $\rel(\tup) = \poly(X_1,\ldots, X_\numTup)$.  Further, it is known that the algebraic semiring structure aptly models the translation and computation of query operations into tuple annotation, aka polynomials.  
+Given tables $\rel, \reli$ and an arbitrary query $\query(\rel)$ over the positive relational operators (SPJU), we abuse notation slightly and denote the query polynomial by $\poly(X_1,\ldots, X_\numTup)$.  
+\OK{
+  Eventually, you probably want a little more background here, depending on the query notation you choose to use.  The simplest approach would be basing it on the Green et al. Provenance Semirings paper.  As we discussed, that would make $\query(\mathcal D)(t)$ the query polynomial.
+}
+
+To be clear,  $\poly(X_1,\ldots, X_\numTup)$ is a polynomial whose variables represent the tuple annotations of an arbitrary query.
+\OK{
+  I don't think we're on the same page here.  From the Prov. Semirings perspective, the entire $\poly(X_i)$ is the annotation of a tuple in an arbitrary query over a $\mathbb R[x]$-relation (i.e., a relation whose tuples are annotated by polynomials over the reals).  The $X_i$s are not annotations, they're the variables of that polynomial.  (footnote: Presumably, there are tuples in the database whose annotations are just a single variable, but that's not the general case).
+}
+
+
+The annotation for arbitrary tuple $\tup$ can be viewed as an element of the image of $\rel$, where relation $\rel$ can be thought of as a function with preimage of all tuples in $\rel$, such that $\rel(\tup) = \poly(X_1,\ldots, X_\numTup)$.  Further, it is known that the algebraic semiring structure aptly models the translation and computation of query operations into tuple annotation, aka polynomials.  
 To make things more concrete, consider the $\{\mathbb{N}, \times, +, 1, 0\}$ bag semiring.  Here the set in which the tuple annotations (computed polynomials) exist is the natural numbers.  Query operations are translated into one of the two semiring operators, with $\project$ and $\union$ of agreeing tuples being the equivalent of the $+$ operator in polynomial $\poly$, $\join$ translating into the $\times$ operator, and finally, $\select$ is better modeled as a function that returns either $\rel(\tup)$ or $0$ based on some predicate.

+\OK{
+  A good summary to start.  We'll need to make this more precise for the final paper though.
+}
+
 Consider the translation of relational operators to polynomial operators in greater detail.

 \begin{align*}
-&\project_A(\rel)(\tup) = &&\sum_{\tup' s.t. \tup[A] = \tup'} \rel(\tup')\\
+&\project_A(\rel)(\tup) = &&\sum_{\tup' s.t. \tup'[A] = \tup} \rel(\tup')\\
 & (\rel_1 \union \rel_2)(\tup) = &&\rel_1(\tup) + \rel_2(\tup)\\
 &(\rel_1 \join_\theta \rel_2)(\tup) = &&\begin{cases}
 						\rel_1(\tup_1) \times \rel_2(\tup_2)	&\text{if }\theta(\tup_1, \tup_2)\\
@ -26,6 +42,7 @@ Consider the translation of relational operators to polynomial operators in grea
 \end{align*}
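
 To illustrate these rules on an assumed toy instance (the relations, tuples, and variable assignments here are hypothetical), let $\rel_1$ over attributes $(A, B)$ contain the tuples $(a, b_1)$ and $(a, b_2)$, annotated $X_1$ and $X_2$ respectively, and let $\rel_2$ over attribute $(A)$ contain the tuple $(a)$, annotated $X_3$.  Projection sums the annotations of the tuples that agree on $A$:
 \[\project_A(\rel_1)((a)) = \rel_1((a, b_1)) + \rel_1((a, b_2)) = X_1 + X_2.\]
 Taking $\theta$ to be equality on $A$, and $\tup$ to be the output tuple built from $(a)$ in $\project_A(\rel_1)$ and $(a)$ in $\rel_2$, the join multiplies the two annotations:
 \[(\project_A(\rel_1) \join_\theta \rel_2)(\tup) = (X_1 + X_2) \times X_3 = X_1X_3 + X_2X_3.\]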

 Considering probabilistic databases, let $\prob(\wVec)$ denote the probability that a given world occurs.
+\OK{Might help to more precisely define $\wVec$ and its relation to the $X_i$s}
 The desired output is the expectation of the tuple annotation, i.e. of the polynomial $\poly(X_1,\ldots, X_\numTup)$, over all possible worlds:
 \[\expct_{\wVec}\pbox{\poly(\wVec)} = \sum\limits_{\wVec \in \{0, 1\}^\numTup} \poly(\wVec)\cdot \prob(\wVec).\]

@ -35,5 +52,7 @@ There are features of $\ti$ that we can exploit.  Note that a $\ti$ naturally ha

 \[\expct_{\wVec}\pbox{\poly(\wVec)} = \sum\limits_{\wVec \in \{0, 1\}^\numTup} \poly(\wVec)\prod_{\substack{i \in [\numTup]\\ s.t. \wElem_i = 1}}\prob_i \prod_{\substack{i \in [\numTup]\\s.t. \wElem_i = 0}}\left(1 - \prob_i\right).\]

-
+\OK{ 
+It would, again, be helpful here to have an explicitly stated mapping between $\wVec$ and the $X_i$s
+}