%\AH{This section will involve the set of queries (RA+) that we are interested in, the probabilistic/incomplete models we address, and the outer aggregate functions we perform over the output \textit{annotation}
An incomplete database $\idb$ is a set of deterministic databases $\db_i$, each of which is called a possible world. Since $\idb$ models all possible worlds of an uncertain database, every $\db_i \in\idb$ has the same named set of relations $\{\rel_1,\ldots, \rel_n\}$ (whose contents may differ from world to world), and the schema $\sch(\rel_i)$ of each relation is the same in every $\db_j$. Let $\wSet$ denote the set of possible worlds, i.e., the set of all $\db_i \in\idb$, and fix an injective mapping from $\wSet$ to $\{0, 1\}^M$, so that each vector $\vct{w}\in\{0, 1\}^M$ is the image of at most one $\db_i \in\idb$. When $\idb$ is a probabilistic database, it can be viewed as a pair $(\wSet, \pd)$, where $\wSet$ is the set of possible worlds as above and $\pd$ is a probability distribution over $\wSet$.
%Below may possibly need to be used again...we'll see.
%probability space $\left(\Omega, \mathcal{A}, P\right)$ over that set. \AR{I'm not sure why you are using the notation $\mathcal{A}$ and $P$, which you do not seem to use beyond this section. I would recommend that you only introduce a notation if you plan to use them later on.} Since the set of possible outcomes is the set of possible worlds, $\wSet$, and the set of outcomes is equivalent to the set of events, we will simplify notation and use $\left(\wSet, P\right)$ to denote the probability space of $\idb$. \AR{If you want to use $(\wSet,P)$ make sure you use the same notation in Sec 1.3 as well. If not, then use the notation from Sec 1.3 here}
Further define $\idb$ as an $\mathbb{N}[\vct{X}]$ database,\AR{There is a type error here: $\idb$ has already been defined as a PDB-- while here we are talking about an annotated DB: they are technically not the same thing so you cannot use the same notation. $\idb$ is used heavily in this sub-section so this change needs to be propagated. Am not sure if there is a standard notation-- if not $D(\vct{X})$ should work fine.} i.e., an incomplete/probabilistic database model in which each tuple $\tup\in\idb$ is annotated with a polynomial over the variables $X_1,\ldots, X_M$, for a value of $M$ that will be specified later. Intuitively, one can think of $\idb$ as a parameterized database whose abstract form maps to each deterministic $\db_i \in\idb$.\AR{There is no need to connect back to possible world etc. in this sub-section.}
Since $\idb$ is a database that maps tuples to polynomials, it is customary to view an arbitrary table $\rel$ as a function $\rel: \tup\in\idb\mapsto\mathbb{N}[\vct{X}]$,\AR{function notation is always a map from domain to range. Also you need a notation for set of all tuples.} where $\rel(\tup)$ denotes the polynomial assigned to tuple $\tup$.
Previous work has shown that commutative semirings precisely model the translation of RA+ query operations into operations on annotations. Since $\idb$ is an $\mathbb{N}[\vct{X}]$ database, we work with the commutative semiring $\{\mathbb{N}[\vct{X}], +, \times, 0, 1\}$. %, where $\mathbb{N}[\vct{X}]$ is the set from which all annotations originate.
\AR{You should have the base case of the reduction explicitly stated as well-- i.e. what the poly of a tuple is. Also, in the RHS of the equality should also have the evaluation notation. Finally why is the join not just the product of $R_1(t)$ and $R_2(t)$, or more precisely $\llbracket R_1\rrbracket(t)\times\llbracket R_2\rrbracket(t)$?}
Each query operation is translated into semiring operations: $\project$ and $\union$ of agreeing tuples correspond to the $+$ operator in the polynomial $\poly$, $\join$ corresponds to the $\times$ operator, and $\select$ is better modeled as a function that returns either $\rel(\tup)$ or $0$, depending on whether $\tup$ satisfies the selection predicate.
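As a small illustration of these rules (the relation contents here are hypothetical, chosen only to exercise the translation), suppose $\rel_1$ contains two tuples annotated $X_1$ and $X_2$ that agree on the join attribute, and $\rel_2$ contains a single matching tuple annotated $X_3$. The $\join$ produces two tuples annotated $X_1 X_3$ and $X_2 X_3$; a subsequent $\project$ onto attributes on which these two tuples agree merges them into one output tuple whose annotation is the sum,
\[
X_1 X_3 + X_2 X_3 = (X_1 + X_2)\, X_3 .
\]
A $\select$ whose predicate this output tuple fails would replace the annotation by $0$.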
Assume a bijective mapping between the polynomial variables $X_1,\ldots, X_M$ and the bit positions of vectors in $\{0, 1\}^M$. In the general case, the binary value of $\vct{w}$ uniquely identifies a possible world. For example, in the Tuple Independent Database $(\ti)$ data model, $\numTup$ tuples yield $2^\numTup$ possible worlds, so $\numTup= M$ and every $\vct{w}\in\{0, 1\}^M$ is indeed a possible world. In the Block Independent Disjoint data model, however, the disjointness condition on tuples within the same block means that not every $\vct{w}\in\{0, 1\}^M$ is a possible world. Let $\rw$ denote a random world. Provided that $\pd[\rw=\vct{w}]=0$ for every $\vct{w}\in\{0, 1\}^M$ that is not a possible world, a probability distribution over $\{0, 1\}^M$ induces a distribution over $\wSet$, which we have already denoted $\pd$.
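To make the Block Independent Disjoint case concrete, consider a hypothetical database consisting of a single block with two mutually exclusive tuples, associated with $X_1$ and $X_2$, so that $M = 2$ and at most one of the two tuples may appear. Under this assumption the vector $(1,1)$ is not a possible world, and the condition above reads
\[
\pd[\rw = (1,1)] = 0, \qquad \pd[\rw = (0,0)] + \pd[\rw = (1,0)] + \pd[\rw = (0,1)] = 1 .
\]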
%This could be a way to think of world binary vectors in the general case
%Let $\vct{w}$ be a $\left\lceil\log_2\left(\left|\wSet\right|\right)\right\rceil = \numTup$ binary bit vector, uniquely identifying possible world $\db_i \in \idb$.
Since $\llbracket\poly(\rel)\rrbracket(\tup)$ can be viewed as a function $\{0, 1\}^M \to\mathbb{N}$, and since from this point on our discussion involves a single polynomial for an arbitrary tuple $\tup$, we abuse notation and write $\poly(\vct{w})$ for $\llbracket\poly(\rel)\rrbracket(\tup)(\vct{w})$.
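As a quick check of this notation, take the hypothetical polynomial $\poly = X_1 X_3 + X_2 X_3$ over $M = 3$ variables (an illustrative choice, not tied to any particular query in this section). Substituting the bits of $\vct{w}$ for the corresponding variables gives
\[
\poly\big((1,0,1)\big) = 1, \qquad \poly\big((1,1,1)\big) = 2, \qquad \poly\big((1,1,0)\big) = 0 ,
\]
which shows that $\poly(\vct{w})$ takes values in $\mathbb{N}$ rather than only in $\{0, 1\}$.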
One of the aggregates we wish to compute over the annotation polynomial is its expectation, denoted,
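Under the conventions above, and assuming the distribution $\pd$ over $\{0, 1\}^M$ assigns probability $0$ to every vector that is not a possible world, one natural way to write this expectation is the following sketch, which is the usual definition of an expected value and is stated here as an assumption about the intended form:
\[
\expct\limits_{\vct{w}\sim\pd}\left[\poly(\vct{w})\right] = \sum_{\vct{w}\in\{0, 1\}^M} \poly(\vct{w})\cdot\pd[\rw=\vct{w}] .
\]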
\AH{With our notation, I no longer think that $\vct{w}\sim\pd$ is necessary footer for $\expct$. We can probably just have $\expct\limits_{\vct{w}}$ instead. Do you agree?}
$\ti$ is a database model in which each table is a set of tuples that are independent of one another, each occurring with its own probability $\prob_\tup$.
There are features of $\ti$ that we can exploit. Because of independence, a $\ti$ with $\numTup$ tuples has $2^\numTup$ possible worlds, each of which can be conveniently modeled by an $\numTup$-bit string. Since the powerset of $[\numTup]$ is exactly $\wSet$, the bit string $\vct{w}$ can be used as an index that determines which tuples are present in world $\vct{w}$. Given a vector $\vct{p}$ of length $\numTup$ whose $i^{th}$ element $\prob_i$ is the probability of the $i^{th}$ tuple, we can then write an equivalent expectation for $\ti$ models,
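Writing $w_i$ for the $i^{th}$ bit of $\vct{w}$ (a symbol introduced here only for illustration), independence means the probability of a world factors into one term per tuple, so a sketch of the resulting form, assuming $M = \numTup$ as in the $\ti$ discussion above, is
\[
\expct\limits_{\vct{w}\sim\pd}\left[\poly(\vct{w})\right] = \sum_{\vct{w}\in\{0, 1\}^\numTup} \poly(\vct{w}) \prod_{i \,:\, w_i = 1} \prob_i \prod_{i \,:\, w_i = 0} \left(1 - \prob_i\right) .
\]
For instance, with $\numTup = 2$ the world $\vct{w} = (1, 0)$ contributes $\poly\big((1,0)\big)\cdot\prob_1\left(1 - \prob_2\right)$ to the sum.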