We first formally define circuits, an encoding of polynomials that we use throughout the paper. Since we work exclusively with \emph{lineage} circuits, we drop the qualifier and refer to them simply as circuits.
For illustrative purposes consider the polynomial $\poly(\vct{X})=2X^2+3XY -2Y^2$ over $\vct{X}=[X, Y]$.
We represent query polynomials via {\em arithmetic circuits}~\cite{arith-complexity}, a standard way to represent polynomials over fields (particularly in the field of algebraic complexity) that we use for polynomials over $\mathbb N$ in the obvious way.
A circuit $\circuit$ is a Directed Acyclic Graph (DAG) whose source gates (in-degree $0$) are elements of either $\domN$ or $\vct{X}$. The internal gates and the (single) sink gate of $\circuit$ (the sink corresponding to the result tuple $t$) have binary input and are either sum ($\circplus$) or product ($\circmult$) gates.
Each node in a circuit $\circuit$ has the following members: \type, \val, \vpartial, \vari{input}, \degval, \vari{Lweight}, and \vari{Rweight}, where \type is the type of value stored in the gate (one of $\{\circplus, \circmult, \var, \tnum\}$), \val is the value stored (a constant or variable), and \vari{input} is the list of the gate's inputs. We use $\circuit_\linput$ to denote the left input and $\circuit_\rinput$ the right input of the sink of circuit $\circuit$.
When the underlying DAG is a tree (with edges pointing towards the root), we will refer to the structure as an expression tree \etree. Note that in such a case, the root of \etree is analogous to the sink of \circuit.
As stated in \Cref{def:circuit}, every internal node has at most two in-edges, is labeled as an addition or a multiplication node, and has no limit on its outdegree.
We ignore the fields \vpartial, \vari{Lweight}, and \vari{Rweight} until \Cref{sec:algo}.\AH{We omit degree here too, which \emph{I think} is used only in appendix proofs.}
The circuit \circuit in \Cref{fig:circuit-express-tree} encodes the polynomial $XY + WZ$. Note that circuit \circuit encodes a tree, with edges pointing towards the root.
Let $\polyf(\circuit)$ denote the polynomial encoded by circuit $\circuit$. The function $\polyf(\cdot)$ is defined recursively on $\circuit$ as follows, with addition and multiplication following the standard interpretation for polynomials:
\begin{equation*}
\polyf(\circuit) = \begin{cases}
\polyf(\circuit_\linput) + \polyf(\circuit_\rinput) &\text{ if \circuit.\type} = \circplus\\
\polyf(\circuit_\linput) \cdot\polyf(\circuit_\rinput) &\text{ if \circuit.\type} = \circmult\\
\circuit.\val&\text{ if \circuit.\type} = \var\text{ OR }\tnum.
\end{cases}
\end{equation*}
Note that $\circuit$ need not encode an expression in $\abbrSMB$. For instance, $\circuit$ could represent a compressed form of the running example, such as $(X +2Y)(2X - Y)$, as shown in \Cref{fig:circuit}, while $\polyf(\circuit)=2X^2+3XY-2Y^2$.\footnote{As stated previously, unless otherwise mentioned all polynomials are considered in the $\abbrSMB$ representation, and this implies that the output of $\polyf\inparen{\cdot}$ is indeed $\abbrSMB$.}
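
To illustrate, the recursive definition of $\polyf$ can be unfolded on this compressed circuit. As a sketch, assume for illustration that the sink of $\circuit$ is a $\circmult$ gate whose left input $\circuit_\linput$ encodes $X+2Y$ and whose right input $\circuit_\rinput$ encodes $2X-Y$ (the left/right assignment is our assumption; swapping the two inputs does not change the result). Then
\begin{align*}
\polyf(\circuit) &= \polyf(\circuit_\linput)\cdot\polyf(\circuit_\rinput)\\
&= (X+2Y)(2X-Y)\\
&= 2X^2 - XY + 4XY - 2Y^2\\
&= 2X^2+3XY-2Y^2,
\end{align*}
which is the $\abbrSMB$ representation of the running example, even though $\circuit$ itself stores only the two factors.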
$\circuitset{\polyX}$ is the set of all possible circuits $\circuit$ such that $\polyf(\circuit)=\polyX$.\footnote{Again, the representation of $\polyX$ is $\abbrSMB$.}
The circuit in \Cref{fig:circuit} is an element of $\circuitset{2X^2+3XY-2Y^2}$. One can think of $\circuitset{\polyX}$ as the infinite set of circuits, each of which encodes a polynomial equal to $\polyX$ when represented in $\abbrSMB$.
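
One way to see that $\circuitset{\polyX}$ is infinite can be sketched as follows, assuming $1\in\domN$ so that the constant $1$ is a valid source gate. For any $\circuit\in\circuitset{\polyX}$, let $\circuit'$ (a name introduced here only for illustration) be the circuit whose sink is a new $\circmult$ gate with $\circuit'_\linput=\circuit$ and with $\circuit'_\rinput$ the source gate holding the constant $1$. Then
\begin{equation*}
\polyf(\circuit') = \polyf(\circuit'_\linput)\cdot\polyf(\circuit'_\rinput) = \polyf(\circuit)\cdot 1 = \polyX,
\end{equation*}
so $\circuit'\in\circuitset{\polyX}$ as well. Since $\circuit'$ has strictly more gates than $\circuit$, repeating this construction yields infinitely many distinct circuits in $\circuitset{\polyX}$.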
Let $\vct{X}=(X_1, \ldots, X_n)$, and $\pxdb$ be an $\semNX$-PDB over $\vct{X}$ with probability distribution $\pd$ over assignments $\vct{X}\to\{0,1\}$, $\query$ an $n$-ary query, and $t$ an $n$-ary tuple.