More work on lemmas 3, 4, and lin sys.

master
Aaron Huber 2020-12-04 13:14:12 -05:00
parent 12c65e6301
commit aaf920e9a3
7 changed files with 654 additions and 657 deletions


@ -1,3 +1,9 @@
%root: main.tex
Abstract here.
\begin{abstract}
\AH{High-level intuition}
Computing the expected multiplicity of an output tuple in a probabilistic database (PDB) is widely regarded as easy. Most modern PDB implementations represent tuple lineage in its expanded form, in which case the computation is linear in the size of the lineage: when the lineage is uncompressed, linearity of expectation allows the expectation to be pushed through the sum.
\AH{Low-level why-would-an-expert-read-this}
However, when we consider compressed representations of the tuple lineage, the complexity landscape changes. If we use a lineage computed over a factorized database, we find in general that computation time is not linear in the size of the compressed lineage.
\AH{Key technical contributions}
This work shows theoretically that bags are not easy in general: for compressed lineage forms, the computation can take greater than linear time. It is therefore desirable to have an algorithm that approximates the expected multiplicity in linear time. We introduce such an algorithm and give theoretical guarantees on its efficiency and accuracy. It then follows that approximating a tuple's expected multiplicity on a bag PDB has the same complexity as deterministic query processing.
\end{abstract}

190
lin_sys.tex Normal file

@ -0,0 +1,190 @@
%root: main.tex
\subsection{Developing a Linear System}
\AH{The changes in ~\cref{eq:2pd-3d} have been propagated 110420. Barring any errors, everything should be updated and correct.}
\begin{proof}[Proof of Lemma \ref{lem:lin-sys}]
Our goal is to build a linear system $M \cdot (x~y~z)^T = \vct{b}$ such that, indexing from $1$, the $i^{\text{th}}$ row of $M$ corresponds to the RHS of \cref{eq:LS-subtract} for $\graph{i}$ \textit{in} terms of $\graph{1}$. Analogously, the $i^{\text{th}}$ entry of the vector $\vct{b}$ holds the terms computable in $O(\numedge)$ time for $\graph{i}$, corresponding to the LHS of \cref{eq:LS-subtract}. Lemma~\ref{lem:qE3-exp} gives the identity for $\rpoly_{G}(\prob,\ldots, \prob)$ when $\poly_{G}(\vct{X}) = q_E(X_1,\ldots, X_\numvar)^3$, and using
~\cref{eq:LS-subtract}, $\vct{b}[1] = \frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob - \big(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\big)\prob^2$.
As previously outlined, assume graph $\graph{1}$ to be an arbitrary graph, with $\graph{2}, \graph{3}$ constructed from $\graph{1}$ as defined in \cref{def:Gk}.
\subsubsection{$\graph{2}$}
Let $\linsys{2}$ denote the linear equation for graph $\graph{2}$. Starting from the hard-to-compute terms on the RHS of \cref{eq:LS-subtract}, we have
\begin{align}
& \numocc{\graph{2}}{\tri} + \numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)\nonumber\\
= &\numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)\label{eq:ls-2-1}\\
= &2 \cdot \numocc{\graph{1}}{\twopath}\prob - \pbrace{8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis} + 4 \cdot \numocc{\graph{1}}{\oneint} + 4 \cdot \numocc{\graph{1}}{\threepath} + 2 \cdot \numocc{\graph{1}}{\tri}}\left(3\prob^2 - \prob^3\right)\label{eq:ls-2-2}\\
= &\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - \prob^3\right) + 2\cdot\numocc{\graph{1}}{\twopath}\prob - 4\cdot\numocc{\graph{1}}{\oneint}\cdot\left(3\prob^2 - \prob^3\right).\label{eq:ls-2-3}
\end{align}
%define $\linsys{2} = \numocc{\graph{2}}{\tri} + \numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)$. By \cref{claim:four-two} we can compute $\linsys{2}$ in $O(T(\numedge) + \numedge)$ time with $\numedge = |E_2|$, and more generally, $\numedge = |E_k|$ for a graph $\graph{k}$.
Equation ~\ref{eq:ls-2-1} follows by \cref{lem:tri}. Similarly ~\cref{eq:ls-2-2} follows by both \cref{lem:3m-G2} and \cref{lem:3p-G2}. Finally, ~\cref{eq:ls-2-3} follows by a simple rearrangement of terms.
Now, by simple algebraic manipulations of ~\cref{eq:LS-subtract}, we deduce,
\begin{align}
&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2\nonumber\\
&\qquad\qquad =\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - \prob^3\right) + 2\cdot\numocc{\graph{1}}{\twopath}\prob - 4\cdot\numocc{\graph{1}}{\oneint}\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-1}\\
&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2 - 2\cdot\numocc{\graph{1}}{\twopath}\prob\nonumber\\
&\qquad + 4\cdot\numocc{\graph{1}}{\oneint}\left(3\prob^2 - \prob^3\right)\nonumber\\
&\qquad\qquad =\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-2}\\
&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2 - 2\cdot\numocc{\graph{1}}{\twopath}\prob\nonumber\\
&\qquad + \left(4\cdot\numocc{\graph{1}}{\oneint}+ 6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\right)\left(3\prob^2 - \prob^3\right)\nonumber\\
&\qquad\qquad =\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} + 10\cdot\numocc{\graph{1}}{\threedis}\right)\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-3}
\end{align}
Equation~\ref{eq:lem3-G2-1} follows by substituting \cref{eq:ls-2-3} for the RHS. We then arrive at \cref{eq:lem3-G2-2} by adding the additive inverse of the last two terms of \cref{eq:ls-2-3} to both sides. Finally, we arrive at \cref{eq:lem3-G2-3} by adding the $O(\numedge)$ computable term (by \cref{eq:2pd-3d}) $6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\left(3\prob^2 - \prob^3\right)$ to both sides.
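As a sanity check on the last step, the coefficient of $\numocc{\graph{1}}{\threedis}$ in \cref{eq:lem3-G2-3} can be recomputed directly. The display below only restates the arithmetic above, taken inside the common factor $\left(3\prob^2 - \prob^3\right)$:
\begin{align*}
&\left(-8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right) + 6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\\
&\qquad = (-8 + 18)\cdot\numocc{\graph{1}}{\threedis} + (-6 + 6)\cdot\numocc{\graph{1}}{\twopathdis} = 10\cdot\numocc{\graph{1}}{\threedis}.
\end{align*}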
Denote the matrix of the linear system as $\mtrix{\rpoly_{G}}$, where $\mtrix{\rpoly_{G}}[i]$ is the $i^{\text{th}}$ row of $\mtrix{\rpoly_{G}}$. From ~\cref{eq:lem3-G2-3} it follows that
\[\mtrix{\rpoly_{G}}[2] = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} + 10 \cdot \numocc{\graph{1}}{\threedis}\right)\cdot \left(3\prob^2 - \prob^3\right)\]
and
%By \cref{lem:tri}, the first term of $\linsys{2}$ is $0$, and then $\linsys{2} = \numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)$.
%
%Replace the next term with the identity of \cref{lem:3p-G2} and the last term with the identity of \cref{lem:3m-G2},
%\begin{equation*}
%\linsys{2} = 2 \cdot \numocc{\graph{1}}{\twopath}\prob - \pbrace{8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis} + 4 \cdot \numocc{\graph{1}}{\oneint} + 4 \cdot \numocc{\graph{1}}{\threepath} + 2 \cdot \numocc{\graph{1}}{\tri}}\left(3\prob^2 - \prob^3\right).
%\end{equation*}
%Rearrange terms into groups of those patterns that are 'hard' to compute and those that can be computed in $O(\numedge)$,
%\begin{equation*}
%\linsys{2} = -\pbrace{2 \cdot \numocc{\graph{1}}{\tri} + 4 \cdot \numocc{\graph{1}}{\threepath} + \left(8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis}\right)}\left(3\prob^2 - \prob^3\right) + \pbrace{2 \cdot \numocc{\graph{1}}{\twopath}\prob - 4 \cdot \numocc{\graph{1}}{\oneint}\left(3\prob^2 - \prob^3\right)}.
%\end{equation*}
%
%Note that there are terms computable in $O(\numedge)$ time which can be subtracted from $\linsys{2}$ and added to the other side of \cref{eq:LS-subtract}, i.e., $\vct{b}[2]$. This leaves us with
%\begin{align}
%&\linsys{2} = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} - 2 \cdot \numocc{\graph{1}}{\threedis} - 4\cdot \numocc{\graph{1}}{\twopathdis}\right) \cdot \left(3\prob^2 - \prob^3\right)\label{eq:LS-G2'}\\
%&\linsys{2} = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} - 2 \cdot \numocc{\graph{1}}{\threedis} + 12 \cdot \numocc{\graph{1}}{\threedis}\right)\cdot \left(3\prob^2 - \prob^3\right)\label{eq:LS-G2'-1}\\
%&\linsys{2} = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} + 10 \cdot \numocc{\graph{1}}{\threedis}\right)\cdot \left(3\prob^2 - \prob^3\right)\label{eq:LS-G2'-2}
%\end{align}
%
%Equation ~\ref{eq:LS-G2'} is the result of collecting $2\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right)$ and moving them to the other side. Then ~\cref{eq:LS-G2'-1} results from adding $4\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right)$ to both sides. Equation ~\ref{eq:LS-G2'-2} is the result of simplifying terms.
%
%For the left hand side, following the above steps, we obtain
\begin{align*}
\vct{b}[2] &= \frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \left(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\right)\prob^2\\
&- 2\cdot \numocc{\graph{1}}{\twopath}\prob + \left(4\cdot\numocc{\graph{1}}{\oneint}+ 6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\right)\left(3\prob^2 - \prob^3\right).
\end{align*}
We now have a linear equation in terms of $\graph{1}$ for $\graph{2}$. Note that by \cref{eq:2pd-3d}, any term of the form $c \cdot \left(\numocc{\graph{i}}{\twopathdis} + 3\cdot \numocc{\graph{i}}{\threedis}\right)$, for a constant $c$, is computable in linear time. By \cref{eq:1e}, \cref{eq:2p}, \cref{eq:2m}, and \cref{eq:3s}, the same is true for $\numocc{\graph{i}}{\ed}$, $\numocc{\graph{i}}{\twopath}$, $\numocc{\graph{i}}{\twodis}$, and $\numocc{\graph{i}}{\oneint}$, respectively.
\subsubsection{$\graph{3}$}
Following the same reasoning for $\graph{3}$, using \cref{lem:3m-G3}, \cref{lem:3p-G3}, and \cref{lem:tri}, starting with the RHS of ~\cref{eq:LS-subtract}, we derive
\begin{align}
&\numocc{\graph{3}}{\tri} + \numocc{\graph{3}}{\threepath}\prob - \numocc{\graph{3}}{\threedis}\left(3\prob^2 - \prob^3\right)\nonumber\\
=& \pbrace{\numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}}\prob - \left\{4 \cdot \numocc{\graph{1}}{\twopath} + 6 \cdot \numocc{\graph{1}}{\twodis} + 18 \cdot \numocc{\graph{1}}{\tri} + 21 \cdot \numocc{\graph{1}}{\threepath} + 24 \cdot \numocc{\graph{1}}{\twopathdis} +\right.\nonumber\\
&\left.20 \cdot \numocc{\graph{1}}{\oneint} + 27 \cdot \numocc{\graph{1}}{\threedis}\right\}\left(3\prob^2 - \prob^3\right)\label{eq:LS-G3-sub}\\
=&\pbrace{ -18\numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} - 24 \cdot \numocc{\graph{1}}{\twopathdis} - 27 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right) \nonumber\\
&+ \pbrace{-20 \cdot \numocc{\graph{1}}{\oneint} - 4\cdot \numocc{\graph{1}}{\twopath} - 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)+ \numocc{\graph{1}}{\ed}\prob + 2 \cdot \numocc{\graph{1}}{\twopath}\prob. \label{eq:lem3-G3-1}
\end{align}
Looking at ~\cref{eq:LS-subtract},
\begin{align}
&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2\nonumber\\
&\qquad\qquad= \pbrace{ -18\numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} - 24 \cdot \numocc{\graph{1}}{\twopathdis} - 27 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right) \nonumber\\
&\qquad\qquad\qquad+ \pbrace{-20 \cdot \numocc{\graph{1}}{\oneint} - 4\cdot \numocc{\graph{1}}{\twopath} - 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)+ \numocc{\graph{1}}{\ed}\prob + 2 \cdot \numocc{\graph{1}}{\twopath}\prob. \label{eq:lem3-G3-2}\\
&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \left(\numocc{\graph{1}}{\ed} + 2\cdot\numocc{\graph{1}}{\twopath}\right)\prob\nonumber\\
&\qquad + \left(24\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right) + 20\cdot\numocc{\graph{1}}{\oneint} + 4\cdot\numocc{\graph{1}}{\twopath} + 6\cdot\numocc{\graph{1}}{\twodis}\right)\left(3\prob^2 - \prob^3\right)\nonumber\\
&\qquad\qquad = \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G3-3}
\end{align}
Equation~\ref{eq:LS-G3-sub} follows from simple substitution of the identities in \cref{lem:3m-G3}, \cref{lem:3p-G3}, and \cref{lem:tri}. We then get \cref{eq:lem3-G3-1} by simply rearranging the operands.
Equation~\ref{eq:lem3-G3-2} follows from substituting \cref{eq:lem3-G3-1} in for the RHS of \cref{eq:LS-subtract}. We derive \cref{eq:lem3-G3-3} by adding the additive inverse of all $O(\numedge)$ computable terms to both sides; for the case of $\twopathdis$ and $\threedis$, we add the $O(\numedge)$ computable term $24\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\left(3\prob^2 - \prob^3\right)$ to both sides.
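As in the derivation for $\graph{2}$, the coefficient of $\numocc{\graph{1}}{\threedis}$ in \cref{eq:lem3-G3-3} can be checked directly. The display below merely restates the arithmetic of that step, taken inside the common factor $\left(3\prob^2 - \prob^3\right)$:
\begin{align*}
&\left(-27\cdot\numocc{\graph{1}}{\threedis} - 24\cdot\numocc{\graph{1}}{\twopathdis}\right) + 24\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\\
&\qquad = (-27 + 72)\cdot\numocc{\graph{1}}{\threedis} + (-24 + 24)\cdot\numocc{\graph{1}}{\twopathdis} = 45\cdot\numocc{\graph{1}}{\threedis}.
\end{align*}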
It then follows that
%Removing $O(\numedge)$ computable terms to the other side of \cref{eq:LS-subtract}, we get
\begin{equation}
\mtrix{\rpoly_{G}}[3] = \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right)\label{eq:LS-G3'}
\end{equation}
and
%The same justification for the derivation of $\linsys{2}$ applies to the derivation above of $\linsys{3}$. To arrive at ~\cref{eq:LS-G3'}, we move $O(\numedge)$ computable terms to the left hand side. For the term $-24\cdot\numocc{\graph{1}}{\twopathdis}$ we need to add the inverse to both sides AND $72\cdot\numocc{\graph{1}}{\threedis}$ to both sides, in order to satisfy the constraint of $\cref{eq:2pd-3d}$.
%
%For the LHS we get
\begin{align*}
\vct{b}[3] =& \frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \pbrace{\numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}}\prob\\
& + \pbrace{24 \cdot \left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right) + 20 \cdot \numocc{\graph{1}}{\oneint} + 4\cdot \numocc{\graph{1}}{\twopath} + 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)
\end{align*}
We now have a linear system consisting of three linear equations, for $\graph{1}, \graph{2}, \graph{3}$, each in terms of $\graph{1}$. Note that the coefficients for $\graph{1}$ follow the RHS of \cref{eq:LS-subtract}. For readability, let $x = \numocc{\graph{1}}{\tri}, y = \numocc{\graph{1}}{\threepath}, z = \numocc{\graph{1}}{\threedis}$. Using $\linsys{2}$ and $\linsys{3}$, we obtain the matrix
\[ \mtrix{\rpoly} = \begin{pmatrix}
1 & \prob & -(3\prob^2 - \prob^3)\\
-2(3\prob^2 - \prob^3) & -4(3\prob^2 - \prob^3) & 10(3\prob^2 - \prob^3)\\
-18(3\prob^2 - \prob^3) & -21(3\prob^2 - \prob^3) & 45(3\prob^2 - \prob^3)
\end{pmatrix},\]
and the following linear equation
\begin{equation}
\mtrix{\rpoly}\cdot (x~ y~ z~)^T = \vct{b}(\graph{1}).
\end{equation}
\AR{
Also the top right entry should be $-(p^2-p^3)$-- the negative sign is missing. This changes the rest of the calculations and has to be propagated. If my calculations are correct the final polynomial should be $-30p^2(1-p)^2(1-p-p^2+p^3)$. This still has no root in $(0,1)$}
\AH{While propagating changes in ~\cref{eq:2pd-3d}, I noticed and corrected some errors, most notably, that for pulling out the \textbf{$a^2$} factor as described next, I hadn't squared it. That has been addressed. 110220}
Now we seek to show that all rows of the system are indeed linearly independent.
The method of minors can be used to compute the determinant, $\dtrm{\mtrix{\rpoly}}$.
We also make use of the fact that for a $2 \times 2$ matrix with rows $(ab, ac)$ and $(ad, ae)$, the determinant is $a^2be - a^2cd = a^2(be - cd)$.
\begin{equation*}
\begin{vmatrix}
1 & \prob & -(3\prob^2 - \prob^3)\\
-2(3\prob^2 - \prob^3) & -4(3\prob^2 - \prob^3) & 10(3\prob^2 - \prob^3)\\
-18(3\prob^2 - \prob^3) & -21(3\prob^2 - \prob^3) & 45(3\prob^2 - \prob^3)
\end{vmatrix}
= (3\prob^2 - \prob^3)^2 \cdot
\begin{vmatrix}
-4 & 10\\
-21 & 45
\end{vmatrix}
~ - ~ \prob(3\prob^2 - \prob^3)^2~ \cdot
\begin{vmatrix}
-2 & 10\\
-18 & 45
\end{vmatrix}
+ \left(- ~(3\prob^2 - \prob^3)^3\right)~ \cdot
\begin{vmatrix}
-2 & -4\\
-18 & -21
\end{vmatrix}.
\end{equation*}
Compute each RHS term starting with the left and working to the right,
\begin{equation}
(3\prob^2 - \prob^3)^2\cdot \left((-4 \cdot 45) - (-21 \cdot 10)\right) = (3\prob^2 - \prob^3)^2\cdot(-180 + 210) = 30(3\prob^2 - \prob^3)^2.\label{eq:det-1}
\end{equation}
The middle term then is
\begin{equation}
-\prob(3\prob^2 - \prob^3)^2 \cdot \left((-2 \cdot 45) - (-18 \cdot 10)\right) = -\prob(3\prob^2 - \prob^3)^2 \cdot (-90 + 180) = -90\prob(3\prob^2 - \prob^3)^2.\label{eq:det-2}
\end{equation}
Finally, the rightmost term,
\begin{equation}
-\left(3\prob^2 - \prob^3\right)^3 \cdot \left((-2 \cdot -21) - (-18 \cdot -4)\right) = -\left(3\prob^2 - \prob^3\right)^3 \cdot (42 - 72) = 30\left(3\prob^2 - \prob^3\right)^3.\label{eq:det-3}
\end{equation}
Putting \cref{eq:det-1}, \cref{eq:det-2}, \cref{eq:det-3} together, we have,
\begin{align}
\dtrm{\mtrix{\rpoly}} =& 30(3\prob^2 - \prob^3)^2 - 90\prob(3\prob^2 - \prob^3)^2 +30(3\prob^2 - \prob^3)^3\nonumber\\
=& 30(3\prob^2 - \prob^3)^2\left(1 - 3\prob + (3\prob^2 - \prob^3)\right)\nonumber\\
=& 30\left(9\prob^4 - 6\prob^5 + \prob^6\right)\left(-\prob^3 + 3\prob^2 - 3\prob + 1\right)\nonumber\\
=&\left(30\prob^6 - 180\prob^5 + 270\prob^4\right)\cdot\left(-\prob^3 + 3\prob^2 - 3\prob + 1\right).\label{eq:det-final}
\end{align}
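For reference, the factors of \cref{eq:det-final} can be made explicit. This is a worked restatement, using only the identities $3\prob^2 - \prob^3 = \prob^2(3 - \prob)$ and $-\prob^3 + 3\prob^2 - 3\prob + 1 = (1 - \prob)^3$:
\begin{equation*}
\dtrm{\mtrix{\rpoly}} = 30\,\prob^4(3 - \prob)^2(1 - \prob)^3.
\end{equation*}
In this form the only real roots are $\prob = 0$, $\prob = 1$, and $\prob = 3$, and every factor is strictly positive for $\prob \in (0, 1)$. For instance, at $\prob = \frac{1}{2}$ the determinant evaluates to $30 \cdot \frac{1}{16} \cdot \frac{25}{4} \cdot \frac{1}{8} = \frac{375}{256}$.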
\AH{It appears that the equation below has roots at p = 0 (left factor) and p = 1, with NO roots $\in (0, 1)$.}
%Equation \cref{eq:det-final} has no roots in $(0, 1)$.
\AH{I need to understand how lemma ~\ref{lem:lin-sys} follows.}
\end{proof}\AH{End proof of Lemma \ref{lem:lin-sys}}
\qed
Thus, we have proved \cref{lem:const-p} for fixed $\prob \in (0, 1)$.
\end{proof}
\qed


@ -133,6 +133,8 @@ sensitive=true
\begin{document}
\input{abstract}
\lstset{language=sql}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -160,10 +162,13 @@ sensitive=true
%\input{pos}
%\input{sop}
%\input{davidscheme}
\input{abstract}
\input{intro}
\input{ra-to-poly}
\input{poly-form}
\input{mult_distinct_p}
\input{single_p}
\input{lin_sys}
\input{approx_alg}
\input{bi_cancellation}

156
mult_distinct_p.tex Normal file

@ -0,0 +1,156 @@
%root:main.tex
\subsection{When $\poly$ is not in sum of monomials form}
We would like to argue in the general case that $\expct_{\vct{w}}\pbox{\poly(\vct{w})}$ cannot be computed in linear time.
To this end, consider the following graph $G(V, E)$, where $|E| = \numedge$, $|V| = \numvar$, and $i, j \in [\numvar]$.
Before proceeding, let us list all possible edge patterns in an arbitrary $G$ consisting of $\leq 3$ distinct edges.
\begin{itemize}
\item Single Edge $\left(\ed\right)$
\item 2-path ($\twopath$)
\item 2-matching ($\twodis$)
\item Triangle ($\tri$)
\item 3-path ($\threepath$)
\item 3-star ($\oneint$): the graph that results when all three edges share exactly one common endpoint, while the remaining endpoints of the three edges are pairwise distinct.
\item Disjoint Two-Path ($\twopathdis$): this subgraph consists of a $2$-path and a remaining disjoint edge.
\item 3-matching ($\threedis$): this subgraph is composed of three pairwise disjoint edges.
\end{itemize}
Let $\numocc{G}{H}$ denote the number of occurrences of pattern $H$ in graph $G$, where, for example, $\numocc{G}{\ed}$ means the number of single edges in $G$.
For any graph $G$, the following formulas compute $\numocc{G}{H}$ for their respective patterns in $O(\numedge)$ time, with $d_i$ representing the degree of vertex $i$.
\begin{align}
&\numocc{G}{\ed} = \numedge, \label{eq:1e}\\
&\numocc{G}{\twopath} = \sum_{i \in V} \binom{d_i}{2}, \label{eq:2p}\\
&\numocc{G}{\twodis} = \sum_{(i, j) \in E} \frac{\numedge - d_i - d_j + 1}{2},\label{eq:2m}\\
&\numocc{G}{\oneint} = \sum_{i \in V} \binom{d_i}{3},\label{eq:3s}\\
&\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis} = \sum_{(i, j) \in E} \binom{\numedge - d_i - d_j + 1}{2}.\label{eq:2pd-3d}
\end{align}
A quick argument for why \cref{eq:2m} is true: for an edge $(i, j)$ connecting arbitrary vertices $i$ and $j$, finding all other edges in $G$ disjoint from $(i, j)$ is equivalent to finding all edges that are incident to neither vertex $i$ nor vertex $j$. The number of such edges is $\numedge - d_i - d_j + 1$, where we add $1$ since edge $(i, j)$ is removed twice when subtracting both $d_i$ and $d_j$. Since the summation iterates over all edges, a pair $\left((i, j), (k, \ell)\right)$ is also counted as $\left((k, \ell), (i, j)\right)$; division by $2$ eliminates this double counting.
Equation~\ref{eq:2pd-3d} is true for similar reasons. For an edge $(i, j)$, we need two additional edges, each disjoint from $(i, j)$, which may or may not share an endpoint with each other. As in \cref{eq:2m}, once the number of edges disjoint from $(i, j)$ has been computed, we only need to count all combinations of two edges from this set. The factor $3$ on $\threedis$ accounts for the triple counting of $3$-matchings, since each of the three edges of a $3$-matching serves once as $(i, j)$. In contrast, each occurrence of $\twopathdis$ is counted exactly once: the two edges of its $2$-path share an endpoint, so neither of them can serve as $(i, j)$, and only the disjoint edge does. The sum over all such edge combinations is then precisely $\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}$.
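As an illustration (a worked example with a hypothetical graph, not taken from a result above), let $G$ be the path on vertices $1,\ldots,5$ with edges $(1,2), (2,3), (3,4), (4,5)$, so that $\numedge = 4$, $d_1 = d_5 = 1$, and $d_2 = d_3 = d_4 = 2$. The number of edges disjoint from each edge, $\numedge - d_i - d_j + 1$, is $2, 1, 1, 2$ respectively. The formulas then give
\begin{align*}
&\numocc{G}{\ed} = 4, \qquad \numocc{G}{\twopath} = 3\binom{2}{2} = 3, \qquad \numocc{G}{\oneint} = 0,\\
&\numocc{G}{\twodis} = \frac{2 + 1 + 1 + 2}{2} = 3,\\
&\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis} = \binom{2}{2} + \binom{1}{2} + \binom{1}{2} + \binom{2}{2} = 2.
\end{align*}
These values agree with direct enumeration: the $2$-matchings are $\{(1,2),(3,4)\}$, $\{(1,2),(4,5)\}$, and $\{(2,3),(4,5)\}$; the occurrences of $\twopathdis$ are the edge $(1,2)$ with the $2$-path $(3,4),(4,5)$ and the edge $(4,5)$ with the $2$-path $(1,2),(2,3)$; and $G$ has no $3$-matching. To see the factor $3$ at work, take instead a graph consisting of three pairwise disjoint edges: then $\numedge = 3$, every degree is $1$, each edge contributes $\binom{3 - 1 - 1 + 1}{2} = 1$, and the sum is $3 = 0 + 3 \cdot 1$.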
Now consider the query $\poly_{G}(\vct{X}) = q_E(X_1,\ldots, X_\numvar) = \sum\limits_{(i, j) \in E} X_i \cdot X_j$. For the following discussion, set $\poly_{G}^3(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^3$.
%Original lemma proving the exact coefficient terms in qE3
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{Lemma}\label{lem:qE3-exp}
%When we expand $\poly_{G}(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^3$ out and assign all exponents $e \geq 1$ a value of $1$, we have the following,
% \begin{align}
% &\rpoly_{G}(\prob,\ldots, \prob) = \numocc{G}{\ed}\prob^2 + 6\numocc{G}{\twopath}\prob^3 + 6\numocc{G}{\twodis} + 6\numocc{G}{\tri}\prob^3 + 6\numocc{G}{\oneint}\prob^4 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6.\label{claim:four-one}
% \end{align}
%\end{Lemma}
%
%\begin{proof}[Proof of \cref{lem:qE3-exp}]
%By definition we have that
% \[\poly_{G}(\vct{X}) = \sum_{\substack{(i_1, j_1),\\ (i_2, j_2),\\ (i_3, j_3) \in E}} \prod_{\ell = 1}^{3}X_{i_\ell}X_{j_\ell}.\]
% Rather than list all the expressions in full detail, let us make some observations regarding the sum. Let $e_1 = (i_1, j_1), e_2 = (i_2, j_2), e_3 = (i_3, j_3)$. Notice that each expression in the sum consists of a triple $(e_1, e_2, e_3)$. There are three forms the triple $(e_1, e_2, e_3)$ can take.
%
%\textsc{case 1:} $e_1 = e_2 = e_3$, where all edges are the same. There are exactly $\numedge$ such triples, each with a $\prob^2$ factor in $\rpoly_{G}\left(\prob_1,\ldots, \prob_\numvar\right)$.
%
%\textsc{case 2:} This case occurs when there are two distinct edges of the three, call them $e$ and $e'$. When there are two distinct edges, there is then the occurence when $2$ variables in the triple $(e_1, e_2, e_3)$ are bound to $e$. There are three combinations for this occurrence. It is the analogue for when there is only one occurrence of $e$, i.e. $2$ of the variables in $(e_1, e_2, e_3)$ are $e'$. Again, there are three combinations for this. All $3 + 3 = 6$ combinations of two distinct values consist of the same monomial in $\rpoly$, i.e. $(e_1, e_1, e_2)$ is the same as $(e_2, e_1, e_2)$. This case produces the following edge patterns: $\twopath, \twodis$.
%
%\textsc{case 3:} $e_1 \neq e_2 \neq e_3$, i.e., when all edges are distinct. For this case, we have $3! = 6$ permutations of $(e_1, e_2, e_3)$. This case consists of the following edge patterns: $\tri, \oneint, \threepath, \twopathdis, \threedis$.
%\end{proof}
%\qed
\begin{Lemma}\label{lem:alt-qE3-exp}
Given polynomial $\poly_{G}^3(\prob,\ldots, \prob)$, we can write $\rpoly_{G}^3$ as $\rpoly_{G}^3(\prob,\ldots, \prob) = \sum\limits_{i = 0}^6 c_i \cdot \prob^i$ for some fixed coefficients $\vct{c}$. Moreover, given the value of $\rpoly_{G}^3(\prob,\ldots, \prob)$ at seven distinct values of $\prob$, one can compute each $c_i$ in $\vct{c}$ exactly.
\end{Lemma}
\begin{proof}[Proof of ~\cref{lem:alt-qE3-exp}]
By definition, a polynomial consists of coefficients multiplied by products of variables. In our case, one can readily expand the cubed expression by performing the $\numedge^3$ product operations, yielding the polynomial in sum-of-products form. By definition, $\rpoly_{G}^3$ reduces all variable exponents greater than $1$ to $1$. Thus, a monomial such as $X_i^3X_j^3$ becomes $X_iX_j$ in $\rpoly_{G}^3$, and its value after substitution is $\prob_i\cdot \prob_j = \prob^2$. Further, the number of terms in the sum is no greater than $7$: each edge has two endpoints, and the largest number of endpoints occurs when we have $3$ distinct edges with no shared endpoints, a case corresponding to $\prob^6$.
Given $7$ distinct values of $\prob$ as in the lemma statement, we obtain $7$ distinct linear equations in the unknowns $c_0, \ldots, c_6$. By definition of the summation, the coefficient matrix of these seven equations is a Vandermonde matrix, which has full rank because the $\prob$ values are distinct. We can therefore solve the linear system to determine $\vct{c}$ exactly.
\end{proof}
\qed
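Concretely, writing $\prob_0, \ldots, \prob_6$ for the seven distinct values (notation introduced here for illustration only), the linear system used in the proof of \cref{lem:alt-qE3-exp} has the form
\begin{equation*}
\begin{pmatrix}
1 & \prob_0 & \prob_0^2 & \cdots & \prob_0^6\\
1 & \prob_1 & \prob_1^2 & \cdots & \prob_1^6\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
1 & \prob_6 & \prob_6^2 & \cdots & \prob_6^6
\end{pmatrix}
\begin{pmatrix} c_0\\ c_1\\ \vdots\\ c_6 \end{pmatrix}
=
\begin{pmatrix} \rpoly_{G}^3(\prob_0,\ldots, \prob_0)\\ \rpoly_{G}^3(\prob_1,\ldots, \prob_1)\\ \vdots\\ \rpoly_{G}^3(\prob_6,\ldots, \prob_6) \end{pmatrix},
\end{equation*}
and the determinant of the coefficient matrix is $\prod_{0 \leq a < b \leq 6}(\prob_b - \prob_a)$, which is nonzero exactly when the $\prob_a$ are pairwise distinct.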
\begin{Lemma}\label{lem:alt-qE3-exp-3-match}
The number of terms in the expansion of $\poly_{G}^3(\vct{X})$ that correspond to $3$-matchings is exactly $6\cdot\numocc{G}{\threedis}$.
\end{Lemma}
\begin{proof}
A $3$-matching occurs when there are $3$ edges, $e_1, e_2, e_3$, such that all of them are pairwise disjoint, i.e., no two of them share an endpoint. In the expansion of $\poly_{G}^3(\vct{X})$ there are $3$ choices in the first factor to select an edge of a given $3$-matching, only $2$ choices in the second factor, and so on, yielding $3! = 6$ terms in the expansion for an arbitrary $3$-matching.
Thus, the product $6\cdot\numocc{G}{\threedis}$ is the exact number of terms corresponding to $3$-matchings in the expansion of $\poly_{G}^3(\vct{X})$.
\end{proof}
\qed
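For example (an illustration with hypothetical vertex labels), suppose $e_1 = (1, 2)$, $e_2 = (3, 4)$, and $e_3 = (5, 6)$ form a $3$-matching in $G$. The ordered triples, one edge drawn from each of the three factors of the cube of $q_E$, that use each of these edges exactly once are
\begin{equation*}
(e_1, e_2, e_3),\ (e_1, e_3, e_2),\ (e_2, e_1, e_3),\ (e_2, e_3, e_1),\ (e_3, e_1, e_2),\ (e_3, e_2, e_1),
\end{equation*}
and each produces the same monomial $X_1X_2X_3X_4X_5X_6$. This matching therefore contributes $6$ to the coefficient of that monomial, and the monomial evaluates to $\prob^6$ under the substitution $X_i = \prob$.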
\begin{Corollary}\label{cor:lem-alt-qE3}
One can compute $\numocc{G}{\threedis}$ in $\query_{G}(\vct{X})$ exactly.
\end{Corollary}
\begin{proof}[Proof for Corollary ~\ref{cor:lem-alt-qE3}]
By \cref{lem:alt-qE3-exp}, the term $c_6$ can be exactly computed. By \cref{lem:alt-qE3-exp-3-match}, we know that $c_6 = 6\cdot\numocc{G}{\threedis}$, so dividing $c_6$ by the factor $6$ yields exactly $\numocc{G}{\threedis}$.
\end{proof}
\qed
\begin{Lemma}\label{lem:alt-qEk}
Given $2k + 1$ distinct $\prob$ values and $\poly_{G}^k(\prob,\ldots, \prob)$, one can compute the number of $k$-matchings exactly.
\end{Lemma}
\begin{proof}[Proof for Lemma ~\ref{lem:alt-qEk}]
By the same logic as \cref{lem:alt-qE3-exp}, the largest power of $\prob$ that can occur is $\prob^{2k}$, so there are $2k + 1$ powers $\prob^i$, for $i$ in $[0, 2k]$. This, combined with $2k + 1$ distinct $\prob$ values, yields a Vandermonde matrix with full rank, and thus all the values $c_i$ in $\vct{c}$ can be computed exactly. Finally, along the same lines as \cref{lem:alt-qE3-exp-3-match}, dividing $c_{2k}$ by $k!$ yields the desired result, $\numocc{G}{k\text{-matching}}$. This holds because, first, only a $k$-matching can have a $\prob^{2k}$ factor, and, second, in a $k$-fold product there are $k$ choices in the first factor, $k - 1$ choices in the second factor, and so on, yielding $k!$ copies of each $k$-matching.
\AH{Any suggestions for a better notation/representation of k-matching??}
\end{proof}
\qed
\begin{Corollary}\label{cor:reduct}
Since Lemmas~\ref{lem:alt-qE3-exp} and~\ref{lem:alt-qEk} hold, it follows that computing $\rpoly(\vct{X})$ is hard.
\end{Corollary}
%Old proof
%%%%%%%%%%%%%%%%%%%%%
%Notice that ~\cref{lem:qE3-exp} is an example of a query that reduces to the hard problems in graph theory of counting triangles, three-matchings, three-paths, etc. Thus, in general, computing $\expct_{\vct{w}}\pbox{\poly(\vct{w})} = \rpoly\left(\prob_1,\ldots, \prob_\numvar\right)$ is a hard problem.
%
%\begin{Claim}\label{claim:four-two}
% If one can compute $\rpoly_{G}(\prob,\ldots, \prob)$ in time T(\numedge), then we can compute the following in O(T(\numedge) + \numedge):
%\[\numocc{G}{\tri} + \numocc{G}{\threepath} \cdot \prob - \numocc{G}{\threedis}\cdot(3\prob^2 - \prob^3).\]
%\end{Claim}
%\begin{proof}[Proof of Claim \ref{claim:four-two}]
%%We have shown that the following subgraph cardinalities can be computed in $O(\numedge)$ time:
%%\[\numocc{G}{\ed}, \numocc{G}{\twopath}, \numocc{G}{\twodis}, \numocc{G}{\oneint}, \numocc{G}{\twopathdis} + \numocc{G}{\threedis}.\]
%It has already been shown previously that $\numocc{G}{\ed}, \numocc{G}{\twopath}, \numocc{G}{\twodis},$ and $\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}$ can be computed in $O(\numedge)$ time.
%
%Using the result of \cref{lem:qE3-exp}, let us show a derivation to the identity of the consequent in \cref{claim:four-two}.
%
%All of \cref{eq:1e}, \cref{eq:2p}, \cref{eq:2m}, \cref{eq:3s}, \cref{eq:2pd-3d} show that we can compute the respective edge patterns in $O(\numedge)$ time. Rearrange ~\cref{claim:four-one}, $\rpoly_{G}$, with all linear time computations on one side, leaving only the hard computations,
%\begin{align}
%&\rpoly_{G}(\prob,\ldots, \prob) = \numocc{G}{\ed}\prob^2 + 6\numocc{G}{\twopath}\prob^3 + 6\numocc{G}{\twodis}\prob^4 + 6\numocc{G}{\oneint}\prob^4 + 6\numocc{G}{\tri}\prob^3 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6\nonumber\\
%&\rpoly_{G}(\prob,\ldots, \prob) - \numocc{G}{\ed}\prob^2 - 6\numocc{G}{\twopath}\prob^3 - 6\numocc{G}{\twodis}\prob^4 - 6\numocc{G}{\oneint}\prob^4 = 6\numocc{G}{\tri}\prob^3 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6\label{eq:LS-rearrange}\\
%&\frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob = \numocc{G}{\tri} + \numocc{G}{\threepath}\prob + \numocc{G}{\twopathdis}\prob^2 + \numocc{G}{\threedis}\prob^3\label{eq:LS-reduce}\\
%&\frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob - \big(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\big)\prob^2 = \numocc{G}{\tri} + \numocc{G}{\threepath}\prob - \numocc{G}{\threedis}\left(3\prob^2 - \prob^3\right)\label{eq:LS-subtract}
%\end{align}
%
%\cref{eq:LS-rearrange} is the result of simply subtracting from both sides terms that have $O(\numedge)$ complexity. Dividing all terms by the common factor of $6\prob^3$ gives \cref{eq:LS-reduce}. Equation ~\ref{eq:LS-subtract}, is the result of subtracting the $O(\numedge)$ computable term $\left(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\right)\prob^2$ from both sides.
%
%%\begin{equation}
%%\frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob - \big(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\big)\prob^2 = \numocc{G}{\tri} + \numocc{G}{\threepath}\prob - \numocc{G}{\threedis}\left(3\prob^2 - \prob^3\right)
%%\end{equation}
%
%
%The implication in \cref{claim:four-two} follows by the above and \cref{lem:qE3-exp}.
%\end{proof}
%\qed
%
%\begin{Lemma}\label{lem:gen-p}
%If we can compute $\rpoly_{G}(\vct{X})$ in $T(\numedge)$ time for $O(1)$ distinct values $\vct{\prob}$ such that all $\prob_i = \prob$ for all $i \in [\numvar], \prob_i \in \vct{\prob}$, then we can count the number of triangles, 3-paths, and 3-matchings in $G$ in $T(\numedge) + O(\numedge)$ time.
%\end{Lemma}
%
%\begin{proof}[Proof of \cref{lem:gen-p}]
%
%\cref{claim:four-two} says that if we know $\rpoly_{G}(\prob,\ldots, \prob)$, then we can know in O(\numedge) additional time
%\[\numocc{G}{\tri} + \numocc{G}{\threepath} \cdot \prob - \numocc{G}{\threedis}\cdot(3\prob^2 - \prob^3).\] We can think of each term in the above equation as a variable, where one can solve a linear system given 3 distinct $\prob$ values, assuming independence of the three linear equations. In the worst case, without independence, 4 distinct values of $\prob$ would suffice. This follows from the fact that the corresponding coefficient matrix is the so called Vandermonde matrix, which has full rank
%\end{proof}
%\qed


@ -1,7 +1,7 @@
%root: main.tex
%!TEX root = ./main.tex
%\onecolumn
\section{Polynomial Formulation}
\section{Polynomial Formulation and Equivalences}
Before proceeding, note that the following is assuming $\ti$s in the setting of \textit{bag} semantics.
@ -108,657 +108,13 @@ Finally, observe \cref{p1-s5} by construction in \cref{lem:pre-poly-rpoly}, that
\end{proof}
\begin{Corollary}\label{cor:expct-sop}
If $\poly$ is given as a sum of monomials, the expectation of $\poly$, i.e., $\expct{\poly} = \rpoly\left(\prob_1,\ldots, \prob_\numvar\right)$ can be computed in $O(|\poly|)$, where $|\poly|$ denotes the total number of multiplication/addition operators.
If $\poly$ is given as a sum of monomials, the expectation of $\poly$, i.e., $\expct\pbox{\poly} = \rpoly\left(\prob_1,\ldots, \prob_\numvar\right)$ can be computed in $O(|\poly|)$, where $|\poly|$ denotes the total number of multiplication/addition operators.
\end{Corollary}
\begin{proof}[Proof For Corollary ~\ref{cor:expct-sop}]
Note that \cref{lem:exp-poly-rpoly} shows that $\expct{\poly} = \rpoly(\prob_1,\ldots, \prob_\numvar)$. Therefore, if $\poly$ is already in sum of products form, one only needs to compute $\poly(\prob_1,\ldots, \prob_\numvar)$ ignoring exponent terms (note that such a polynomial is $\rpoly(\prob_1,\ldots, \prob_\numvar)$), which indeed takes $O(|\poly|)$ computations.\qed
Note that \cref{lem:exp-poly-rpoly} shows that $\expct\pbox{\poly} = \rpoly(\prob_1,\ldots, \prob_\numvar)$. Therefore, if $\poly$ is already in sum of products form, one only needs to compute $\poly(\prob_1,\ldots, \prob_\numvar)$ ignoring exponent terms (note that such a polynomial is $\rpoly(\prob_1,\ldots, \prob_\numvar)$), which indeed takes $O(|\poly|)$ computations.\qed
\end{proof}
\subsection{When $\poly$ is not in sum of monomials form}
We would like to argue in the general case that $\expct_{\vct{w}}\pbox{\poly(\vct{w})}$ cannot be computed in linear time.
To this end, consider the following graph $G(V, E)$, where $|E| = \numedge$, $|V| = \numvar$, and $i, j \in [\numvar]$.
Before proceeding, let us list all possible edge patterns in an arbitrary $G$ consisting of $\leq 3$ distinct edges.
\begin{itemize}
\item Single Edge $\left(\ed\right)$
\item 2-path ($\twopath$)
\item 2-matching ($\twodis$)
\item Triangle ($\tri$)
\item 3-path ($\threepath$)
\item 3-star ($\oneint$)--this is the graph that results when all three edges share exactly one common endpoint. The remaining endpoint for each edge is disconnected from any endpoint of the three edges.
\item Disjoint Two-Path ($\twopathdis$)--this subgraph consists of a two path and a remaining disjoint edge.
\item 3-matching ($\threedis$)--this subgraph is composed of three disjoint edges.
\end{itemize}
Let $\numocc{G}{H}$ denote the number of occurrences of pattern $H$ in graph $G$, where, for example, $\numocc{G}{\ed}$ means the number of single edges in $G$.
For any graph $G$, the following formulas compute $\numocc{G}{H}$ for their respective patterns in $O(\numedge)$ time, with $d_i$ representing the degree of vertex $i$.
\begin{align}
&\numocc{G}{\ed} = \numedge, \label{eq:1e}\\
&\numocc{G}{\twopath} = \sum_{i \in V} \binom{d_i}{2} \label{eq:2p}\\
&\numocc{G}{\twodis} = \sum_{(i, j) \in E} \frac{\numedge - d_i - d_j + 1}{2}\label{eq:2m}\\%\binom{\numedge - d_i - d_j + 1}{2}\label{eq:2m}\\
&\numocc{G}{\oneint} = \sum_{i \in V} \binom{d_i}{3}\label{eq:3s}\\
&\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis} = \sum_{(i, j) \in E} \binom{\numedge - d_i - d_j + 1}{2}\label{eq:2pd-3d}
\end{align}
A quick argument for why \cref{eq:2m} holds: for an edge $(i, j)$ connecting arbitrary vertices $i$ and $j$, finding all other edges in $G$ disjoint from $(i, j)$ is equivalent to finding all edges that are incident to neither vertex $i$ nor vertex $j$. The number of such edges is $\numedge - d_i - d_j + 1$, where we add $1$ since edge $(i, j)$ is removed twice when subtracting both $d_i$ and $d_j$. Since the summation iterates over all edges, a pair $\left((i, j), (k, \ell)\right)$ is also counted as $\left((k, \ell), (i, j)\right)$; division by $2$ eliminates this double counting.
\AH{The formula ~\cref{eq:2pd-3d} has been fixed to reflect the triple counting of 3-matchings. Notice the factor of 3 on the right term (3-matchings) in the LHS. 110220}
Equation ~\ref{eq:2pd-3d} is true for similar reasons. For edge $(i, j)$, it is necessary to find two additional edges, disjoint or connected. As in ~\cref{eq:2m}, once the number of edges disjoint from $(i, j)$ has been computed, we only need to consider all possible combinations of two edges from the set of disjoint edges, since it does not matter whether the two edges are connected or not. Note that the factor $3$ of $\threedis$ is necessary to account for the triple counting of $3$-matchings: each of the three edges of a $3$-matching serves once as the edge $(i, j)$. In contrast, each occurrence of $\twopathdis$ is counted exactly once, namely when $(i, j)$ is its disjoint edge, because the two edges of its $2$-path are not disjoint from one another and so neither can play the role of $(i, j)$. The sum over all such edge combinations is then precisely $\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}$.
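As an illustrative check of \cref{eq:2p}, \cref{eq:2m}, and \cref{eq:2pd-3d} (the example graph and the name $P$ are our own choice and are not used elsewhere), let $P$ be the path with edges $e_1 = (1, 2), e_2 = (2, 3), \ldots, e_5 = (5, 6)$, so that $\numedge = 5$, $d_1 = d_6 = 1$, and $d_2 = \cdots = d_5 = 2$. The quantity $\numedge - d_i - d_j + 1$ takes the values $3, 2, 2, 2, 3$ on $e_1,\ldots, e_5$, and therefore
\begin{align*}
\numocc{P}{\twopath} &= \sum_{i \in V} \binom{d_i}{2} = 4 \cdot \binom{2}{2} = 4,\\
\numocc{P}{\twodis} &= \frac{3 + 2 + 2 + 2 + 3}{2} = 6,\\
\numocc{P}{\twopathdis} + 3\numocc{P}{\threedis} &= \binom{3}{2} + \binom{2}{2} + \binom{2}{2} + \binom{2}{2} + \binom{3}{2} = 9.
\end{align*}
These values agree with a direct count: the $\binom{5}{2} = 10$ edge pairs split into $4$ two-paths and $6$ two-matchings, the only $3$-matching is $\pbrace{e_1, e_3, e_5}$, and there are $6$ occurrences of $\twopathdis$, giving $6 + 3 \cdot 1 = 9$.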
Now consider the query $\poly_{G}(\vct{X}) = q_E(X_1,\ldots, X_\numvar) = \sum\limits_{(i, j) \in E} X_i \cdot X_j$. For the following discussion, set $\poly_{G}^3(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^3$.
%Original lemma proving the exact coefficient terms in qE3
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{Lemma}\label{lem:qE3-exp}
%When we expand $\poly_{G}(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^3$ out and assign all exponents $e \geq 1$ a value of $1$, we have the following,
% \begin{align}
% &\rpoly_{G}(\prob,\ldots, \prob) = \numocc{G}{\ed}\prob^2 + 6\numocc{G}{\twopath}\prob^3 + 6\numocc{G}{\twodis} + 6\numocc{G}{\tri}\prob^3 + 6\numocc{G}{\oneint}\prob^4 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6.\label{claim:four-one}
% \end{align}
%\end{Lemma}
%
%\begin{proof}[Proof of \cref{lem:qE3-exp}]
%By definition we have that
% \[\poly_{G}(\vct{X}) = \sum_{\substack{(i_1, j_1),\\ (i_2, j_2),\\ (i_3, j_3) \in E}} \prod_{\ell = 1}^{3}X_{i_\ell}X_{j_\ell}.\]
% Rather than list all the expressions in full detail, let us make some observations regarding the sum. Let $e_1 = (i_1, j_1), e_2 = (i_2, j_2), e_3 = (i_3, j_3)$. Notice that each expression in the sum consists of a triple $(e_1, e_2, e_3)$. There are three forms the triple $(e_1, e_2, e_3)$ can take.
%
%\textsc{case 1:} $e_1 = e_2 = e_3$, where all edges are the same. There are exactly $\numedge$ such triples, each with a $\prob^2$ factor in $\rpoly_{G}\left(\prob_1,\ldots, \prob_\numvar\right)$.
%
%\textsc{case 2:} This case occurs when there are two distinct edges of the three, call them $e$ and $e'$. When there are two distinct edges, there is then the occurence when $2$ variables in the triple $(e_1, e_2, e_3)$ are bound to $e$. There are three combinations for this occurrence. It is the analogue for when there is only one occurrence of $e$, i.e. $2$ of the variables in $(e_1, e_2, e_3)$ are $e'$. Again, there are three combinations for this. All $3 + 3 = 6$ combinations of two distinct values consist of the same monomial in $\rpoly$, i.e. $(e_1, e_1, e_2)$ is the same as $(e_2, e_1, e_2)$. This case produces the following edge patterns: $\twopath, \twodis$.
%
%\textsc{case 3:} $e_1 \neq e_2 \neq e_3$, i.e., when all edges are distinct. For this case, we have $3! = 6$ permutations of $(e_1, e_2, e_3)$. This case consists of the following edge patterns: $\tri, \oneint, \threepath, \twopathdis, \threedis$.
%\end{proof}
%\qed
\begin{Lemma}\label{lem:alt-qE3-exp}
Given polynomial $\poly_{G}(\prob,\ldots, \prob) = \sum\limits_{i = 0}^6 c_i \cdot \prob^i$ for some fixed terms $\vct{c}$ and seven distinct $\prob$ values, one can compute each $c_i$ in $\vct{c}$ exactly.
\end{Lemma}
\begin{proof}[Proof of ~\cref{lem:alt-qE3-exp} Alternate Version]
By definition, a polynomial consists of coefficients multiplied by powers of its variables. In our case, one can readily expand the cubed expression by performing the necessary $27$ product operations, yielding the polynomial in the sum of products form in the lemma statement. Further, the number of terms in the sum is at most $7$: each edge has two endpoints, so a product of three edges contains the most distinct variables, namely six, when the three edges are distinct and share no endpoints. Hence no power of $\prob$ above $\prob^6$ appears.
Since the lemma statement provides $7$ distinct values of $\prob$, evaluating the polynomial at each of them gives $7$ linear equations in the unknowns $c_0,\ldots, c_6$. The coefficient matrix of these equations is a Vandermonde matrix, which has full rank whenever the evaluation points are distinct, so we can solve the linear system to determine $\vct{c}$ exactly.
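For concreteness, the system can be written out as follows, where $\prob^{(1)},\ldots, \prob^{(7)}$ is our own (assumed) notation for the seven distinct values and $v_a = \poly_{G}(\prob^{(a)},\ldots, \prob^{(a)})$ denotes the corresponding evaluation:
\[
\begin{pmatrix}
1 & \prob^{(1)} & \left(\prob^{(1)}\right)^2 & \cdots & \left(\prob^{(1)}\right)^6\\
1 & \prob^{(2)} & \left(\prob^{(2)}\right)^2 & \cdots & \left(\prob^{(2)}\right)^6\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
1 & \prob^{(7)} & \left(\prob^{(7)}\right)^2 & \cdots & \left(\prob^{(7)}\right)^6
\end{pmatrix}
\begin{pmatrix}
c_0\\ c_1\\ \vdots\\ c_6
\end{pmatrix}
=
\begin{pmatrix}
v_1\\ v_2\\ \vdots\\ v_7
\end{pmatrix}.
\]
The determinant of this Vandermonde matrix is $\prod_{1 \leq a < b \leq 7}\left(\prob^{(b)} - \prob^{(a)}\right)$, which is nonzero exactly when the seven values are pairwise distinct.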
\end{proof}
\qed
\begin{Lemma}\label{lem:alt-qE3-exp-3-match}
The number of $3$-matchings in $\poly_{G}(\vct{X})$ is exactly $6\cdot\numocc{G}{\threedis}$.
\end{Lemma}
\begin{proof}
A $3$-matching occurs when there are $3$ edges, $e_1, e_2, e_3$, such that all of them are pairwise disjoint, which in particular means they are distinct, i.e., $e_1 \neq e_2 \neq e_3$. In $\poly_{G}(\vct{X})$ there are $3$ choices from the first factor to select an edge of a given $3$-matching, only $2$ choices in the second factor, and so on, yielding $3! = 6$ terms in the expansion of $\poly_{G}(\vct{X})$ for an arbitrary $3$-matching.
Thus, the product $6\cdot\numocc{G}{\threedis}$ is the exact number of $3$-matchings in $\poly_{G}(\vct{X})$.
\end{proof}
\qed
\begin{Corollary}\label{cor:lem-alt-qE3}
One can compute $\numocc{G}{\threedis}$ in $\query_{G}(\vct{X})$ exactly.
\end{Corollary}
\begin{proof}[Proof for Corollary ~\ref{cor:lem-alt-qE3}]
By ~\cref{lem:alt-qE3-exp}, the term $c_6$ can be exactly computed. By ~\cref{lem:alt-qE3-exp-3-match}, we know that $c_6$ can be broken into two factors, and by dividing $c_6$ by the factor $6$, it follows that the resulting value is indeed $\numocc{G}{\threedis}$.
\end{proof}
\qed
\begin{Lemma}\label{lem:alt-qEk}
Given $k$ distinct $\prob$ values and $\poly_{G}^k(\prob,\ldots, \prob)$, one can solve for the number of $3$-matchings exactly.
\end{Lemma}
\begin{proof}[Proof for Lemma ~\ref{lem:alt-qEk}]
By the same logic as ~\cref{lem:alt-qE3-exp}, there are $k$ powers $\prob^i$ for $i$ in $[0, k - 1]$. This, combined with $k$ distinct $\prob$ values, yields a Vandermonde matrix with full rank, and thus all values $c_i$ in $\vct{c}$ can be computed exactly. Finally, by ~\cref{lem:alt-qE3-exp-3-match}, dividing by $6$ yields the desired result, $\numocc{G}{\threedis}$.
\end{proof}
\qed
%Notice that ~\cref{lem:qE3-exp} is an example of a query that reduces to the hard problems in graph theory of counting triangles, three-matchings, three-paths, etc. Thus, in general, computing $\expct_{\vct{w}}\pbox{\poly(\vct{w})} = \rpoly\left(\prob_1,\ldots, \prob_\numvar\right)$ is a hard problem.
%
%\begin{Claim}\label{claim:four-two}
% If one can compute $\rpoly_{G}(\prob,\ldots, \prob)$ in time T(\numedge), then we can compute the following in O(T(\numedge) + \numedge):
%\[\numocc{G}{\tri} + \numocc{G}{\threepath} \cdot \prob - \numocc{G}{\threedis}\cdot(3\prob^2 - \prob^3).\]
%\end{Claim}
%\begin{proof}[Proof of Claim \ref{claim:four-two}]
%%We have shown that the following subgraph cardinalities can be computed in $O(\numedge)$ time:
%%\[\numocc{G}{\ed}, \numocc{G}{\twopath}, \numocc{G}{\twodis}, \numocc{G}{\oneint}, \numocc{G}{\twopathdis} + \numocc{G}{\threedis}.\]
%It has already been shown previously that $\numocc{G}{\ed}, \numocc{G}{\twopath}, \numocc{G}{\twodis},$ and $\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}$ can be computed in $O(\numedge)$ time.
%
%Using the result of \cref{lem:qE3-exp}, let us show a derivation to the identity of the consequent in \cref{claim:four-two}.
%
%All of \cref{eq:1e}, \cref{eq:2p}, \cref{eq:2m}, \cref{eq:3s}, \cref{eq:2pd-3d} show that we can compute the respective edge patterns in $O(\numedge)$ time. Rearrange ~\cref{claim:four-one}, $\rpoly_{G}$, with all linear time computations on one side, leaving only the hard computations,
%\begin{align}
%&\rpoly_{G}(\prob,\ldots, \prob) = \numocc{G}{\ed}\prob^2 + 6\numocc{G}{\twopath}\prob^3 + 6\numocc{G}{\twodis}\prob^4 + 6\numocc{G}{\oneint}\prob^4 + 6\numocc{G}{\tri}\prob^3 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6\nonumber\\
%&\rpoly_{G}(\prob,\ldots, \prob) - \numocc{G}{\ed}\prob^2 - 6\numocc{G}{\twopath}\prob^3 - 6\numocc{G}{\twodis}\prob^4 - 6\numocc{G}{\oneint}\prob^4 = 6\numocc{G}{\tri}\prob^3 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6\label{eq:LS-rearrange}\\
%&\frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob = \numocc{G}{\tri} + \numocc{G}{\threepath}\prob + \numocc{G}{\twopathdis}\prob^2 + \numocc{G}{\threedis}\prob^3\label{eq:LS-reduce}\\
%&\frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob - \big(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\big)\prob^2 = \numocc{G}{\tri} + \numocc{G}{\threepath}\prob - \numocc{G}{\threedis}\left(3\prob^2 - \prob^3\right)\label{eq:LS-subtract}
%\end{align}
%
%\cref{eq:LS-rearrange} is the result of simply subtracting from both sides terms that have $O(\numedge)$ complexity. Dividing all terms by the common factor of $6\prob^3$ gives \cref{eq:LS-reduce}. Equation ~\ref{eq:LS-subtract}, is the result of subtracting the $O(\numedge)$ computable term $\left(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\right)\prob^2$ from both sides.
%
%%\begin{equation}
%%\frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob - \big(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\big)\prob^2 = \numocc{G}{\tri} + \numocc{G}{\threepath}\prob - \numocc{G}{\threedis}\left(3\prob^2 - \prob^3\right)
%%\end{equation}
%
%
%The implication in \cref{claim:four-two} follows by the above and \cref{lem:qE3-exp}.
%\end{proof}
%\qed
%
%\begin{Lemma}\label{lem:gen-p}
%If we can compute $\rpoly_{G}(\vct{X})$ in $T(\numedge)$ time for $O(1)$ distinct values $\vct{\prob}$ such that all $\prob_i = \prob$ for all $i \in [\numvar], \prob_i \in \vct{\prob}$, then we can count the number of triangles, 3-paths, and 3-matchings in $G$ in $T(\numedge) + O(\numedge)$ time.
%\end{Lemma}
%
%\begin{proof}[Proof of \cref{lem:gen-p}]
%
%\cref{claim:four-two} says that if we know $\rpoly_{G}(\prob,\ldots, \prob)$, then we can know in O(\numedge) additional time
%\[\numocc{G}{\tri} + \numocc{G}{\threepath} \cdot \prob - \numocc{G}{\threedis}\cdot(3\prob^2 - \prob^3).\] We can think of each term in the above equation as a variable, where one can solve a linear system given 3 distinct $\prob$ values, assuming independence of the three linear equations. In the worst case, without independence, 4 distinct values of $\prob$ would suffice. This follows from the fact that the corresponding coefficient matrix is the so called Vandermonde matrix, which has full rank
%\end{proof}
%\qed
\AR{Follows from the fact that the corresponding coefficient matrix is the so called Vandermonde matrix, which has full rank.}
\AH{This Vandermonde matrix I need to research. Right now, the last sentences are just parroting Atri.}
\AH{We need a citation for Vandermonde matrix.}
\AR{Jul 31: Did not make a pass on anything above this.}
\AH{\Large From this point on, Atri has not made another pass on this since I have implemented his suggestions.}
\begin{Theorem}\label{lem:const-p}
If we can compute $\rpoly_{G}(\vct{X})$ in $T(\numedge)$ time for $X_1 =\cdots= X_\numvar = \prob$, then we can count the number of triangles, 3-paths, and 3-matchings in $G$ in $T(\numedge) + O(\numedge)$ time.
\end{Theorem}
\AH{Theorem ~\ref{lem:const-p} differs from ~\cref{lem:gen-p} in that ~\cref{lem:const-p} has exactly one $\vct{p}$, while for ~\cref{lem:gen-p} there are $O(1)$ \textit{distinct} $\vct{p}_i$, where, as explicitly stated in both statements, for each distinct $\vct{p}$ all values $\prob_j \in \vct{p}$ are equal to one another.}
%-------------------------------------WARM-UP------------------------------
%\AH{The warm-up below is fine for now, but will need to be removed for the final draft}
%First, let us do a warm-up by computing $\rpoly(\wElem_1,\dots, \wElem_\numvar)$ when $\poly = q_E(\wElem_1,\ldots, \wElem_\numvar)$. Before doing so, we introduce a notation. Let $\numocc{G}{H}$ denote the number of occurrences that $H$ occurs in $G$. So, e.g., $\numocc{G}{\ed}$ is the number of edges ($\numedge$) in $G$.
%
%\AH{We need to make a decision on subgraph notation, and number of occurrences notation.}
%\AH{UPDATE: I did a quick google, and it \textit{appears} that there is a bit of a learning curve to implement node/edge symbols in LaTeX. So, maybe, if time is of the essence, we go with another notation.}
%
%\begin{Claim}
%We can compute $\rpoly(\prob,\ldots, \prob)^2$ in O(\numedge) time.
%\end{Claim}
% \begin{proof}
% The proof basically follows by definition. When we expand $\poly^2$, and make all exponents $e = 1$, substituting $\prob$ for all $\wElem_i$ we get $\rpoly_2(\prob,\ldots, \prob) = \numocc{G}{\ed} \cdot \prob^2 + 2\cdot \numocc{G}{\twopath}\cdot \prob^3 + 2\cdot \numocc{G}{\twodis}\cdot \prob^4$.
% \begin{enumerate}
% \item First note that
% \begin{align*}
% \poly^2(\vct{w}) &= \sum_{(i, j) \in E} (\wElem_i\wElem_j)^2 + \sum_{(i, j), (k, \ell) \in E s.t. (i, j) \neq (k, \ell)} \wElem_i\wElem_j\wElem_k\wElem_\ell\\
% &= \sum_{(i, j) \in E} (\wElem_i\wElem_j)^2 + \sum_{\substack{(i, j), (j, \ell) \in E\\s.t. i \neq \ell}}\wElem_i
% \wElem_j^2\wElem_\ell + \sum_{\substack{(i, j), (k, \ell) \in E\\s.t. i \neq j \neq k \neq \ell}} \wElem_i\wElem_j\wElem_k\wElem_\ell\\
% \end{align*}
% By definition of $\rpoly$,
% \begin{equation}
% \rpoly^2(\vct{w}) = \sum_{(i, j) \in E} \wElem_i\wElem_j + \sum_{\substack{(i, j), (j, \ell) \in E\\s.t. i \neq \ell}}\wElem_i\wElem_j\wElem_\ell + \sum_{\substack{(i, j), (k, \ell) \in E\\s.t. i \neq j \neq k \neq \ell}} \wElem_i\wElem_j\wElem_k\wElem_\ell\label{eq:part-1}
% \end{equation}
% Notice that the first term is $\numocc{G}{\ed}\cdot \prob^2$, the second $\numocc{G}{\twopath}\cdot \prob^3$, and the third $\numocc{G}{\twodis}\cdot \prob^4.$
% \item Note that
%\AH{We need the correct formula for two-matchings below.}
%
% \end{enumerate}
% Thus, since each of the summations can be computed in O(\numedge) time, this implies that by \cref{eq:part-1} $\rpoly(\prob,\ldots, \prob)$ can be computed in O(\numedge) time.\qed
% \end{proof}
%\AH{END of the 'warm-up'}
%We are now ready to state the claim we need to prove \cref{lem:const-p} and \cref{lem:gen-p}.
%
%Let $\poly(\vct{w}) = q_E(\vct{w})^3$.
%----------------------------------------------------------------------------
\begin{proof}[Proof of \cref{lem:const-p}]
\begin{Definition}\label{def:Gk}
For $k > 1$, let graph $\graph{k}$ be a graph generated from an arbitrary graph $\graph{1}$, by replacing every edge $e$ of $\graph{1}$ with a $k$-path, such that all $k$-path replacement edges are disjoint in the sense that they only intersect at the original intersection endpoints as seen in $\graph{1}$.
\end{Definition}
\begin{Lemma}\label{lem:3m-G2}
The number of $3$-matchings in graph $\graph{2}$ satisfies the following identity,
\[\numocc{\graph{2}}{\threedis} = 8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis} + 4 \cdot \numocc{\graph{1}}{\oneint} + 4 \cdot \numocc{\graph{1}}{\threepath} + 2 \cdot \numocc{\graph{1}}{\tri}.\]
\end{Lemma}
\begin{Lemma}\label{lem:3m-G3}
The number of $3$-matchings in $\graph{3}$ satisfies the following identity,
\begin{align*}
\numocc{\graph{3}}{\threedis} = &4\numocc{\graph{1}}{\twopath} + 6\numocc{\graph{1}}{\twodis} + 18\numocc{\graph{1}}{\tri} + 21\numocc{\graph{1}}{\threepath}\\
&+ 24\numocc{\graph{1}}{\twopathdis} + 20\numocc{\graph{1}}{\oneint} + 27\numocc{\graph{1}}{\threedis}.
\end{align*}
\end{Lemma}
\begin{Lemma}\label{lem:3p-G2}
The number of $3$-paths in $\graph{2}$ satisfies the following identity,
\[\numocc{\graph{2}}{\threepath} = 2 \cdot \numocc{\graph{1}}{\twopath}.\]
\end{Lemma}
\begin{Lemma}\label{lem:3p-G3}
The number of $3$-paths in $\graph{3}$ satisfies the following identity,
\[\numocc{\graph{3}}{\threepath} = \numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}.\]
\end{Lemma}
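As a small sanity check of \cref{lem:3p-G2} and \cref{lem:3p-G3} (the example is ours, chosen only for illustration), take $\graph{1}$ to be a single $2$-path, so that $\numocc{\graph{1}}{\ed} = 2$ and $\numocc{\graph{1}}{\twopath} = 1$. Then $\graph{2}$ is a $4$-path and $\graph{3}$ is a $6$-path, and since a path with $n$ edges contains $n - 2$ three-paths,
\begin{align*}
\numocc{\graph{2}}{\threepath} &= 4 - 2 = 2 = 2 \cdot \numocc{\graph{1}}{\twopath},\\
\numocc{\graph{3}}{\threepath} &= 6 - 2 = 4 = \numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}.
\end{align*}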
\begin{Lemma}\label{lem:tri}
For $k > 1$, any graph $\graph{k}$ has the property of $\numocc{\graph{k}}{\tri} = 0$.
\end{Lemma}
\begin{Lemma}\label{lem:lin-sys}
Using the identities of lemmas [\ref{lem:3m-G2}, \ref{lem:3m-G3}, \ref{lem:3p-G2}, \ref{lem:3p-G3}, \ref{lem:tri}] to compute $\numocc{G}{\threedis}, \numocc{G}{\threepath}, \numocc{G}{\tri}$ for $G \in \{\graph{2}, \graph{3}\}$, there exists a linear system $\mtrix{\rpoly}\cdot (x~y~z~)^T = \vct{b}$ which can then be solved to determine the unknown quantities of $\numocc{\graph{1}}{\threedis}, \numocc{\graph{1}}{\threepath}$, and $\numocc{\graph{1}}{\tri}$.
\end{Lemma}
\AH{I didn't think of a more appropriate name for $\vct{b}$, so I have just stuck with what Atri called it on chat.}
Using \cref{def:Gk} we construct graphs $\graph{2}$ and $\graph{3}$ from arbitrary graph $\graph{1}$.
We then show that for any of the patterns $\threedis, \threepath, \tri$ which are all known to be hard to compute, we can use linear combinations in terms of $\graph{1}$ from Lemmas \ref{lem:3m-G2}, \ref{lem:3m-G3}, \ref{lem:3p-G2}, \ref{lem:3p-G3}, \ref{lem:tri} to compute $\numocc{\graph{i}}{S}$, where $i$ in $\{2, 3\}$ and $S \in \{\threedis, \threepath, \tri\}$. Then, using ~\cref{claim:four-two}, \cref{lem:qE3-exp} and \cref{lem:lin-sys}, we can combine all three linear combinations into a linear system, solving for $\numocc{\graph{1}}{S}$.
%$%^&*(
Before proceeding, let us introduce a few more helpful definitions.
\subsubsection{$f_k$ and $\graph{k}$}
\begin{Definition}\label{def:ed-nota}
For the set of edges in $\graph{k}$ we write $E_k$. For any graph $\graph{k}$, its edges are denoted by a pair $(e, b)$, such that $b \in \{0,\ldots, k-1\}$ and $e\in E_1$.
\end{Definition}
\begin{Definition}[$\eset{k}$]
Given an arbitrary subgraph $S\graph{1}$ of $\graph{1}$, let $\eset{1}$ denote the set of edges in $S\graph{1}$. Define then $\eset{k}$ for $k > 1$ as the set of edges in the generated subgraph $S\graph{k}$.
\end{Definition}
For example, consider $S\graph{1}$ with edges $\eset{1} = \{e_1\}$. Then the edges of $S\graph{2}$, $\eset{2} = \{(e_1, 0), (e_1, 1)\}$.
\begin{Definition}\label{def:ed-sub}
Let $\binom{S}{t}$ denote the set of subsets in $S$ with exactly $t$ edges. In a similar manner, $\binom{S}{\leq t}$ is used to mean the subsets of $S$ with $t$ or fewer edges.
\end{Definition}
The following function $f_k$ is a mapping from every $3$-edge shape in $\graph{k}$ to its `projection' in $\graph{1}$.
\begin{Definition}\label{def:fk}
Let $f_k: \binom{E_k}{3} \mapsto \binom{E_1}{\leq3}$ be defined as follows. For any $S \in \binom{E_k}{3}$, such that $S = \pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)}$, define:
\[ f_k\left(\pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)}\right) = \pbrace{e_1, e_2, e_3}.\]
\end{Definition}
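To illustrate \cref{def:fk} for $k = 2$ (the particular subsets below are chosen by us as examples), let $e_1, e_2, e_3 \in E_1$ be distinct. Then
\begin{align*}
f_2\left(\pbrace{(e_1, 0), (e_2, 1), (e_3, 0)}\right) &= \pbrace{e_1, e_2, e_3},\\
f_2\left(\pbrace{(e_1, 0), (e_1, 1), (e_2, 0)}\right) &= \pbrace{e_1, e_2},
\end{align*}
so the image has fewer than three edges exactly when two of the chosen edges of $\graph{2}$ lie on the same replacement path. For $k = 2$ the image always contains at least two edges, since each $e \in E_1$ contributes only the two edges $(e, 0)$ and $(e, 1)$ to $E_2$.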
\AH{I found ~\cref{def:fk-inv} a bit imprecise and bulky and have attempted to refine it.
\par Since this is an inverse function, the signature is reversed, \vari{but},
\par...the challenge is in quantifying the size of the set (of 3 edge subsets) that is returned...
\par...where the main observation is that for an input edge of size $r$, a set of size $\leq \binom{r\cdot k}{3}$ is returned...
\par...but the catch is that for $r \geq 3$, the set will be strictly less than $\binom{r\cdot k}{3}$ since $f_k$ does not map e.g. an input $\{(e_a, b_1), (e_a, b_2), (e_a, b_3)\}$ (where $a$ is constant and $b_1, b_2, b_3 \in \{0,\ldots, k -1\}$) to more than one edge, \textit{and} it is the case for $r \geq 3$ that $f_k^{-1}$ will not map such an input to its input of size $r$, meaning we must subtract off all such subsets of $\binom{E_k}{3}$.
\par My fix was to use a variable in the exponent and explain in prose. Perhaps there is a better, simpler notation/solution.}
%\AH{I found ~\cref{def:fk} to be inconsistent and not precisely defined. There are 2 ways we could define $f_k$.
%\par 1) As a function that will only return a set of $3$ edge subsets from $S^{(k)}$,
%\par 2) As a function that will return all subsets that map to the input $s^{(1)}$.
%\par At this point I have chosen the latter. We use $k$ in $\graph{k}$, noting that there are $k^{\abs{input}}$ subsets $s^{(k)}$ that all map to $s^{(1)}$, where each subset is \textit{exactly} the size of the input pattern.
%\par I am also unsure for the correct notation to use to precisely indicate the size of the output, whether to choose a notation such as $\leq 3$ for the exponent of $k$, or to use a constant to denote this. I will for now choose the latter of these options.}
\begin{Definition}[$f_k^{-1}$]\label{def:fk-inv}
The inverse function $f_k^{-1}: \binom{E_1}{\leq 3}\mapsto \left\{\binom{E_k}{3}\right\}^{h}$ takes an arbitrary $\eset{1}$ of at most $3$ edges and outputs the set of all subsets of $\binom{\eset{k}}{3}$ such that each subset $s^{(k)}$ of the output set is mapped to the input set $s^{(1)}$ by $f_k$, i.e. $f_k(s^{(k)}) = s^{(1)}$. The set returned by $f_k^{-1}$ is of size $h$, where $h$ depends on $\abs{s^{(1)}}$, such that $h \leq \binom{\abs{s^{(1)}} \cdot k}{3}$.
\end{Definition}
Note, importantly, that when we discuss $f_k^{-1}$, although potentially counterintuitive, each \textit{edge} present in $s^{(1)}$ must have an edge in $s^{(k)}$ that `projects' down to it. \textit{Meaning}, if $|s^{(1)}| = 3$, then it must be the case that each $s^{(k)}$ is a set $\{ (e_i, b), (e_j, b), (e_\ell, b) \}$ where $i \neq j \neq \ell$.
\begin{Lemma}\label{lem:fk-func}
$f_k$ is a function.
\end{Lemma}
\begin{proof}[Proof of Lemma \ref{lem:fk-func}]
Note that $f_k$ is properly defined. For any $S \in \binom{E_k}{3}$, $|f_k(S)| \leq 3$, since any subset of $3$ edges in $E_k$ maps to at most $3$ edges in $\graph{1}$. All mappings are therefore in the required range. Moreover, for any $b \in \{0,\ldots, k-1\}$, the edge $(e, b)$ maps to $e$ and to no other edge, which implies that $f_k$ is a function.
\end{proof}
\qed
%\subsubsection{Subgraph patterns with 3 edges}
%We wish to briefly state the possible subgraphs $S$ containing exactly three edges.
%\begin{itemize}
% \item Triangle ($\tri$)
% \item 3-path ($\threepath$)
% \item 3-star ($\oneint$)--this is the graph that results when all three edges share exactly one common endpoint. The remaining endpoint for each edge is disconnected from any endpoint of the three edges.
% \item Disjoint Two-Path ($\twopathdis$)--this subgraph consists of a two path and a remaining disjoint edge.
% \item 3-matching ($\threedis$)--this subgraph is composed of three disjoint edges.
%\end{itemize}
\subsection{Three Matchings in $\graph{2}$}
\AR{TODO for {\em later}: I think the proof will be much easier to follow with figures: just drawing out $S\times \{0,1\}$ along with the $(e_i,b_i)$ explicitly notated on the edges will make the proof much easier to follow.}
\begin{proof}[Proof of Lemma \ref{lem:3m-G2}]
For each edge pattern $S$, we count the number of $3$-matchings in the $3$-edge subgraphs of $\graph{2}$ in $f_2^{-1}(S)$. We start with $S \in \binom{E_1}{3}$, where $S$ is composed of the edges $e_1, e_2, e_3$ and $f_2^{-1}(S)$ is the set of all $3$-edge subsets of the set $\{(e_1, 0), (e_1, 1), (e_2, 0), (e_2, 1), (e_3, 0), (e_3, 1)\}$.
%\begin{tikzpicture}
% \node[
%\end{tikzpicture}
\begin{itemize}
\item $3$-matching ($\threedis$)
\end{itemize}
Consider the $\eset{1} = \threedis$ pattern. Note that edges in $\eset{2}$ are {\em not} disjoint only for the pairs $(e_i, 0), (e_i, 1)$ for $i\in \{1,2,3\}$. All subsets for $b_1, b_2, b_3 \in \{0, 1\}$, $(e_1, b_1), (e_2, b_2), (e_3, b_3)$ will compose a 3-matching. One can see that we have a total of two possible choices for each edge $e_i$ in $\graph{1}$ yielding $2^3 = 8$ possible 3-matchings in $f_2^{-1}(S)$.
%\AH{The comment below is an important comment.}
%\AR{I think your argument seems to implicitly assume that $\graph{1}$ is the subset $S$ and $\graph{2}$ is the corresponding mapping under $f^{-1}$. This is {\bf not} correct. You should present the argument as in the outline above above. I.e. fix an $S\in\binom{E_1}{\le 3}$ in $\graph{1}$ and then consider all possible subgraphs in $\graph{2}$ in $f^{-1}(S)$. {\bf Propagate} this change to the rest of the proof.}
\begin{itemize}
\item Disjoint Two-Path ($\twopathdis$)
\end{itemize}
For $\eset{1} = \twopathdis$ edges $e_2, e_3$ form a $2$-path with $e_1$ being disjoint. This means that $(e_2, 0), (e_2, 1), (e_3, 0), (e_3, 1)$ form a $4$-path while $(e_1, 0), (e_1, 1)$ is its own disjoint $2$-path. We can only pick either $(e_1, 0)$ or $(e_1, 1)$ from $f_2^{-1}(S)$, and then we need to pick a $2$-matching from $e_2$ and $e_3$. Note that a $4$-path admits $3$ possible $2$-matchings, specifically, $\pbrace{(e_2, 0), (e_3, 0)}, \pbrace{(e_2, 0), (e_3, 1)}, \pbrace{(e_2, 1), (e_3, 1)}$. Since these two selections can be made independently, there are $2 \cdot 3 = 6$ choices for $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item $3$-star ($\oneint$)
\end{itemize}
When $\eset{1} = \oneint$, the inner edges $(e_i, 1)$ of $\eset{2}$ all share a common endpoint, and the outer edges $(e_i, 0)$ are pairwise disjoint. A $3$-matching can therefore contain at most one inner edge. There are $3$ $3$-matchings with exactly one inner edge, one for each choice of the inner edge, with the remaining two edges being outer edges. The only other $3$-matching consists of all $3$ outer edges. Thus, there are $3 + 1 = 4$ $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item $3$-path ($\threepath$)
\end{itemize}
When $\eset{1} =\threepath$, the edges $e_1, e_2, e_3$ are successively connected. This means that the edges of $\eset{2}$ form a $6$-path, in which the edges $(e_1, 0),\ldots,(e_3, 1)$ are successively connected. In a $3$-matching, any two chosen edges must be separated by at least one edge of this path. There are four such possibilities: $\pbrace{(e_1, 0), (e_2, 0), (e_3, 0)}, \pbrace{(e_1, 0), (e_2, 0), (e_3, 1)}, \pbrace{(e_1, 0), (e_2, 1), (e_3, 1)},$\newline $\pbrace{(e_1, 1), (e_2, 1), (e_3, 1)}$. Thus, there are four $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item Triangle ($\tri$)
\end{itemize}
For $\eset{1} = \tri$, the edges in $\eset{2}$ are again successively connected, but this time in a cycle, so that $(e_1, 0)$ and $(e_3, 1)$ are also connected. The discussion is the same as for the $3$-path above, except that the first and last edges are no longer disjoint. This rules out the subsets $\pbrace{(e_1, 0), (e_2, 0), (e_3, 1)}$ and $\pbrace{(e_1, 0), (e_2, 1), (e_3, 1)}$, leaving $2$ $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item $2$-matching ($\twodis$), $2$-path ($\twopath$), $1$ edge ($\ed$)
\end{itemize}
Let us also consider when $S \in \binom{E_1}{\leq 2}$. When $|S| = 2$, we can only pick one from each of two pairs, $\pbrace{(e_1, 0), (e_1, 1)}$ and $\pbrace{(e_2, 0), (e_2, 1)}$. This implies that a $3$-matching cannot exist in $f_2^{-1}(S)$. The same argument holds for $|S| = 1$, where we can only pick one edge from the pair $\pbrace{(e_1, 0), (e_1, 1)}$, thus no $3$-matching exists in $f_2^{-1}(S)$.
Observe that all of the arguments above focused solely on the shape/pattern of $S$. In other words, all $S$ of a given shape yield the same number of $3$-matchings, and this is why we get the required identity.
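As a sanity check on the case analysis, the per-pattern counts can be collected into a single sum. The display below is only a restatement, under the assumption that every $3$-matching of $\graph{2}$ lies in $f_2^{-1}(S)$ for exactly one $S \in \binom{E_1}{\leq 3}$; its coefficients agree with those used in \cref{eq:ls-2-2}.
\begin{equation*}
\numocc{\graph{2}}{\threedis} = 8\cdot\numocc{\graph{1}}{\threedis} + 6\cdot\numocc{\graph{1}}{\twopathdis} + 4\cdot\numocc{\graph{1}}{\oneint} + 4\cdot\numocc{\graph{1}}{\threepath} + 2\cdot\numocc{\graph{1}}{\tri}.
\end{equation*}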
\end{proof}
\qed
\subsection{Three matchings in $\graph{3}$}
% Let $S'$ be all the edges of $\graph{k}$ which 'project' down to any set of edges $S$ in $\graph{1}$, formally, for $|S| = 1$, then $S = \{e_1\}$ and $S' = \pbrace{(e_1, 0),\ldots, (e_1, k-1)}$. Similarly, when $|S| = 2$, then $S = \pbrace{e_1, e_2}$ and $S' = \pbrace{(e_1, 0),\ldots, (e_2, k -1)}$, and so on for $|S| = 3$.
\begin{proof}[Proof of Lemma \ref{lem:3m-G3}]
For any $S \in \binom{E_1}{\leq3}$, we again count the number of $3$-matchings in $f_3^{-1}(S)$.
\begin{itemize}
\item $1$ edge ($\ed$)
\end{itemize}
When $\eset{1} = \ed$, $f_3^{-1}(\eset{1})$ has one subset, $(e_1, 0), (e_1, 1), (e_1, 2)$, which clearly does not contain a $3$-matching. Thus there are no $3$-matchings in $f_3^{-1}(\eset{1})$ for this case.% All edges in the subset are a $3$-path, and it is the case as alluded in $\graph{2}$ discussion that no 3-matching can exist in a single $3$-path.
\begin{itemize}
\item $2$-path ($\twopath$)
\end{itemize}
Now fix $\eset{1} = \twopath$. All edges in $\eset{3}$ form a $6$-path, and similar to the discussion in the proof of \cref{lem:3m-G2} (when $\eset{1} = \threepath$ in $\graph{2}$), this leads to $4$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item $2$-matching ($\twodis$)
\end{itemize}
For $\eset{1} = \twodis$, the edges of $\eset{3}$ form two disjoint $3$-paths, so that $(e_i, b)$ is disjoint from $(e_j, b')$ for $i \neq j\in \{1,2\}$ and $b, b' \in \{0, 1, 2\}$. Pick an arbitrary $e_i$ and note that $\pbrace{(e_i, 0), (e_i, 2)}$ is the only $2$-matching among its edges; it can be combined with any of the $3$ edges $(e_j, 0), (e_j, 1), (e_j, 2)$, again for $i \neq j$. Since there are $2$ choices for $e_i$ and the selections are independent, it follows that there exist $2 \cdot 3 = 6$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item Triangle ($\tri$)
\end{itemize}
Now, we consider the $3$-edge subgraphs of $\graph{1}$, starting with $\eset{1} = \tri$. As discussed in the proof of \cref{lem:3m-G2} for the case of $\tri$, the edges of $\eset{3}$ form a cyclic sequence, and we must be careful not to pair $(e_1, 0)$ with $(e_3, 2)$ in a $3$-matching. A set $s = \pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)} \in f_3^{-1}(S)$ with $b_1, b_2, b_3 \in \{0, 1, 2\}$ is a $3$-matching exactly when, for all $i \in [3]$, $b_i = 2$ implies $b_{(i \bmod 3) + 1} \neq 0$. Iterating through all possible combinations, we have
\begin{itemize}
\item \textsc{$(e_1, 0)$}
\begin{itemize}
\item $\pbrace{(e_1, 0), (e_2, 0), (e_3, 0)}$
\item $\pbrace{(e_1, 0), (e_2, 0), (e_3, 1)}$
\item $\pbrace{(e_1, 0), (e_2, 1), (e_3, 0)}$
\item $\pbrace{(e_1, 0), (e_2, 1), (e_3, 1)}$
\item $\pbrace{(e_1, 0), (e_2, 2), (e_3, 1)}$
\end{itemize}
\item \textsc{$(e_1, 1)$}
\begin{itemize}
\item $\pbrace{(e_1, 1), (e_2, 0), (e_3, 0)}, \ldots\pbrace{(e_1, 1), (e_2, 1), (e_3, 2)}$
\item $\pbrace{(e_1, 1), (e_2, 2), (e_3, 1)}$
\item $\pbrace{(e_1, 1), (e_2, 2), (e_3, 2)}$
\end{itemize}
\item \textsc{$(e_1, 2)$}
\begin{itemize}
\item $\pbrace{(e_1, 2), (e_2, 1), (e_3, 0)}$
\item $\pbrace{(e_1, 2), (e_2, 1), (e_3, 1)}$
\item $\pbrace{(e_1, 2), (e_2, 1), (e_3, 2)}$
\item $\pbrace{(e_1, 2), (e_2, 2), (e_3, 1)}$
\item $\pbrace{(e_1, 2), (e_2, 2), (e_3, 2)}$
\end{itemize}
\end{itemize}
for a total of 18 3-matchings in $f_3^{-1}(\eset{1})$.
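The count of $18$ can be cross-checked without enumeration. The following is a sketch using inclusion-exclusion over the three cyclically adjacent index pairs. There are $3^3 = 27$ choices of $(b_1, b_2, b_3)$. For a fixed $i$, forcing $b_i = 2$ and $b_{(i \bmod 3) + 1} = 0$ leaves $3$ choices for the remaining index. No two of these forbidden events can hold simultaneously, since any two of them would require some index to equal both $0$ and $2$. Hence the number of valid choices is
\begin{equation*}
27 - 3 \cdot 3 = 18.
\end{equation*}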
\begin{itemize}
\item $3$-path ($\threepath$)
\end{itemize}
Consider $\eset{1} = \threepath$, where all edges in $\eset{3}$ are successively connected to form a $9$-path. Since $(e_1, 0)$ is now disjoint from $(e_3, 2)$, both of these edges can appear in the same $3$-matching. This relaxation yields $3$ additional $3$-matchings that could not be counted in the case $\eset{1} = \tri$, namely $\pbrace{(e_1, 0), (e_2, 0), (e_3, 2)},\pbrace{(e_1, 0), (e_2, 1), (e_3, 2)}, \pbrace{(e_1, 0), (e_2, 2), (e_3, 2)}$. There are therefore $18 + 3 = 21$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item Disjoint Two-Path ($\twopathdis$)
\end{itemize}
Assume $\eset{1} = \twopathdis$. Then the edges of $\eset{3}$ have successive connectivity from $(e_1, 0)$ through $(e_1, 2)$, and successive connectivity from $(e_2, 0)$ through $(e_3, 2)$, i.e., they form a $6$-path together with a disjoint $3$-path. There exist $8$ distinct $2$-matchings with one $(e_2,\cdot)$ edge and one $(e_3,\cdot)$ edge in the $6$-path $(e_2, 0),\ldots, (e_3, 2)$, of the form $\pbrace{(e_2, 0), (e_3, 0)},\ldots, \pbrace{(e_2, 1), (e_3, 2)}, \pbrace{(e_2, 2), (e_3, 1)}, \pbrace{(e_2, 2), (e_3, 2)}$. These matchings can be paired independently with any of the $3$ edges $(e_1, b)$, for a total of $8 \cdot 3 = 24$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item $3$-star ($\oneint$)
\end{itemize}
Given $\eset{1} = \oneint$, the edges of $\eset{3}$ are restricted such that the outer edges $(e_i, 0)$ are disjoint from one another, the middle edges $(e_i, 1)$ are also disjoint from each other, and only the inner edges $(e_i, 2)$ intersect with one another at exactly one common endpoint. To be precise, any outer or middle edge of $e_i$ is disjoint from every edge of $e_j$ for $i \neq j$. As previously mentioned in the proof of \cref{lem:3m-G2}, at most one inner edge may appear in a $3$-matching. For each choice of inner edge $(e_i, 2)$, we have $2 \cdot 2 = 4$ combinations of the middle and outer edges of $e_j, e_k$, where $i, j, k$ are pairwise distinct. With $3$ choices for the inner edge, this gives $4 \cdot 3 = 12$ $3$-matchings. It remains to count the $3$-matchings with no inner edge. For each $e_i$ we then have $2$ choices, i.e., a middle or an outer edge, contributing $2^3 = 8$ additional $3$-matchings, for a total of $8 + 12 = 20$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item $3$-matching ($\threedis$)
\end{itemize}
Given $\eset{1} = \threedis$ subgraph, we have the case that all edges in $\eset{3}$ have the property that $(e_i, b)$ is disjoint to $(e_j, b)$ for $i \neq j$. For each $e_i$, there are then $3$ choices, independent of each other, and it results that there are $3^3 = 27$ 3-matchings in $f_3^{-1}(\eset{1})$.
All of the observations above focused only on the shape of $S$, and since we see that for fixed $S$, we have a fixed number of $3$-matchings, this implies the identity.
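For reference, the per-pattern counts above can be collected into one sum. This is a restatement only, under the assumption that every $3$-matching of $\graph{3}$ lies in $f_3^{-1}(S)$ for exactly one $S \in \binom{E_1}{\leq 3}$; its coefficients agree with those appearing in \cref{eq:LS-G3-sub}.
\begin{align*}
\numocc{\graph{3}}{\threedis} =\ & 4\cdot\numocc{\graph{1}}{\twopath} + 6\cdot\numocc{\graph{1}}{\twodis} + 18\cdot\numocc{\graph{1}}{\tri} + 21\cdot\numocc{\graph{1}}{\threepath}\\
& + 24\cdot\numocc{\graph{1}}{\twopathdis} + 20\cdot\numocc{\graph{1}}{\oneint} + 27\cdot\numocc{\graph{1}}{\threedis}.
\end{align*}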
\end{proof}
\qed
\subsection{Three Paths}
Computing the number of 3-paths in $\graph{2}$ and $\graph{3}$ consists of much simpler linear combinations.
\subsubsection{$\graph{2}$}
\begin{proof}[Proof of Lemma \ref{lem:3p-G2}]
For $\mathcal{P} \subseteq \eset{2}$ such that $\mathcal{P}$ is a $3$-path, it \textit{must} be the case by definition of $f$ that every edge in $f_2(\mathcal{P})$ has at least one edge of $\mathcal{P}$ mapped to it (and recall that $\mathcal{P}$ is connected). This constraint rules out every pattern $\eset{1}$ consisting of $3$ edges, as well as $\eset{1} = \twodis$. For $\eset{1} = \ed$, note that only the two edges $(e_1, 0), (e_1, 1)$ of $\graph{2}$ map to $e_1$, so there exists no $\mathcal{P} \in \binom{E_2}{3}$ such that $f_2(\mathcal{P}) = \eset{1}$. The only surviving pattern is $\eset{1} = \twopath$, where the edges of $\eset{2}$ have successive connectivity from $(e_1, 0)$ to $(e_2, 1)$. There are then $2$ $3$-paths using edges from both $e_1$ and $e_2$ in $f_2^{-1}(\eset{1})$: $\pbrace{(e_1, 0), (e_1, 1), (e_2, 0)} \text{ and }\pbrace{(e_1, 1), (e_2, 0), (e_2, 1)}$.
\end{proof}
\qed
%we have two 3-paths generated: $\pbox{(e_1, 0), (e_1, 1), (e_2, 0)}$ and $\pbox{(e_1, 1), (e_2, 0), (e_2, 1)}$. Thus,
\subsubsection{$\graph{3}$}
\begin{proof}[Proof of Lemma \ref{lem:3p-G3}]
The argument follows along the same lines as in the proof of \cref{lem:3p-G2}. Given a $3$-path $\mathcal{P} \subseteq \eset{3}$, it \textit{must} be that every edge in $f_3(\mathcal{P})$ has at least one edge in $\mathcal{P}$ mapped to it (and $\mathcal{P}$ is connected). Notice again that this cannot be the case for any $\eset{1} \in \binom{E_1}{3}$, nor is it the case when $\eset{1} = \twodis$. This leaves us with two patterns, $\eset{1} = \twopath$ and $\eset{1} = \ed$. For the former, we have $2$ $3$-paths across $e_1$ and $e_2$, namely $\pbrace{(e_1, 1), (e_1, 2), (e_2, 0)}$ and $\pbrace{(e_1, 2), (e_2, 0), (e_2, 1)}$. For the latter pattern $\ed$, it is trivial to see that an edge in $\graph{1}$ becomes a $3$-path in $\graph{3}$, and this proves the identity.
\end{proof}
\qed
\subsection{Triangle}
\begin{proof}[Proof of Lemma \ref{lem:tri}]
The number of triangles in $\graph{k}$ for $k \geq 2$ is always $0$: every cycle in $\graph{1}$ has at least three edges, each of which is replaced by a $k$-path, so all cycles in $\graph{k}$ have at least $3k \geq 6$ edges.
\end{proof}
\qed
\subsection{Developing a Linear System}
\AH{The changes in ~\cref{eq:2pd-3d} have been propagated 110420. Barring any errors, everything should be updated and correct.}
\begin{proof}[Proof of Lemma \ref{lem:lin-sys}]
Our goal is to build a linear system $M \cdot (x~y~z)^T = \vct{b}$, such that, assuming an indexing starting at $1$, the $i^{th}$ row of $M$ corresponds to the RHS of ~\cref{eq:LS-subtract} for $\graph{i}$ \textit{in} terms of $\graph{1}$. The vector $\vct{b}$ analogously has the terms computable in $O(\numedge)$ time for each $\graph{i}$ at its corresponding $i^{th}$ entry for the LHS of ~\cref{eq:LS-subtract}. Lemma ~\ref{lem:qE3-exp} gives the identity for $\rpoly_{G}(\prob,\ldots, \prob)$ when $\poly_{G}(\vct{X}) = q_E(X_1,\ldots, X_\numvar)^3$, and using
%Let us maintain a vector $\vct{b}$ to hold the entries for the terms that are computable in $O(\numedge)$ time, for each of $\graph{1}, \graph{2},$ and $\graph{3}$. From
~\cref{eq:LS-subtract}, $\vct{b}[1] = \frac{\rpoly_{G}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{G}{\ed}}{6\prob} - \numocc{G}{\twopath} - \numocc{G}{\twodis}\prob - \numocc{G}{\oneint}\prob - \big(\numocc{G}{\twopathdis} + 3\numocc{G}{\threedis}\big)\prob^2$.
As previously outlined, assume graph $\graph{1}$ to be an arbitrary graph, with $\graph{2}, \graph{3}$ constructed from $\graph{1}$ as defined in \cref{def:Gk}.
\subsubsection{$\graph{2}$}
Let us call the linear equation for graph $\graph{2}$ $\linsys{2}$. Consider the hard-to-compute terms on the RHS of ~\cref{eq:LS-subtract},
\begin{align}
& \numocc{\graph{2}}{\tri} + \numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)\nonumber\\
= &\numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)\label{eq:ls-2-1}\\
= &2 \cdot \numocc{\graph{1}}{\twopath}\prob - \pbrace{8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis} + 4 \cdot \numocc{\graph{1}}{\oneint} + 4 \cdot \numocc{\graph{1}}{\threepath} + 2 \cdot \numocc{\graph{1}}{\tri}}\left(3\prob^2 - \prob^3\right)\label{eq:ls-2-2}\\
= &\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - \prob^3\right) + 2\cdot\numocc{\graph{1}}{\twopath}\prob - 4\cdot\numocc{\graph{1}}{\oneint}\cdot\left(3\prob^2 - \prob^3\right).\label{eq:ls-2-3}
\end{align}
%define $\linsys{2} = \numocc{\graph{2}}{\tri} + \numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)$. By \cref{claim:four-two} we can compute $\linsys{2}$ in $O(T(\numedge) + \numedge)$ time with $\numedge = |E_2|$, and more generally, $\numedge = |E_k|$ for a graph $\graph{k}$.
Equation ~\ref{eq:ls-2-1} follows by \cref{lem:tri}. Similarly ~\cref{eq:ls-2-2} follows by both \cref{lem:3m-G2} and \cref{lem:3p-G2}. Finally, ~\cref{eq:ls-2-3} follows by a simple rearrangement of terms.
Now, by simple algebraic manipulations of ~\cref{eq:LS-subtract}, we deduce,
\begin{align}
&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2\nonumber\\
&\qquad\qquad =\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - \prob^3\right) + 2\cdot\numocc{\graph{1}}{\twopath}\prob - 4\cdot\numocc{\graph{1}}{\oneint}\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-1}\\
&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2 - 2\cdot\numocc{\graph{1}}{\twopath}\prob\nonumber\\
&\qquad + 4\cdot\numocc{\graph{1}}{\oneint}\left(3\prob^2 - \prob^3\right)\nonumber\\
&\qquad\qquad =\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-2}\\
&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2 - 2\cdot\numocc{\graph{1}}{\twopath}\prob\nonumber\\
&\qquad + \left(4\cdot\numocc{\graph{1}}{\oneint}+ 6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\right)\left(3\prob^2 - \prob^3\right)\nonumber\\
&\qquad\qquad =\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath} + 10\cdot\numocc{\graph{1}}{\threedis}\right)\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-3}
\end{align}
Equation ~\ref{eq:lem3-G2-1} follows by substituting ~\cref{eq:ls-2-3} in the RHS. We then arrive at ~\cref{eq:lem3-G2-2} by adding the inverse of the last two terms of ~\cref{eq:ls-2-3} to both sides. Finally, we arrive at ~\cref{eq:lem3-G2-3} by adding the $O(\numedge)$ computable term (by ~\cref{eq:2pd-3d}) $6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\left(3\prob^2 - \prob^3\right)$ to both sides.
Denote the matrix of the linear system as $\mtrix{\rpoly_{G}}$, where $\mtrix{\rpoly_{G}}[i]$ is the $i^{\text{th}}$ row of $\mtrix{\rpoly_{G}}$. From ~\cref{eq:lem3-G2-3} it follows that
\[\mtrix{\rpoly_{G}}[2] = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} + 10 \cdot \numocc{\graph{1}}{\threedis}\right)\cdot \left(3\prob^2 - \prob^3\right)\]
and
%By \cref{lem:tri}, the first term of $\linsys{2}$ is $0$, and then $\linsys{2} = \numocc{\graph{2}}{\threepath}\prob - \numocc{\graph{2}}{\threedis}\left(3\prob^2 - \prob^3\right)$.
%
%Replace the next term with the identity of \cref{lem:3p-G2} and the last term with the identity of \cref{lem:3m-G2},
%\begin{equation*}
%\linsys{2} = 2 \cdot \numocc{\graph{1}}{\twopath}\prob - \pbrace{8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis} + 4 \cdot \numocc{\graph{1}}{\oneint} + 4 \cdot \numocc{\graph{1}}{\threepath} + 2 \cdot \numocc{\graph{1}}{\tri}}\left(3\prob^2 - \prob^3\right).
%\end{equation*}
%Rearrange terms into groups of those patterns that are 'hard' to compute and those that can be computed in $O(\numedge)$,
%\begin{equation*}
%\linsys{2} = -\pbrace{2 \cdot \numocc{\graph{1}}{\tri} + 4 \cdot \numocc{\graph{1}}{\threepath} + \left(8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis}\right)}\left(3\prob^2 - \prob^3\right) + \pbrace{2 \cdot \numocc{\graph{1}}{\twopath}\prob - 4 \cdot \numocc{\graph{1}}{\oneint}\left(3\prob^2 - \prob^3\right)}.
%\end{equation*}
%
%Note that there are terms computable in $O(\numedge)$ time which can be subtracted from $\linsys{2}$ and added to the other side of \cref{eq:LS-subtract}, i.e., $\vct{b}[2]$. This leaves us with
%\begin{align}
%&\linsys{2} = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} - 2 \cdot \numocc{\graph{1}}{\threedis} - 4\cdot \numocc{\graph{1}}{\twopathdis}\right) \cdot \left(3\prob^2 - \prob^3\right)\label{eq:LS-G2'}\\
%&\linsys{2} = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} - 2 \cdot \numocc{\graph{1}}{\threedis} + 12 \cdot \numocc{\graph{1}}{\threedis}\right)\cdot \left(3\prob^2 - \prob^3\right)\label{eq:LS-G2'-1}\\
%&\linsys{2} = \left(-2 \cdot \numocc{\graph{1}}{\tri} - 4 \cdot \numocc{\graph{1}}{\threepath} + 10 \cdot \numocc{\graph{1}}{\threedis}\right)\cdot \left(3\prob^2 - \prob^3\right)\label{eq:LS-G2'-2}
%\end{align}
%
%Equation ~\ref{eq:LS-G2'} is the result of collecting $2\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right)$ and moving them to the other side. Then ~\cref{eq:LS-G2'-1} results from adding $4\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right)$ to both sides. Equation ~\ref{eq:LS-G2'-2} is the result of simplifying terms.
%
%For the left hand side, following the above steps, we obtain
\begin{align*}
\vct{b}[2] &= \frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob - \left(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\right)\prob^2\\
&- 2\cdot \numocc{\graph{1}}{\twopath}\prob + \left(4\cdot\numocc{\graph{1}}{\oneint}+ 6\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\right)\left(3\prob^2 - \prob^3\right).
\end{align*}
We now have a linear equation in terms of $\graph{1}$ for $\graph{2}$. Note that by ~\cref{eq:2pd-3d}, it is the case that any term of the form $x \cdot \left(\numocc{\graph{i}}{\twopathdis} + 3\cdot \numocc{\graph{i}}{\threedis}\right)$ is computable in linear time. By ~\cref{eq:1e}, ~\cref{eq:2p}, ~\cref{eq:2m}, and ~\cref{eq:3s} the same is true for $\numocc{\graph{i}}{\ed}$, $\numocc{\graph{i}}{\twopath}$, $\numocc{\graph{i}}{\twodis}$, and $\numocc{\graph{i}}{\oneint}$ respectively.
\subsubsection{$\graph{3}$}
Following the same reasoning for $\graph{3}$, using \cref{lem:3m-G3}, \cref{lem:3p-G3}, and \cref{lem:tri}, starting with the RHS of ~\cref{eq:LS-subtract}, we derive
\begin{align}
&\numocc{\graph{3}}{\tri} + \numocc{\graph{3}}{\threepath}\prob - \numocc{\graph{3}}{\threedis}\left(3\prob^2 - \prob^3\right)\nonumber\\
=& \pbrace{\numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}}\prob - \left\{4 \cdot \numocc{\graph{1}}{\twopath} + 6 \cdot \numocc{\graph{1}}{\twodis} + 18 \cdot \numocc{\graph{1}}{\tri} + 21 \cdot \numocc{\graph{1}}{\threepath} + 24 \cdot \numocc{\graph{1}}{\twopathdis} +\right.\nonumber\\
&\left.20 \cdot \numocc{\graph{1}}{\oneint} + 27 \cdot \numocc{\graph{1}}{\threedis}\right\}\left(3\prob^2 - \prob^3\right)\label{eq:LS-G3-sub}\\
=&\pbrace{ -18\numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} - 24 \cdot \numocc{\graph{1}}{\twopathdis} - 27 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right) \nonumber\\
&+ \pbrace{-20 \cdot \numocc{\graph{1}}{\oneint} - 4\cdot \numocc{\graph{1}}{\twopath} - 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)+ \numocc{\graph{1}}{\ed}\prob + 2 \cdot \numocc{\graph{1}}{\twopath}\prob. \label{eq:lem3-G3-1}
\end{align}
Looking at ~\cref{eq:LS-subtract},
\begin{align}
&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2\nonumber\\
&\qquad\qquad= \pbrace{ -18\numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} - 24 \cdot \numocc{\graph{1}}{\twopathdis} - 27 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right) \nonumber\\
&\qquad\qquad\qquad+ \pbrace{-20 \cdot \numocc{\graph{1}}{\oneint} - 4\cdot \numocc{\graph{1}}{\twopath} - 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)+ \numocc{\graph{1}}{\ed}\prob + 2 \cdot \numocc{\graph{1}}{\twopath}\prob. \label{eq:lem3-G3-2}\\
&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \left(\numocc{\graph{1}}{\ed} + 2\cdot\numocc{\graph{1}}{\twopath}\right)\prob\nonumber\\
&\qquad + \left(24\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right) + 20\cdot\numocc{\graph{1}}{\oneint} + 4\cdot\numocc{\graph{1}}{\twopath} + 6\cdot\numocc{\graph{1}}{\twodis}\right)\left(3\prob^2 - \prob^3\right)\nonumber\\
&\qquad\qquad = \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G3-3}
\end{align}
Equation \ref{eq:LS-G3-sub} follows from simple substitution of all lemma identities in ~\cref{lem:3m-G3}, ~\cref{lem:3p-G3}, and ~\cref{lem:tri}. We then get \cref{eq:lem3-G3-1} by simply rearranging the operands.
Equation ~\ref{eq:lem3-G3-2} follows from substituting ~\cref{eq:lem3-G3-1} in for the RHS of ~\cref{eq:LS-subtract}. We derive ~\cref{eq:lem3-G3-3} by adding the inverse of all $O(\numedge)$ computable terms to both sides, and for the case of $\twopathdis$ and $\threedis$, we add the $O(\numedge)$ computable term $24\cdot\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right)\left(3\prob^2 - \prob^3\right)$ to both sides.
It then follows that
%Removing $O(\numedge)$ computable terms to the other side of \cref{eq:LS-subtract}, we get
\begin{equation}
\mtrix{\rpoly_{G}}[3] = \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\left(3\prob^2 - \prob^3\right)\label{eq:LS-G3'}
\end{equation}
and
%The same justification for the derivation of $\linsys{2}$ applies to the derivation above of $\linsys{3}$. To arrive at ~\cref{eq:LS-G3'}, we move $O(\numedge)$ computable terms to the left hand side. For the term $-24\cdot\numocc{\graph{1}}{\twopathdis}$ we need to add the inverse to both sides AND $72\cdot\numocc{\graph{1}}{\threedis}$ to both sides, in order to satisfy the constraint of $\cref{eq:2pd-3d}$.
%
%For the LHS we get
\begin{align*}
\vct{b}[3] =& \frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \pbrace{\numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}}\prob\\
& + \pbrace{24 \cdot \left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right) + 20 \cdot \numocc{\graph{1}}{\oneint} + 4\cdot \numocc{\graph{1}}{\twopath} + 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)
\end{align*}
We now have a linear system consisting of three linear combinations, for $\graph{1}, \graph{2}, \graph{3}$ in terms of $\graph{1}$. Note that the constants for $\graph{1}$ follow the RHS of ~\cref{eq:LS-subtract}. To make it easier, use the following variable representations: $x = \numocc{\graph{1}}{\tri}, y = \numocc{\graph{1}}{\threepath}, z = \numocc{\graph{1}}{\threedis}$. Using $\linsys{2}$ and $\linsys{3}$, the following matrix is obtained,
\[ \mtrix{\rpoly} = \begin{pmatrix}
1 & \prob & -(3\prob^2 - \prob^3)\\
-2(3\prob^2 - \prob^3) & -4(3\prob^2 - \prob^3) & 10(3\prob^2 - \prob^3)\\
-18(3\prob^2 - \prob^3) & -21(3\prob^2 - \prob^3) & 45(3\prob^2 - \prob^3)
\end{pmatrix},\]
and the following linear equation
\begin{equation}
\mtrix{\rpoly}\cdot (x~ y~ z~)^T = \vct{b}(\graph{1}).
\end{equation}
\AR{
Also the top right entry should be $-(p^2-p^3)$-- the negative sign is missing. This changes the rest of the calculations and has to be propagated. If my calculations are correct the final polynomial should be $-30p^2(1-p)^2(1-p-p^2+p^3)$. This still has no root in $(0,1)$}
\AH{While propagating changes in ~\cref{eq:2pd-3d}, I noticed and corrected some errors, most notably, that for pulling out the \textbf{$a^2$} factor as described next, I hadn't squared it. That has been addressed. 110220}
Now we seek to show that all rows of the system are indeed independent.
The method of minors can be used to compute the determinant, $\dtrm{\mtrix{\rpoly}}$.
We also make use of the fact that for a matrix with entries $ab, ac, ad,$ and $ae$, the determinant is $a^2be - a^2cd = a^2(be - cd)$.
\begin{equation*}
\begin{vmatrix}
1 & \prob & -(3\prob^2 - \prob^3)\\
-2(3\prob^2 - \prob^3) & -4(3\prob^2 - \prob^3) & 10(3\prob^2 - \prob^3)\\
-18(3\prob^2 - \prob^3) & -21(3\prob^2 - \prob^3) & 45(3\prob^2 - \prob^3)
\end{vmatrix}
= (3\prob^2 - \prob^3)^2 \cdot
\begin{vmatrix}
-4 & 10\\
-21 & 45
\end{vmatrix}
~ - ~ \prob(3\prob^2 - \prob^3)^2~ \cdot
\begin{vmatrix}
-2 & 10\\
-18 & 45
\end{vmatrix}
+ \left(- ~(3\prob^2 - \prob^3)^3\right)~ \cdot
\begin{vmatrix}
-2 & -4\\
-18 & -21
\end{vmatrix}.
\end{equation*}
Compute each RHS term starting with the left and working to the right,
\begin{equation}
(3\prob^2 - \prob^3)^2\cdot \left((-4 \cdot 45) - (-21 \cdot 10)\right) = (3\prob^2 - \prob^3)^2\cdot(-180 + 210) = 30(3\prob^2 - \prob^3)^2.\label{eq:det-1}
\end{equation}
The middle term then is
\begin{equation}
-\prob(3\prob^2 - \prob^3)^2 \cdot \left((-2 \cdot 45) - (-18 \cdot 10)\right) = -\prob(3\prob^2 - \prob^3)^2 \cdot (-90 + 180) = -90\prob(3\prob^2 - \prob^3)^2.\label{eq:det-2}
\end{equation}
Finally, the rightmost term,
\begin{equation}
-\left(3\prob^2 - \prob^3\right)^3 \cdot \left((-2 \cdot -21) - (-18 \cdot -4)\right) = -\left(3\prob^2 - \prob^3\right)^3 \cdot (42 - 72) = 30\left(3\prob^2 - \prob^3\right)^3.\label{eq:det-3}
\end{equation}
Putting \cref{eq:det-1}, \cref{eq:det-2}, \cref{eq:det-3} together, we have,
\begin{align}
\dtrm{\mtrix{\rpoly}} =& 30(3\prob^2 - \prob^3)^2 - 90\prob(3\prob^2 - \prob^3)^2 +30(3\prob^2 - \prob^3)^3 = 30(3\prob^2 - \prob^3)^2\left(1 - 3\prob + (3\prob^2 - \prob^3)\right) = 30\left(9\prob^4 - 6\prob^5 + \prob^6\right)\left(-\prob^3 + 3\prob^2 - 3\prob + 1\right)\nonumber\\
=&\left(30\prob^6 - 180\prob^5 + 270\prob^4\right)\cdot\left(-\prob^3 + 3\prob^2 - 3\prob + 1\right).\label{eq:det-final}
\end{align}
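One way to see where the determinant vanishes is to factor \cref{eq:det-final} completely. The following is a sketch that uses only the identities $3\prob^2 - \prob^3 = \prob^2(3 - \prob)$ and $1 - 3\prob + 3\prob^2 - \prob^3 = (1 - \prob)^3$:
\begin{equation*}
\dtrm{\mtrix{\rpoly}} = 30\,\prob^4\,(3 - \prob)^2\,(1 - \prob)^3 .
\end{equation*}
Each factor is strictly positive for $\prob \in (0, 1)$, so the determinant is nonzero there. Consequently the three rows are linearly independent and, for any fixed $\prob \in (0, 1)$, the system determines $(x~y~z)^T$ uniquely.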
\AH{It appears that the equation below has roots at p = 0 (left factor) and p = 1, with NO roots $\in (0, 1)$.}
%Equation \cref{eq:det-final} has no roots in $(0, 1)$.
\AH{I need to understand how lemma ~\ref{lem:lin-sys} follows.}
\end{proof}\AH{End proof of Lemma \ref{lem:lin-sys}}
\qed
Thus, we have proved \cref{lem:const-p} for fixed $p \in (0, 1)$.
\end{proof}
\qed


@ -1,17 +1,13 @@
%root: main.tex
%!TEX root=./main.tex
\onecolumn
%\onecolumn
\section{Query translation into polynomials}
%\AH{This section will involve the set of queries (RA+) that we are interested in, the probabilistic/incomplete models we address, and the outer aggregate functions we perform over the output \textit{annotation}
%1) RA notation
%2) DB (TIDB) notation
%3) How queries translate into polynomials
%}
\subsection{Introduction}
An incomplete database $\idb$ is a set of deterministic databases $\db$ where each element is known as a possible world. %Since $\idb$ is modeling all the possible worlds of an uncertain database, it follows that each $\db \in \idb$ has the same named set of relations, $\{\rel_1,\ldots, \rel_n\}$ (albeit not equivalent across all instances), whose schemas $(\sch(\rel_i))$are unchanging across each $\db_j$.
An incomplete database $\idb$ is a set of deterministic databases $\db$ where each element is known as a possible world.
Denote the schema of $\db$ as $\sch(\db)$. When $\idb$ is a probabilistic database, $\idb$ can be viewed as a two-tuple $(\wSet, \pd)$, where $\wSet$ as noted, is the set of possible worlds, and $\pd$ is a probability distribution over $\wSet$.
The possible worlds semantics gives a framework for how to think about running queries over $\idb$. Given a query $\query$, $\query$ is deterministically run over each $\db \in \idb$, and the output of $\query(\idb)$ is defined as the set of results (worlds) from running $\query$ over each $\db_i \in \idb$. We write this formally as,

288
single_p.tex Normal file
View File

@ -0,0 +1,288 @@
%root: main.tex
\begin{Theorem}\label{lem:const-p}
If we can compute $\rpoly_{G}^3(\vct{X})$ in $T(\numedge)$ time for $X_1 =\cdots= X_\numvar = \prob$, then we can count the number of triangles, 3-paths, and 3-matchings in $G$ in $T(\numedge) + O(\numedge)$ time.
\end{Theorem}
Before moving on to prove ~\cref{lem:const-p}, let us state the lemmas and definitions that will be useful in the proof.
%Original lemma proving the exact coefficient terms in qE3
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Lemma}\label{lem:qE3-exp}
When we expand $\poly_{G}(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^3$ out and assign all exponents $e \geq 1$ a value of $1$, we have the following,
\begin{align}
&\rpoly_{G}(\prob,\ldots, \prob) = \numocc{G}{\ed}\prob^2 + 6\numocc{G}{\twopath}\prob^3 + 6\numocc{G}{\twodis}\prob^4 + 6\numocc{G}{\tri}\prob^3\nonumber\\
&+ 6\numocc{G}{\oneint}\prob^4 + 6\numocc{G}{\threepath}\prob^4 + 6\numocc{G}{\twopathdis}\prob^5 + 6\numocc{G}{\threedis}\prob^6.\label{claim:four-one}
\end{align}
\end{Lemma}
\begin{proof}[Proof of \cref{lem:qE3-exp}]
By definition we have that
\[\poly_{G}(\vct{X}) = \sum_{\substack{(i_1, j_1),\\ (i_2, j_2),\\ (i_3, j_3) \in E}} \prod_{\ell = 1}^{3}X_{i_\ell}X_{j_\ell}.\]
Rather than list all the expressions in full detail, let us make some observations regarding the sum. Let $e_1 = (i_1, j_1), e_2 = (i_2, j_2), e_3 = (i_3, j_3)$. Notice that each expression in the sum consists of a triple $(e_1, e_2, e_3)$. There are three forms the triple $(e_1, e_2, e_3)$ can take.
\textsc{case 1:} $e_1 = e_2 = e_3$, where all edges are the same. There are exactly $\numedge$ such triples, each with a $\prob^2$ factor in $\rpoly_{G}\left(\prob,\ldots, \prob\right)$.
\textsc{case 2:} Exactly two distinct edges appear among the three; call them $e$ and $e'$. Either $2$ of the positions in the triple $(e_1, e_2, e_3)$ are bound to $e$ and the remaining one to $e'$, which can happen in three ways, or only one position is bound to $e$ and the other $2$ are bound to $e'$, which again can happen in three ways. All $3 + 3 = 6$ such triples reduce to the same monomial in $\rpoly$, e.g. $(e, e, e')$ yields the same monomial as $(e', e, e')$. This case produces the following edge patterns: $\twopath, \twodis$.
\textsc{case 3:} $e_1, e_2, e_3$ are pairwise distinct. For this case, we have $3! = 6$ permutations of $(e_1, e_2, e_3)$, all of which reduce to the same monomial. This case consists of the following edge patterns: $\tri, \oneint, \threepath, \twopathdis, \threedis$.
\end{proof}
\qed
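As a sanity check of \cref{lem:qE3-exp} on a small instance (an illustrative example of our own, not part of the original argument), let $G$ be a single triangle on vertices $1, 2, 3$, so that $q_E(X_1, X_2, X_3) = X_1X_2 + X_2X_3 + X_1X_3$. Then $\numocc{G}{\ed} = 3$, $\numocc{G}{\twopath} = 3$, $\numocc{G}{\tri} = 1$, and every other pattern count is $0$, so the identity predicts
\[\rpoly_{G}(\prob, \prob, \prob) = 3\prob^2 + 6 \cdot 3\prob^3 + 6 \cdot 1\prob^3 = 3\prob^2 + 24\prob^3.\]
This agrees with a direct expansion: of the $3^3 = 27$ ordered triples $(e_1, e_2, e_3)$, the $3$ triples with $e_1 = e_2 = e_3$ each reduce to a monomial $X_iX_j$, while the remaining $24$ triples contain at least two distinct edges of the triangle, which together cover all three vertices and therefore reduce to $X_1X_2X_3$.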
\begin{proof}[Proof of \cref{lem:const-p}]
\begin{Definition}\label{def:Gk}
For $k > 1$, let graph $\graph{k}$ be a graph generated from an arbitrary graph $\graph{1}$, by replacing every edge $e$ of $\graph{1}$ with a $k$-path, such that all $k$-path replacement edges are disjoint in the sense that they only intersect at the original intersection endpoints as seen in $\graph{1}$.
\end{Definition}
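For intuition, here are two small instances of \cref{def:Gk} (our own illustrations). If $\graph{1}$ is a triangle, then $\graph{2}$ is a $6$-cycle and $\graph{3}$ is a $9$-cycle. If $\graph{1}$ is a $3$-path, then $\graph{2}$ is a $6$-path and $\graph{3}$ is a $9$-path. In general, if $\graph{1}$ has $\numedge$ edges, then $\graph{k}$ has $k \cdot \numedge$ edges, since each edge of $\graph{1}$ is replaced by $k$ new edges and no edges are shared between replacements.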
\begin{Lemma}\label{lem:3m-G2}
The number of $3$-matchings in graph $\graph{2}$ satisfies the following identity,
\begin{align*}
\numocc{\graph{2}}{\threedis} &= 8 \cdot \numocc{\graph{1}}{\threedis} + 6 \cdot \numocc{\graph{1}}{\twopathdis}\\
&+ 4 \cdot \numocc{\graph{1}}{\oneint} + 4 \cdot \numocc{\graph{1}}{\threepath} + 2 \cdot \numocc{\graph{1}}{\tri}.
\end{align*}
\end{Lemma}
\begin{Lemma}\label{lem:3m-G3}
The number of $3$-matchings in $\graph{3}$ satisfies the following identity,
\begin{align*}
\numocc{\graph{3}}{\threedis} &= 4\numocc{\graph{1}}{\twopath} + 6\numocc{\graph{1}}{\twodis} + 18\numocc{\graph{1}}{\tri}\\
&+ 21\numocc{\graph{1}}{\threepath}+ 24\numocc{\graph{1}}{\twopathdis} + 20\numocc{\graph{1}}{\oneint}\\
&+ 27\numocc{\graph{1}}{\threedis}.
\end{align*}
\end{Lemma}
\begin{Lemma}\label{lem:3p-G2}
The number of $3$-paths in $\graph{2}$ satisfies the following identity,
\[\numocc{\graph{2}}{\threepath} = 2 \cdot \numocc{\graph{1}}{\twopath}.\]
\end{Lemma}
\begin{Lemma}\label{lem:3p-G3}
The number of $3$-paths in $\graph{3}$ satisfies the following identity,
\[\numocc{\graph{3}}{\threepath} = \numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}.\]
\end{Lemma}
\begin{Lemma}\label{lem:tri}
For $k > 1$, any graph $\graph{k}$ has the property that $\numocc{\graph{k}}{\tri} = 0$.
\end{Lemma}
\begin{Lemma}\label{lem:lin-sys}
Using the identities of Lemmas \ref{lem:3m-G2}, \ref{lem:3m-G3}, \ref{lem:3p-G2}, \ref{lem:3p-G3}, \ref{lem:tri} to compute $\numocc{G}{\threedis}, \numocc{G}{\threepath}, \numocc{G}{\tri}$ for $G \in \{\graph{2}, \graph{3}\}$, there exists a linear system $\mtrix{\rpoly}\cdot (x~y~z)^T = \vct{b}$ which can then be solved to determine the unknown quantities $\numocc{\graph{1}}{\threedis}, \numocc{\graph{1}}{\threepath}$, and $\numocc{\graph{1}}{\tri}$.
\end{Lemma}
\AH{I didn't think of a more appropriate name for $\vct{b}$, so I have just stuck with what Atri called it on chat.}
Using \cref{def:Gk} we construct graphs $\graph{2}$ and $\graph{3}$ from arbitrary graph $\graph{1}$.
We then show that for any of the patterns $\threedis, \threepath, \tri$, which are all known to be hard to compute, we can use linear combinations in terms of $\graph{1}$ from Lemmas \ref{lem:3m-G2}, \ref{lem:3m-G3}, \ref{lem:3p-G2}, \ref{lem:3p-G3}, \ref{lem:tri} to compute $\numocc{\graph{i}}{S}$, where $i \in \{2, 3\}$ and $S \in \{\threedis, \threepath, \tri\}$. Then, using~\cref{claim:four-two}, \cref{lem:qE3-exp} and \cref{lem:lin-sys}, we can combine all three linear combinations into a linear system, solving for $\numocc{\graph{1}}{S}$.
%$%^&*(
Before proceeding, let us introduce a few more helpful definitions.
\subsubsection{$f_k$ and $\graph{k}$}
\begin{Definition}\label{def:ed-nota}
For the set of edges in $\graph{k}$ we write $E_k$. For any graph $\graph{k}$, its edges are denoted by a pair $(e, b)$, such that $b \in \{0,\ldots, k-1\}$ and $e\in E_1$.
\end{Definition}
\begin{Definition}[$\eset{k}$]
Given an arbitrary subgraph $S\graph{1}$ of $\graph{1}$, let $\eset{1}$ denote the set of edges in $S\graph{1}$. Define then $\eset{k}$ for $k > 1$ as the set of edges in the generated subgraph $S\graph{k}$.
\end{Definition}
For example, consider $S\graph{1}$ with edges $\eset{1} = \{e_1\}$. Then the edges of $S\graph{2}$ are $\eset{2} = \{(e_1, 0), (e_1, 1)\}$.
\begin{Definition}\label{def:ed-sub}
Let $\binom{S}{t}$ denote the set of subsets of $S$ with exactly $t$ edges. In a similar manner, $\binom{S}{\leq t}$ denotes the set of subsets of $S$ with $t$ or fewer edges.
\end{Definition}
The following function $f_k$ is a mapping from every $3$-edge shape in $\graph{k}$ to its `projection' in $\graph{1}$.
\begin{Definition}\label{def:fk}
Let $f_k: \binom{E_k}{3} \to \binom{E_1}{\leq3}$ be defined as follows. For any $S \in \binom{E_k}{3}$, such that $S = \pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)}$, define:
\[ f_k\left(\pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)}\right) = \pbrace{e_1, e_2, e_3}.\]
\end{Definition}
\AH{Just questioning if the notation is clear in the ~\cref{def:fk-inv}. For more details, see the immediately following todo note which is commented out.}
%\AH{I found ~\cref{def:fk-inv} a bit imprecise and bulky and have attempted to refine it.
%\par Since this an inverse function, the signature is reversed, \vari{but},
%\par...the challenge is in quantifying the size of the set (of 3 edge subsets) that is returned...
%\par...where the main observation is that for an input edge of size $r$, a set of size $\leq \binom{r\cdot k}{3}$ is returned...
%\par...but the catch is that for $r \geq 3$, the set will be strictly less than $\binom{r\cdot k}{3}$ since $f_k$ does not map e.g. an input $\{(e_a, b_1), (e_a, b_2), (e_a, b_3)\}$ (where $a$ is constant and $b_1, b_2, b_3 \in \{0,\ldots, k -1\}$) to more than one edge, \textit{and} it is the case for $r \geq 3$ that $f_k^{-1}$ will not map such an input to its input of size $r$, meaning we must subtract off all such subsets of $\binom{E_k}{3}$.
%\par My fix was to use a variable in the exponent and explain in prose. Perhaps there is a better, simpler notation/solution.}
\begin{Definition}[$f_k^{-1}$]\label{def:fk-inv}
The inverse function $f_k^{-1}: \binom{E_1}{\leq 3}\to \left\{\binom{E_k}{3}\right\}^{h}$ takes an arbitrary $\eset{1}$ of at most $3$ edges and outputs the set of all elements $s^{(k)} \in \binom{\eset{k}}{3}$ that $f_k$ maps to the input set $s^{(1)}$, i.e. $f_k(s^{(k)}) = s^{(1)}$. The set returned by $f_k^{-1}$ is of size $h$, where $h$ depends on $\abs{s^{(1)}}$, such that $h \leq \binom{\abs{s^{(1)}} \cdot k}{3}$.
\end{Definition}
Note, importantly, that for $f_k^{-1}$, although potentially counterintuitive, each \textit{edge} present in $s^{(1)}$ must have an edge in $s^{(k)}$ that `projects' down to it. \textit{Meaning}, if $|s^{(1)}| = 3$, then each $s^{(k)}$ must be a set $\{ (e_i, b_1), (e_j, b_2), (e_\ell, b_3) \}$ where $i, j, \ell$ are pairwise distinct and $b_1, b_2, b_3 \in \{0,\ldots, k-1\}$.
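To illustrate \cref{def:fk} and \cref{def:fk-inv} with a small example of our own, take $k = 2$ and two distinct edges $e_1, e_2 \in E_1$. Then
\[f_2\left(\pbrace{(e_1, 0), (e_1, 1), (e_2, 0)}\right) = \pbrace{e_1, e_2},\]
and $f_2^{-1}\left(\pbrace{e_1, e_2}\right)$ consists of all $\binom{4}{3} = 4$ three-edge subsets of $\pbrace{(e_1, 0), (e_1, 1), (e_2, 0), (e_2, 1)}$, since every such subset contains at least one copy of $e_1$ and at least one copy of $e_2$. In contrast, for three distinct edges, $f_2^{-1}\left(\pbrace{e_1, e_2, e_3}\right)$ has only $2^3 = 8$ elements, strictly fewer than $\binom{6}{3} = 20$, and $f_2^{-1}\left(\pbrace{e_1}\right)$ is empty because $\pbrace{(e_1, 0), (e_1, 1)}$ contains only two edges.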
\begin{Lemma}\label{lem:fk-func}
$f_k$ is a function.
\end{Lemma}
\begin{proof}[Proof of Lemma \ref{lem:fk-func}]
Note that $f_k$ is properly defined. For any $S \in \binom{E_k}{3}$, $|f_k(S)| \leq 3$, since any subset of $3$ edges in $E_k$ maps to at most $3$ edges in $\graph{1}$. Thus all images lie in the required range. Moreover, for any $b \in \{0,\ldots, k-1\}$, the edge $(e, b)$ maps to $e$ and to no other edge, so each $S$ has exactly one image, which implies that $f_k$ is a function.
\end{proof}
\qed
\subsection{Three Matchings in $\graph{2}$}
\AR{TODO for {\em later}: I think the proof will be much easier to follow with figures: just drawing out $S\times \{0,1\}$ along with the $(e_i,b_i)$ explicitly notated on the edges will make the proof much easier to follow.}
\begin{proof}[Proof of Lemma \ref{lem:3m-G2}]
For each edge pattern $S$, we count the number of $3$-matchings in the $3$-edge subgraphs of $\graph{2}$ in $f_2^{-1}(S)$. We start with $S \in \binom{E_1}{3}$, where $S$ is composed of the edges $e_1, e_2, e_3$ and $f_2^{-1}(S)$ is the set of all $3$-edge subsets of the set
\begin{equation*}
\{(e_1, 0), (e_1, 1), (e_2, 0), (e_2, 1), (e_3, 0), (e_3, 1)\}.
\end{equation*}
\begin{itemize}
\item $3$-matching ($\threedis$)
\end{itemize}
Consider the $\eset{1} = \threedis$ pattern. Note that edges in $\eset{2}$ are {\em not} disjoint only for the pairs $(e_i, 0), (e_i, 1)$ for $i\in \{1,2,3\}$. Hence, for every choice of $b_1, b_2, b_3 \in \{0, 1\}$, the subset $\pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)}$ is a $3$-matching. We have two possible choices for each edge $e_i$ in $\graph{1}$, yielding $2^3 = 8$ possible $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item Disjoint Two-Path ($\twopathdis$)
\end{itemize}
For $\eset{1} = \twopathdis$, edges $e_2, e_3$ form a $2$-path with $e_1$ being disjoint. This means that $(e_2, 0), (e_2, 1), (e_3, 0), (e_3, 1)$ form a $4$-path while $(e_1, 0), (e_1, 1)$ is its own disjoint $2$-path. We can pick only one of $(e_1, 0)$ and $(e_1, 1)$, and then we need to pick a $2$-matching from the copies of $e_2$ and $e_3$. Note that a $4$-path admits exactly $3$ $2$-matchings, specifically, $\pbrace{(e_2, 0), (e_3, 0)}, \pbrace{(e_2, 0), (e_3, 1)}, \pbrace{(e_2, 1), (e_3, 1)}$. Since these two selections can be made independently, there are $2 \cdot 3 = 6$ choices for $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item $3$-star ($\oneint$)
\end{itemize}
When $\eset{1} = \oneint$, the inner edges $(e_i, 1)$ of $\eset{2}$ are all connected, and the outer edges $(e_i, 0)$ are all disjoint. Note that a valid $3$-matching can contain at most one inner edge. When exactly one inner edge is chosen, there are $3$ such possibilities, each completed by the outer edges of the other two $e_j$. The remaining possible $3$-matching occurs when all $3$ outer edges are chosen. Thus, there are $3 + 1 = 4$ $3$-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item $3$-path ($\threepath$)
\end{itemize}
When $\eset{1} =\threepath$ it is the case that all edges beginning with $e_1$ and ending with $e_3$ are successively connected. This means that the edges of $\eset{2}$ form a $6$-path in the edges of $f_2^{-1}(S)$, where all edges from $(e_1, 0),\ldots,(e_3, 1)$ are successively connected. For a $3$-matching to exist, there must be at least one edge separating edges picked from a sequence. There are four such possibilities: $\pbrace{(e_1, 0), (e_2, 0), (e_3, 0)}, \pbrace{(e_1, 0), (e_2, 0), (e_3, 1)}, \pbrace{(e_1, 0), (e_2, 1), (e_3, 1)},$\newline $\pbrace{(e_1, 1), (e_2, 1), (e_3, 1)}$ . Thus, there are four possible 3-matchings in $f_2^{-1}(S)$.
\begin{itemize}
\item Triangle ($\tri$)
\end{itemize}
For $\eset{1} = \tri$, the edges in $\eset{2}$ are again connected successively, but this time in a cycle, so that $(e_1, 0)$ and $(e_3, 1)$ are also connected. While this is similar to the discussion of the $3$-path above, the first and last edges are no longer disjoint. This rules out both of the subsets $\pbrace{(e_1, 0), (e_2, 0), (e_3, 1)}$ and $\pbrace{(e_1, 0), (e_2, 1), (e_3, 1)}$, leaving us with the $2$ remaining edge combinations that produce a $3$-matching.
\begin{itemize}
\item $2$-matching ($\twodis$), $2$-path ($\twopath$), $1$ edge ($\ed$)
\end{itemize}
Let us also consider when $S \in \binom{E_1}{\leq 2}$. When $|S| = 2$, we can only pick one from each of two pairs, $\pbrace{(e_1, 0), (e_1, 1)}$ and $\pbrace{(e_2, 0), (e_2, 1)}$. This implies that a $3$-matching cannot exist in $f_2^{-1}(S)$. The same argument holds for $|S| = 1$, where we can only pick one edge from the pair $\pbrace{(e_1, 0), (e_1, 1)}$, thus no $3$-matching exists in $f_2^{-1}(S)$.
Observe that all of the arguments above focused solely on the shape/pattern of $S$. In other words, all $S$ of a given shape yield the same number of $3$-matchings, and this is why we get the required identity.
\end{proof}
\qed
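The identity of \cref{lem:3m-G2} can be sanity-checked on two small graphs (illustrative examples of our own). Recall the standard fact that a path with $n$ edges has $\binom{n - 2}{3}$ $3$-matchings. If $\graph{1}$ is a single $3$-path, then $\numocc{\graph{1}}{\threepath} = 1$ and the other four counts on the right-hand side are $0$; $\graph{2}$ is a $6$-path, and indeed
\[\numocc{\graph{2}}{\threedis} = \binom{6 - 2}{3} = 4 = 4 \cdot \numocc{\graph{1}}{\threepath}.\]
If $\graph{1}$ is a single triangle, then $\graph{2}$ is a $6$-cycle, whose only $3$-matchings are its two alternating edge sets, in agreement with $2 \cdot \numocc{\graph{1}}{\tri} = 2$.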
\subsection{Three matchings in $\graph{3}$}
\begin{proof}[Proof of Lemma \ref{lem:3m-G3}]
For any $S \in \binom{E_1}{\leq3}$, we again count the number of $3$-matchings in $f_3^{-1}(S)$.
\begin{itemize}
\item $1$ edge ($\ed$)
\end{itemize}
When $\eset{1} = \ed$, $f_3^{-1}(\eset{1})$ has one subset, $(e_1, 0), (e_1, 1), (e_1, 2)$, which clearly does not contain a $3$-matching. Thus there are no $3$-matchings in $f_3^{-1}(\eset{1})$ for this case.
\begin{itemize}
\item $2$-path ($\twopath$)
\end{itemize}
Now fix $\eset{1} = \twopath$. All edges in $\eset{3}$ form a $6$-path, and similar to the discussion in the proof of \cref{lem:3m-G2} (when $\eset{1} = \threepath$ in $\graph{2}$), this leads to $4$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item $2$-matching ($\twodis$)
\end{itemize}
For $\eset{1} = \twodis$, every edge $(e_i, b)$ of $\eset{3}$ is disjoint from every edge $(e_j, b')$ for $i \neq j\in \{1,2\}$ and $b, b' \in \{0, 1, 2\}$. Pick an arbitrary $e_i$ and note that $\pbrace{(e_i, 0), (e_i, 2)}$ is the only $2$-matching among the copies of $e_i$; it can be combined with any of the $3$ edges $(e_j, 0), (e_j, 1), (e_j, 2)$, again for $i \neq j$. Since there are $2$ choices of $e_i$ and the selections are independent, there exist $2 \cdot 3 = 6$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item Triangle ($\tri$)
\end{itemize}
Now, we consider the $3$-edge subgraphs of $\graph{1}$, starting with $\eset{1} = \tri$. As discussed in the proof of \cref{lem:3m-G2} for the case of $\tri$, the edges of $\eset{3}$ form a cyclic sequence, and we must be careful not to pair $(e_1, 0)$ with $(e_3, 2)$ in a $3$-matching. A set $s = \pbrace{(e_1, b_1), (e_2, b_2), (e_3, b_3)} \in f_3^{-1}(S)$ with $b_1, b_2, b_3 \in \{0, 1, 2\}$ is a $3$-matching exactly when, for all $i \in [3]$, $b_i = 2$ implies $b_{(i \bmod 3) + 1} \neq 0$. Iterating through all possible combinations, we have
\begin{itemize}
\item \textsc{$(e_1, 0)$}
\begin{itemize}
\item $\pbrace{(e_1, 0), (e_2, 0), (e_3, 0)}$
\item $\pbrace{(e_1, 0), (e_2, 0), (e_3, 1)}$
\item $\pbrace{(e_1, 0), (e_2, 1), (e_3, 0)}$
\item $\pbrace{(e_1, 0), (e_2, 1), (e_3, 1)}$
\item $\pbrace{(e_1, 0), (e_2, 2), (e_3, 1)}$
\end{itemize}
\item \textsc{$(e_1, 1)$}
\begin{itemize}
\item $\pbrace{(e_1, 1), (e_2, 0), (e_3, 0)}, \ldots\pbrace{(e_1, 1), (e_2, 1), (e_3, 2)}$
\item $\pbrace{(e_1, 1), (e_2, 2), (e_3, 1)}$
\item $\pbrace{(e_1, 1), (e_2, 2), (e_3, 2)}$
\end{itemize}
\item \textsc{$(e_1, 2)$}
\begin{itemize}
\item $\pbrace{(e_1, 2), (e_2, 1), (e_3, 0)}$
\item $\pbrace{(e_1, 2), (e_2, 1), (e_3, 1)}$
\item $\pbrace{(e_1, 2), (e_2, 1), (e_3, 2)}$
\item $\pbrace{(e_1, 2), (e_2, 2), (e_3, 1)}$
\item $\pbrace{(e_1, 2), (e_2, 2), (e_3, 2)}$
\end{itemize}
\end{itemize}
for a total of 18 3-matchings in $f_3^{-1}(\eset{1})$.
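The count of $18$ can be cross-checked by complementary counting (a short sketch of our own). There are $3^3 = 27$ choices of $(b_1, b_2, b_3) \in \{0, 1, 2\}^3$. For each $i \in [3]$, the forbidden event $b_i = 2$, $b_{(i \bmod 3) + 1} = 0$ fixes two coordinates and leaves the third free, so it excludes $3$ choices. No two of the three forbidden events can hold simultaneously, since that would force some $b_j$ to equal both $0$ and $2$. Hence the number of valid choices is
\[27 - 3 \cdot 3 = 18.\]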
\begin{itemize}
\item $3$-path ($\threepath$)
\end{itemize}
Consider when $\eset{1} = \threepath$ and all edges in $\eset{3}$ are successively connected to form a $9$-path. Since $(e_1, 0)$ is disjoint to $(e_3, 2)$, both of these edges can exist in a $3$-matching. This relaxation yields 3 other 3-matchings that couldn't be counted in the case of the $\eset{1} = \tri$, namely
\begin{equation*}
\pbrace{(e_1, 0), (e_2, 0), (e_3, 2)},\pbrace{(e_1, 0), (e_2, 1), (e_3, 2)}, \pbrace{(e_1, 0), (e_2, 2), (e_3, 2)}.
\end{equation*}
There are therefore $18 + 3 = 21$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item Disjoint Two-Path ($\twopathdis$)
\end{itemize}
Assume $\eset{1} = \twopathdis$, then the edges of $\eset{3}$ have successive connectivity from $(e_1, 0)$ through $(e_1, 2)$, and successive connectivity from $(e_2, 0)$ through $(e_3, 2)$. It is the case that the edges in $\eset{3}$ form a 6-path with a disjoint 3-path. There exist $8$ distinct two matchings (with at least one $(e_2,\cdot)$ and at least one $(e_3,\cdot)$ edge) in the $6$-path $(e_2, 0),\ldots, (e_3, 2)$ of the form
\begin{equation*}
\pbrace{(e_2, 0), (e_3, 0)},\ldots, \pbrace{(e_2, 1), (e_3, 2)}, \pbrace{(e_2, 2), (e_3, 1)}, \pbrace{(e_2, 2), (e_3, 2)}.
\end{equation*}
These matchings can be paired independently with any of the $3$ remaining edges $(e_1, b)$, for a total of $8 \cdot 3 = 24$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item $3$-star ($\oneint$)
\end{itemize}
Given $\eset{1} = \oneint$, the edges of $\eset{3}$ are restricted such that the outer edges $(e_i, 0)$ are disjoint from one another, the middle edges $(e_i, 1)$ are also disjoint from each other, and only the inner edges $(e_i, 2)$ intersect with one another at exactly one common endpoint. To be precise, any outer edge $(e_i, 0)$ is disjoint from every middle edge $(e_j, 1)$ for $i \neq j$. As previously mentioned in the proof of \cref{lem:3m-G2}, at most one inner edge may appear in a $3$-matching. For an arbitrary inner edge $(e_i, 2)$, we have $2 \cdot 2 = 4$ combinations of the middle and outer edges of $e_j, e_k$, where $i, j, k$ are pairwise distinct. Since there are $3$ choices for the inner edge, this gives $4 \cdot 3 = 12$ $3$-matchings. We are not done yet, as we need to consider the combinations that use only middle and outer edges. Notice that for each $e_i$, we have $2$ choices, i.e. a middle or outer edge, contributing $2^3 = 8$ additional $3$-matchings, for a total of $8 + 12 = 20$ $3$-matchings in $f_3^{-1}(\eset{1})$.
\begin{itemize}
\item $3$-matching ($\threedis$)
\end{itemize}
Given $\eset{1} = \threedis$ subgraph, we have the case that all edges in $\eset{3}$ have the property that $(e_i, b)$ is disjoint to $(e_j, b)$ for $i \neq j$. For each $e_i$, there are then $3$ choices, independent of each other, and it results that there are $3^3 = 27$ 3-matchings in $f_3^{-1}(\eset{1})$.
All of the observations above focused only on the shape of $S$, and since we see that for fixed $S$, we have a fixed number of $3$-matchings, this implies the identity.
\end{proof}
\qed
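As with $\graph{2}$, the identity of \cref{lem:3m-G3} can be sanity-checked on small graphs (illustrative examples of our own), using the standard facts that a path with $n$ edges has $\binom{n - 2}{3}$ $3$-matchings and a cycle with $n$ edges has $\frac{n}{n - 3}\binom{n - 3}{3}$ $3$-matchings. If $\graph{1}$ is a single $3$-path, then $\numocc{\graph{1}}{\twopath} = 2$, $\numocc{\graph{1}}{\twodis} = 1$, $\numocc{\graph{1}}{\threepath} = 1$, and the remaining counts on the right-hand side are $0$; $\graph{3}$ is a $9$-path, and
\[\numocc{\graph{3}}{\threedis} = \binom{7}{3} = 35 = 4 \cdot 2 + 6 \cdot 1 + 21 \cdot 1.\]
If $\graph{1}$ is a single triangle, then $\numocc{\graph{1}}{\twopath} = 3$ and $\numocc{\graph{1}}{\tri} = 1$, $\graph{3}$ is a $9$-cycle, and
\[\numocc{\graph{3}}{\threedis} = \frac{9}{6}\binom{6}{3} = 30 = 4 \cdot 3 + 18 \cdot 1.\]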
\subsection{Three Paths}
Computing the number of 3-paths in $\graph{2}$ and $\graph{3}$ consists of much simpler linear combinations.
\subsubsection{$\graph{2}$}
\begin{proof}[Proof of Lemma \ref{lem:3p-G2}]
For $\mathcal{P} \subseteq \eset{2}$ such that $\mathcal{P}$ is a $3$-path, it \textit{must} be the case by definition of $f_2$ that all edges in $f_2(\mathcal{P})$ have at least one mapping from an edge in $\mathcal{P}$ (and recall that $\mathcal{P}$ is connected). Moreover, the middle edge $(e, b)$ of $\mathcal{P}$ has one endpoint that is incident only to the two copies of $e$, so $\mathcal{P}$ must contain both $(e, 0)$ and $(e, 1)$ and hence $|f_2(\mathcal{P})| \leq 2$. This constraint rules out every pattern $\eset{1}$ consisting of $3$ edges, as well as when $\eset{1} = \twodis$. For $\eset{1} = \ed$, note that $\eset{1}$ doesn't have enough edges to have any output in $f_2^{-1}(\eset{1})$, i.e., there exists no $s \in \binom{E_2}{3}$ such that $f_2(s) = \eset{1}$. The only surviving pattern is $\eset{1} = \twopath$, where the edges of $\eset{2}$ have successive connectivity from $(e_1, 0)$ to $(e_2, 1)$. There are then $2$ $3$-paths sharing edges $e_1$ and $e_2$ in $f_2^{-1}(\eset{1})$, namely $\pbrace{(e_1, 0), (e_1, 1), (e_2, 0)} \text{ and }\pbrace{(e_1, 1), (e_2, 0), (e_2, 1)}$.
\end{proof}
\qed
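As a quick check of \cref{lem:3p-G2} (illustrative examples of our own): if $\graph{1}$ is a single $3$-path, then $\numocc{\graph{1}}{\twopath} = 2$ and $\graph{2}$ is a $6$-path, which contains $6 - 2 = 4$ windows of three consecutive edges, so
\[\numocc{\graph{2}}{\threepath} = 4 = 2 \cdot 2.\]
If $\graph{1}$ is a single triangle, then $\numocc{\graph{1}}{\twopath} = 3$ and $\graph{2}$ is a $6$-cycle, which contains one $3$-path starting at each of its $6$ edges, so $\numocc{\graph{2}}{\threepath} = 6 = 2 \cdot 3$.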
\subsubsection{$\graph{3}$}
\begin{proof}[Proof of Lemma \ref{lem:3p-G3}]
The argument follows along the same lines as in the proof of \cref{lem:3p-G2}. Given a $3$-path $\mathcal{P} \subseteq \eset{3}$, it \textit{must} be that every edge in $f_3(\mathcal{P})$ has at least one edge in $\mathcal{P}$ mapped to it (and $\mathcal{P}$ is connected). Notice again that this cannot be the case for any $\eset{1} \in \binom{E_1}{3}$, nor is it the case when $\eset{1} = \twodis$. This leaves us with two patterns, $\eset{1} = \twopath$ and $\eset{1} = \ed$. For the former, we have $2$ $3$-paths across $e_1$ and $e_2$, namely $\pbrace{(e_1, 1), (e_1, 2), (e_2, 0)}$ and $\pbrace{(e_1, 2), (e_2, 0), (e_2, 1)}$. For the latter pattern $\ed$, it is trivial to see that an edge in $\graph{1}$ becomes a $3$-path in $\graph{3}$, and this proves the identity.
\end{proof}
\qed
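As a quick check of \cref{lem:3p-G3} (illustrative examples of our own): if $\graph{1}$ is a single $3$-path, then $\numocc{\graph{1}}{\ed} = 3$, $\numocc{\graph{1}}{\twopath} = 2$, and $\graph{3}$ is a $9$-path, which contains $9 - 2 = 7$ windows of three consecutive edges, so
\[\numocc{\graph{3}}{\threepath} = 7 = 3 + 2 \cdot 2.\]
If $\graph{1}$ is a single triangle, then $\numocc{\graph{1}}{\ed} = 3$, $\numocc{\graph{1}}{\twopath} = 3$, and $\graph{3}$ is a $9$-cycle, which contains one $3$-path starting at each of its $9$ edges, so $\numocc{\graph{3}}{\threepath} = 9 = 3 + 2 \cdot 3$.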
\subsection{Triangle}
\begin{proof}[Proof of Lemma \ref{lem:tri}]
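The number of triangles in $\graph{k}$ for $k \geq 2$ will always be $0$: every cycle in $\graph{k}$ arises from a cycle of $\graph{1}$, which has at least three edges, by replacing each of its edges with a $k$-path, so all cycles in $\graph{k}$ have at least $3k \geq 6$ edges.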
\end{proof}
\qed