More touch up on the 2-col format.

2020-12-08 15:45:41 -05:00 · d21244a4e7
parent 0b397d728d
commit d21244a4e7
4 changed files with 29 additions and 21 deletions
--- a/approx_alg.tex
+++ b/approx_alg.tex
@@ -340,8 +340,8 @@ Consider when $\etree$ encodes the expression $(x_1 + x_2)(x_1 - x_2) + x_2^2$.

 \begin{figure}[h!]
 	\begin{tikzpicture}[thick, every tree node/.style={default_node, thick, draw=black, black, circle, text width=0.3cm, font=\bfseries, minimum size=0.65cm}, every child/.style={black}, edge from parent/.style={draw, thick},
-level 1/.style={sibling distance=1.25cm},
-level 2/.style={sibling distance=1.0cm},
+level 1/.style={sibling distance=0.95cm},
+level 2/.style={sibling distance=0.7cm},
 %level 2+/.style={sibling distance=0.625cm}
 %level distance = 1.25cm,
 %sibling distance = 1cm,
--- a/intro.tex
+++ b/intro.tex
@@ -74,7 +74,7 @@ Assume a set semantics setting.  Suppose we are given a Tuple Independent Databa
 This query is hard in set semantics because of correlations in the lineage formula, but under bag semantics with a polynomial formula representing the multiple contributing tuples from the input set $\ti$, it is easy since we enjoy linearity of expectation.
 \end{Example}

-Our work also handles Block Independent Disjoint Databases($\bi$), a PDB model in which tuples are arranged in blocks, where all blocks are independent from one another, but tuples within the same block are mutually exclusive.  For now, let us consider the $\ti$ model.  In the example, consider a fixed probability for all tuples.
+Our work also handles Block Independent Disjoint Databases ($\bi$), a PDB model in which tuples are arranged in blocks, where all blocks are independent from one another, but tuples within the same block are mutually exclusive.  For now, let us consider the $\ti$ model.  In the example, consider a fixed probability for all tuples.
 Note that computing the probability of the query of ~\cref{ex:intro} in set semantics is indeed \#-P hard, since it is a query that is non-hierarchical
 %, i.e., for $Vars(\poly)$ denoting the set of variables occuring across all atoms of $\poly$, a function $sg(x)$ whose output is the set of all atoms that contain variable $x$, we have that $sg(A) \cap sg(B) \neq \emptyset$ and $sg(A)\not\subseteq sg(B)$ and $sg(B)\not\subseteq sg(A)$,
 as defined by Dalvi and Suciu in ~\cite{10.1145/1265530.1265571}.  For the purposes of this work, we define hard to be anything greater than linear time.  %Thus, computing $\expct\pbox{\poly(W_a, W_b, W_c)}$, i.e. the probability of the output with annotation $\poly(W_a, W_b, W_c)$, ($\prob(q)$ in Dalvi, Sucui) is hard in set semantics.
--- a/lin_sys.tex
+++ b/lin_sys.tex
@@ -29,8 +29,8 @@ Equation ~\ref{eq:ls-2-1} follows by \cref{lem:tri}.  Similarly ~\cref{eq:ls-2-2

 Now, by simple algebraic manipulations of ~\cref{lem:qE3-exp}, we deduce,
 \begin{align}
-&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob - \numocc{\graph{2}}{\oneint}\prob\nonumber\\
-& - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2 \nonumber\\
+&\frac{\rpoly_{\graph{2}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{2}}{\ed}}{6\prob} - \numocc{\graph{2}}{\twopath} - \numocc{\graph{2}}{\twodis}\prob \nonumber\\
+&- \numocc{\graph{2}}{\oneint}\prob - \big(\numocc{\graph{2}}{\twopathdis} + 3\numocc{\graph{2}}{\threedis}\big)\prob^2 \nonumber\\
 &=\left(-2\cdot\numocc{\graph{1}}{\tri} - 4\cdot\numocc{\graph{1}}{\threepath}\right.\nonumber\\
 &\left. - 8\cdot\numocc{\graph{1}}{\threedis} - 6\cdot\numocc{\graph{1}}{\twopathdis}\right)\cdot\left(3\prob^2 - p^3\right) + 2\cdot\numocc{\graph{1}}{\twopath}\prob\nonumber\\
 &- 4\cdot\numocc{\graph{1}}{\oneint}\cdot\left(3\prob^2 - \prob^3\right)\label{eq:lem3-G2-1}\\
@@ -103,17 +103,18 @@ Following the same reasoning for $\graph{3}$, using \cref{lem:3m-G3}, \cref{lem:

 Looking at ~\cref{eq:LS-subtract}, 
 \begin{align}
-&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob \nonumber\\
-&- \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2\nonumber\\
+&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob \nonumber\\
+& - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2\nonumber\\
 &= \left\{ -18\numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} - 24 \cdot \numocc{\graph{1}}{\twopathdis}\right. \nonumber\\
 &\left.- 27 \cdot \numocc{\graph{1}}{\threedis}\right\}\left(3\prob^2 - \prob^3\right) \nonumber\\
 &+ \pbrace{-20 \cdot \numocc{\graph{1}}{\oneint} - 4\cdot \numocc{\graph{1}}{\twopath} - 6 \cdot \numocc{\graph{1}}{\twodis}}\left(3\prob^2 - \prob^3\right)\nonumber\\
 &+ \numocc{\graph{1}}{\ed}\prob + 2 \cdot \numocc{\graph{1}}{\twopath}\prob. \label{eq:lem3-G3-2}\\
-&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob\nonumber\\
-&- \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \left(\numocc{\graph{1}}{\ed} + \numocc{\graph{1}}{\twopath}\right)\prob\nonumber\\
-&+ \left(24\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right) + 20\cdot\numocc{\graph{1}}{\oneint} + 4\cdot\numocc{\graph{1}}{\twopath}\right.\nonumber\\
-&\left.+ 6\cdot\numocc{\graph{1}}{\twodis}\right)\left(3\prob^2 - \prob^3\right)\nonumber\\
-&= \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\left(3p^2 - p^3\right)\label{eq:lem3-G3-3} 
+&\frac{\rpoly_{\graph{3}}(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob \nonumber\\
+&- \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \left(\numocc{\graph{1}}{\ed}\right.\nonumber\\
+&\left.+ \numocc{\graph{1}}{\twopath}\right)\prob+ \left(24\left(\numocc{\graph{1}}{\twopathdis} + 3\cdot\numocc{\graph{1}}{\threedis}\right) \right.\nonumber\\
+&\left.+ 20\cdot\numocc{\graph{1}}{\oneint} + 4\cdot\numocc{\graph{1}}{\twopath}+ 6\cdot\numocc{\graph{1}}{\twodis}\right)\left(3\prob^2 - \prob^3\right)\nonumber\\
+&= \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\nonumber\\
+&\cdot\left(3p^2 - p^3\right)\label{eq:lem3-G3-3} 
 \end{align}

 Equation ~\ref{eq:lem3-G3-2} follows from substituting ~\cref{eq:lem3-G3-2} in for the RHS of ~\cref{eq:LS-subtract}.  We derive ~\cref{eq:lem3-G3-3} by adding the inverse of all $O(\numedge)$ computable terms, and for the case of $\twopathdis$ and $\threedis$, we add the $O(\numedge)$ computable term $24\cdot\left(\numocc{\graph{1}}{\twopathdis} + \numocc{\graph{1}}{\threedis}\right)$ to both sides.
@@ -122,9 +123,10 @@ Equation \ref{eq:LS-G3-sub} follows from simple substitution of all lemma identi

 It then follows that
 %Removing $O(\numedge)$ computable terms to the other side of \cref{eq:LS-subtract}, we get
-\begin{equation}
-\mtrix{\rpoly_{G}}[3] = \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\left(3p^2 - p^3\right)\label{eq:LS-G3'} 
-\end{equation}
+\begin{align}
+&\mtrix{\rpoly_{G}}[3] = \pbrace{- 18 \cdot \numocc{\graph{1}}{\tri} - 21 \cdot \numocc{\graph{1}}{\threepath} + 45 \cdot \numocc{\graph{1}}{\threedis}}\nonumber\\
+&\cdot\left(3p^2 - p^3\right)\label{eq:LS-G3'} 
+\end{align}
 and

 %The same justification for the derivation of $\linsys{2}$ applies to the derivation above of $\linsys{3}$.  To arrive at ~\cref{eq:LS-G3'}, we move $O(\numedge)$ computable terms to the left hand side.  For the term $-24\cdot\numocc{\graph{1}}{\twopathdis}$ we need to add the inverse to both sides AND $72\cdot\numocc{\graph{1}}{\threedis}$ to both sides, in order to satisfy the constraint of $\cref{eq:2pd-3d}$.
@@ -132,10 +134,11 @@ and
 %For the LHS we get

 \begin{align*}
-&\vct{b}[3] = \frac{\rpoly(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob - \numocc{\graph{3}}{\oneint}\prob \nonumber\\
-&- \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2 - \pbrace{\numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}}\prob\\
-&+ \left\{24 \cdot \left(\numocc{\graph{1}}{\twopathdis} + 3\numocc{\graph{1}}{\threedis}\right) + 20 \cdot \numocc{\graph{1}}{\oneint} + 4\cdot \numocc{\graph{1}}{\twopath} \right.\nonumber\\
-&\left.+ 6 \cdot \numocc{\graph{1}}{\twodis}\right\}\left(3\prob^2 - \prob^3\right) 
+&\vct{b}[3] = \frac{\rpoly(\prob,\ldots, \prob)}{6\prob^3} - \frac{\numocc{\graph{3}}{\ed}}{6\prob} - \numocc{\graph{3}}{\twopath} - \numocc{\graph{3}}{\twodis}\prob \nonumber\\
+& - \numocc{\graph{3}}{\oneint}\prob - \big(\numocc{\graph{3}}{\twopathdis} + 3\numocc{\graph{3}}{\threedis}\big)\prob^2\\
+& - \pbrace{\numocc{\graph{1}}{\ed} + 2 \cdot \numocc{\graph{1}}{\twopath}}\prob + \left\{24 \cdot \left(\numocc{\graph{1}}{\twopathdis} \right.  \right.\nonumber\\
+&\left.\left.+ 3\numocc{\graph{1}}{\threedis}\right) + 20 \cdot \numocc{\graph{1}}{\oneint} + 4\cdot \numocc{\graph{1}}{\twopath}\right.\\
+&\left.+ 6 \cdot \numocc{\graph{1}}{\twodis}\right\}\cdot\left(3\prob^2 - \prob^3\right) 
 \end{align*}

 We now have a linear system consisting of three linear combinations, for $\graph{1}, \graph{2}, \graph{3}$ in terms of $\graph{1}$.  Note that the constants for $\graph{1}$ follow the RHS of ~\cref{eq:LS-subtract}.  To make it easier, use the following variable representations: $x = \numocc{\graph{1}}{\tri}, y = \numocc{\graph{1}}{\threepath}, z = \numocc{\graph{1}}{\threedis}$.  Using $\linsys{2}$ and $\linsys{3}$, the following matrix is obtained,
--- a/single_p.tex
+++ b/single_p.tex
@@ -155,9 +155,10 @@ Note that $f_k$ is properly defined.  For any $S \in \binom{E_k}{3}$, $|f(S)| \l
 \end{proof}
 \qed   

+\AR{TODO for {\em later}: I think the proof will be much easier to follow with figures: just drawing out $S\times \{0,1\}$ along with the $(e_i,b_i)$ explicity notated on the edges will make the proof much easier to follow.}

 \subsubsection{Three Matchings in $\graph{2}$}
-\AR{TODO for {\em later}: I think the proof will be much easier to follow with figures: just drawing out $S\times \{0,1\}$ along with the $(e_i,b_i)$ explicity notated on the edges will make the proof much easier to follow.}
+
 \begin{proof}[Proof of Lemma \ref{lem:3m-G2}]
 For each edge pattern $S$, we count the number of $3$-matchings in the $3$-edge subgraphs of $\graph{2}$ in $f_2^{-1}(S)$.  We start with $S \in \binom{E_1}{3}$, where $S$ is composed of the edges $e_1, e_2, e_3$ and $f_2^{-1}(S)$ is the set of all $3$-edge subsets of the set 
 \begin{equation*}
@@ -172,7 +173,11 @@ Consider the $\eset{1} = \threedis$ pattern.  Note that edges in $\eset{2}$ are
 \begin{itemize}
 	\item Disjoint Two-Path ($\twopathdis$)
 \end{itemize}
-For $\eset{1} = \twopathdis$ edges $e_2, e_3$ form a $2$-path with $e_1$ being disjoint.  This means that $(e_2, 0), (e_2, 1), (e_3, 0), (e_3, 1)$ form a $4$-path while $(e_1, 0), (e_1, 1)$ is its own disjoint $2$-path.  We can only pick either $(e_1, 0)$ or $(e_1, 1)$ from $f_2^{-1}(S)$, and then we need to pick a $2$-matching from $e_2$ and $e_3$.  Note that a four path allows there to be 3 possible 2 matchings, specifically, $\pbrace{(e_2, 0), (e_3, 0)}, \pbrace{(e_2, 0), (e_3, 1)}, \pbrace{(e_2, 1), (e_3, 1)}$.  Since these two selections can be made independently, there are $2 \cdot 3 = 6$ choices for $3$-matchings in $f_2^{-1}(S)$.
+For $\eset{1} = \twopathdis$ edges $e_2, e_3$ form a $2$-path with $e_1$ being disjoint.  This means that $(e_2, 0), (e_2, 1), (e_3, 0), (e_3, 1)$ form a $4$-path while $(e_1, 0), (e_1, 1)$ is its own disjoint $2$-path.  We can only pick either $(e_1, 0)$ or $(e_1, 1)$ from $f_2^{-1}(S)$, and then we need to pick a $2$-matching from $e_2$ and $e_3$.  Note that a four path allows there to be 3 possible 2 matchings, specifically, 
+\begin{equation*}
+\pbrace{(e_2, 0), (e_3, 0)}, \pbrace{(e_2, 0), (e_3, 1)}, \pbrace{(e_2, 1), (e_3, 1)}.
+\end{equation*}
+Since these two selections can be made independently, there are $2 \cdot 3 = 6$ choices for $3$-matchings in $f_2^{-1}(S)$.

 \begin{itemize}
 	\item $3$-star ($\oneint$)