Partial pass on S.3 with notes.

master
Aaron Huber 2021-08-25 12:36:08 -04:00
parent e48bbb27c6
commit 53581b2e40
3 changed files with 10 additions and 7 deletions


@ -8,7 +8,7 @@ In this section, we will prove that computing $\expct\limits_{\vct{W} \sim \pd}\
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Preliminaries}
\AH{One complaint from reviewers is that the term \emph{subgraph} is not precise enough.}
Our hardness results are based on (exactly) counting the number of occurrences of a subgraph $H$ in $G$. Let $\numocc{G}{H}$ denote the number of occurrences of $H$ in graph $G$. We can think of $H$ as being of constant size and $G$ as growing. %In query processing, $H$ can be viewed as the query while $G$ as the database instance.
In particular, we will consider the problems of computing the following counts (given $G$ as an input and its adjacency list representation): $\numocc{G}{\tri}$ (the number of triangles), $\numocc{G}{\threedis}$ (the number of $3$-matchings), and the latter's generalization $\numocc{G}{\kmatch}$ (the number of $k$-matchings). Our hardness result in \Cref{sec:multiple-p} is based on the following result:
@ -49,15 +49,17 @@ Our hardness results only need a \ti instance; We also consider the special case
even hold for the expression trees. %this polynomial can be encoded in an expression tree of size $\Theta(km)$.
\AH{Dangling pointers need to be fixed.}
\noindent Returning to \Cref{fig:ex-shipping-simp}, it is easy to see that $\poly_{G}^\kElem(\vct{X})$ generalizes our running example query:
\resizebox{1\linewidth}{!}{
\begin{minipage}{1.05\linewidth}
\[\poly^k_G\dlImp OnTime(C_1),Route(C_1, C_1'),OnTime(C_1'),\dots,OnTime(C_\kElem),Route(C_\kElem,C_\kElem'),OnTime(C_\kElem')\]
\end{minipage}
}
where, adapting the PDB instance in \Cref{fig:ex-shipping-simp}, relation $OnTime$ has $n$ tuples, one for each vertex in $\vset=[n]$, each with probability $\prob$, and $Route(\text{City}_1, \text{City}_2)$ has tuples corresponding to the edges $\edgeSet$ (each with probability $1$).\footnote{Technically, $\poly_{G}^\kElem(\vct{X})$ should have variables corresponding to tuples in $Route$ as well, but since they are always present with probability $1$, we drop those. Our argument also works when all the tuples in $Route$ are present with probability $\prob$, but to simplify notation we assign probability $1$ to edges.}
Note that this implies that our hard query polynomial can be represented as an expression tree produced by a project-join query with the same probability value for each input tuple $\prob_i$.
where, adapting the PDB instance in \Cref{fig:ex-shipping-simp}, relation $OnTime$ has $n$ tuples, one for each vertex in $\vset=[n]$, each with probability $\prob$, and $Route(\text{City}_1, \text{City}_2)$ has tuples corresponding to the edges $\edgeSet$ (each with probability $1$).\AH{This footnote is probably unnecessary now since we changed the example.}\footnote{Technically, $\poly_{G}^\kElem(\vct{X})$ should have variables corresponding to tuples in $Route$ as well, but since they are always present with probability $1$, we drop those. Our argument also works when all the tuples in $Route$ are present with probability $\prob$, but to simplify notation we assign probability $1$ to edges.}
Note that this implies that our hard query polynomial can be represented as an expression tree produced by a project-join query with the same probability value for each input tuple $\prob_i$.
\AH{The above discussion from \cref{def:qk} seems to be a bit ambiguous. I'm not sure it's entirely accurate to end with $\prob_i$.}
\subsection{Multiple Distinct $\prob$ Values}
\label{sec:multiple-p}


@ -95,7 +95,7 @@ The circuit \circuit in \Cref{fig:circuit-express-tree} encodes the polynomial $
The semantics of circuits follows the obvious interpretation. We next define its relationship with polynomials formally:
\begin{Definition}[$\polyf(\cdot)$]\label{def:poly-func}
Denote $\polyf(\circuit)$ to be the function from circuit $\circuit$ to its corresponding polynomial. $\polyf(\cdot)$ is recursively defined on $\circuit$ as follows, with addition and multiplication following the standard interpretation for polynomials:
Denote $\polyf(\circuit)$ to be the function from circuit $\circuit$ to its corresponding polynomial (in \abbrSMB).\footnote{Recall our assumption that unless otherwise mentioned, all polynomials are considered in $\abbrSMB$.} $\polyf(\cdot)$ is recursively defined on $\circuit$ as follows, with addition and multiplication following the standard interpretation for polynomials:
\begin{equation*}
\polyf(\circuit) = \begin{cases}
\polyf(\circuit_\lchild) + \polyf(\circuit_\rchild) &\text{ if \circuit.\type } = \circplus\\
@ -105,7 +105,7 @@ Denote $\polyf(\circuit)$ to be the function from circuit $\circuit$ to its corr
\end{equation*}
\end{Definition}
Note that $\circuit$ need not encode an expression in SMB. For instance, $\circuit$ could represent a compressed form of the running example, such as $(X + 2Y)(2X - Y)$, as shown in \Cref{fig:circuit}, while $\polyf(\circuit) = 2X^2+3XY-2Y^2$.\footnote{As stated previously, unless otherwise mentioned all polynomials are considered in the $\abbrSMB$ representation, and this implies that the output of $\polyf\inparen{\cdot}$ is indeed $\abbrSMB$.}
Note that $\circuit$ need not encode an expression in SMB. For instance, $\circuit$ could represent a compressed form of the running example, such as $(X + 2Y)(2X - Y)$, as shown in \Cref{fig:circuit}, while $\polyf(\circuit) = 2X^2+3XY-2Y^2$.
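% Illustrative worked expansion; not part of the original text.
As a quick check of this example, the expansion can be verified term by term using only distributivity:
\begin{align*}
(X + 2Y)(2X - Y) &= 2X^2 - XY + 4XY - 2Y^2\\
&= 2X^2 + 3XY - 2Y^2.
\end{align*}
The factored form is a product of two binomials, while the expanded form lists three monomials.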
\begin{Definition}[Circuit Set]\label{def:circuit-set}
$\circuitset{\polyX}$ is the set of all possible circuits $\circuit$ such that $\polyf(\circuit) = \polyX$.\footnote{Again, the representation of $\polyX$ is $\abbrSMB$.}


@ -17,7 +17,7 @@ Fix $\prob\in (0,1)$. Then assuming \Cref{conj:graph} is true, any algorithm tha
%\end{proof}
The above shows the hardness for a very specific query polynomial but it is easy to come up with an infinite family of hard query polynomials by `embedding' $\rpoly_{G}^3$ into an infinite family of trivial query polynomials.
Unlike \Cref{thm:mult-p-hard-result} the above result does not show that computing $\rpoly_{G}^3(\prob,\dots,\prob)$ for a fixed $\prob\in (0,1)$ is \sharpwonehard.
However, in \Cref{sec:algo} we show that if we are willing to compute an approximation that this problem (and indeed solving our problem for a much more general setting) is in linear time.
However, in \Cref{sec:algo} we show that if we are willing to compute an approximation, then this problem (and indeed our problem in a much more general setting) can be solved in linear time.
%\AH{@atri needs to put in the result for triangles of $\numvar^{\frac{4}{3}}$ runtime.}
We will prove the above result by the following reduction:
@ -54,6 +54,7 @@ Fix $\prob\in (0,1)$. Given $\rpoly_{\graph{\ell}}^3(\prob,\dots,\prob)$ for $\e
allowing us to compute $\numocc{G}{\tri}$ and $\numocc{G}{\threedis}$ in $O(1)$ time.
\end{Lemma}
\AH{Corollary needs refinement.}
\begin{Corollary}
The lower bounds of \cref{thm:mult-p-hard-result} and \cref{th:single-p-hard} hold with respect to $\timeOf{\abbrStepOne}$.
\end{Corollary}