Done with S3 pass

2021-09-18 01:05:31 -04:00 · 2021-09-18 01:05:31 -04:00 · e5ca8a4b24
parent 985c12ecd9
commit e5ca8a4b24
2 changed files with 13 additions and 9 deletions
--- a/mult_distinct_p.tex
+++ b/mult_distinct_p.tex
@@ -4,23 +4,26 @@
 \label{sec:hard}

 In this section, we will prove the hardness results claimed in Table~\ref{tab:lbs} for a specific (family of) hard instances $(\query,\pdb)$ for \Cref{prob:bag-pdb-poly-expected}, where $\pdb$ is a \abbrTIDB.
- Note that this implies hardness for \bis and general \abbrBPDB, answering \Cref{prob:bag-pdb-poly-expected} (and hence the equivalent \Cref{prob:bag-pdb-query-eval}) in the negative. 
+ Note that this implies hardness for \bis and general \abbrBPDB, answering \Cref{prob:bag-pdb-poly-expected} 
+%(and hence the equivalent \Cref{prob:bag-pdb-query-eval})
+ in the negative. 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Preliminaries}\label{sec:hard:sub:pre}
 Our hardness results are based on (exactly) counting the number of (not necessarily induced) subgraphs in $G$ isomorphic to $H$. Let $\numocc{G}{H}$ denote this quantity.  We can think of $H$ as being of constant size and $G$ as growing.  
-In particular, we will consider the problems of computing the following counts (given $G$ in its adjacency list representation): $\numocc{G}{\tri}$ (the number of triangles), $\numocc{G}{\threedis}$ (the number of $3$-matchings), and the latter's generalization $\numocc{G}{\kmatch}$ (the number of $k$-matchings).  We use $\kmatchtime$ to denote the optimal runtime of computing $\numocc{G}{\kmatch}$.  Our hardness results in \Cref{sec:multiple-p} are based on the following hardness results/conjectures:
+In particular, we will consider the problems of computing the following counts (given $G$ in its adjacency list representation): $\numocc{G}{\tri}$ (the number of triangles), $\numocc{G}{\threedis}$ (the number of $3$-matchings), and the latter's generalization $\numocc{G}{\kmatch}$ (the number of $k$-matchings).  We use $\kmatchtime$ to denote the optimal runtime of computing $\numocc{G}{\kmatch}$ exactly.  Our hardness results in \Cref{sec:multiple-p} are based on the following hardness results/conjectures:

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{Theorem}[\cite{k-match}]
 \label{thm:k-match-hard}
-Given positive integer $k$ and undirected graph $G=(\vset,\edgeSet)$ with no self-loops or parallel edges, the time $\kmatchtime$ to compute $\numocc{G}{\kmatch}$ exactly is $\littleomega{f(k)\cdot |\edgeSet|^c}$ for any function $f$ and fixed constant $c$ independent of $\numedge$ and $k$ (assuming $\sharpwzero\ne\sharpwone$. 
+Given positive integer $k$ and undirected graph $G=(\vset,\edgeSet)$ with no self-loops or parallel edges,  $\kmatchtime\ge \littleomega{f(k)\cdot |\edgeSet|^c}$ for any function $f$ and fixed constant $c$ independent of $\numedge$ and $k$ (assuming $\sharpwzero\ne\sharpwone$). 
 \end{Theorem}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{hypo}\label{conj:known-algo-kmatch}
-There exists an absolute constant $c_0>0$ such that for every $G=(\vset,\edgeSet)$, we have $\kmatchtime \ge \Omega\inparen{|E|^{c_0\cdot k}}$.
+There exists an absolute constant $c_0>0$ such that for every $G=(\vset,\edgeSet)$, we have $\kmatchtime \ge \Omega\inparen{|E|^{c_0\cdot k}}$ for large enough $k$.
 \end{hypo}
-We note that the above conjecture is somewhat non-standard. In particular, the best known state of the art algorithm to compute $\numocc{G}{\kmatch}$ takes time $\Omega\inparen{|V|^{k/2}}$ (i.e. if this is the best algorithm then $c_0=\frac 14$)~\cite{k-match}. What the above conjecture is saying is that one can only hope for a polynomial improvement over the state of the art algorithm to compute $\numocc{G}{\kmatch}$.
+We note that the above conjecture is somewhat non-standard. In particular, the best known algorithm to compute $\numocc{G}{\kmatch}$ takes time $\Omega\inparen{|V|^{k/2}}$ (i.e. if this is the best algorithm then $c_0=\frac 14$, since $|E|\le |V|^2$)~\cite{k-match}. The above conjecture says that one can only hope for a polynomial improvement over the state-of-the-art algorithm to compute $\numocc{G}{\kmatch}$.
 %
+
 Our hardness result in Section~\ref{sec:single-p} is based on the following conjectured hardness result:
 %
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -55,7 +58,7 @@ SELECT COUNT(*) FROM $R_1$ JOIN $R_2$ JOIN$\cdots$JOIN $R_k$
 \end{lstlisting}          
 \noindent Further, the PDB instance generalizes the one in \Cref{fig:two-step} as follows. Relation $OnTime$ has $n$ tuples, one for each vertex $i$ in $[n]$, each with probability $\prob_i$, and $Route$ has tuples corresponding to the edges $\edgeSet$ (each with probability $1$).\footnote{Technically, $\poly_{G}^\kElem(\vct{X})$ should have variables corresponding to tuples in $Route$ as well, but since they are always present with probability $1$, we drop those. Our argument also works when all the tuples in $Route$ are present with probability $\prob$, but to simplify notation we assign probability $1$ to edges.}
 In other words, for this instance $\dbbase$ contains the set of $n$ unary tuples in $OnTime$ (which corresponds to $\vset$) and $m$ binary tuples in $Route$ (which corresponds to $\edgeSet$).
-Note that this implies that $\poly_{G}^\kElem$ is indeed a lineage polynomial for a \abbrTIDB \abbrPDB.
+Note that this implies that $\poly_{G}^\kElem$ is indeed a \abbrTIDB-lineage polynomial. % for a \abbrTIDB \abbrPDB.

 Next, we note that the runtime for \abbrStepOne with $\query^k$ and $\dbbase$ as defined above is $O(m)$ (i.e.  \abbrStepOne is `easy' for this query):
 \begin{Lemma}\label{lem:tdet-om}
@@ -77,7 +80,7 @@ needs time $\bigOmega{\kmatchtime}$, assuming $\kmatchtime\ge \omega\inparen{\ab
 \end{Theorem}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %
-Note that the second row of \Cref{tab:lbs} follows from \Cref{prop:expection-of-polynom}, \Cref{thm:mult-p-hard-result}, \Cref{lem:tdet-om}, and \Cref{thm:k-match-hard} while the third row is proved by \Cref{prop:expection-of-polynom}, \Cref{thm:mult-p-hard-result}, \Cref{lem:tdet-om}, and \Cref{conj:known-algo-kmatch}. Since \Cref{conj:known-algo-kmatch} is non-standard, the latter hardness result should be interpreted as follows. Any substantial polynomial improvement for \Cref{prob:bag-pdb-poly-expected} (over the trivial algorithm that converts $\poly$ into SMB and then runs the obvious algorithm for \abbrStepTwo) would lead to an improvement over the state of the art {\em upper} bounds on  $\kmatchtime$. Finally, note that \Cref{thm:mult-p-hard-result} needs one to be able to compute the expected multiplicities over $(2k+1)$ distinct values of $p_i$, each of which corresponds to distinct $\pd$ (for the same $\dbbase$), which explain the `Multiple' entry in the second column in the second and third row in \Cref{tab:lbs}. Next, we argue how to get rid of this latter requirement.
+Note that the second row of \Cref{tab:lbs} follows from \Cref{prop:expection-of-polynom}, \Cref{thm:mult-p-hard-result}, \Cref{lem:tdet-om}, and \Cref{thm:k-match-hard} while the third row is proved by \Cref{prop:expection-of-polynom}, \Cref{thm:mult-p-hard-result}, \Cref{lem:tdet-om}, and \Cref{conj:known-algo-kmatch}. Since \Cref{conj:known-algo-kmatch} is non-standard, the latter hardness result should be interpreted as follows. Any substantial polynomial improvement for \Cref{prob:bag-pdb-poly-expected} (over the trivial algorithm that converts $\poly$ into SMB and then uses \Cref{cor:expct-sop} for \abbrStepTwo) would lead to an improvement over the state-of-the-art {\em upper} bounds on $\kmatchtime$. Finally, note that \Cref{thm:mult-p-hard-result} needs one to be able to compute the expected multiplicities over $(2k+1)$ distinct values of $p_i$, each of which corresponds to a distinct $\pd$ (for the same $\dbbase$), which explains the `Multiple' entry in the second column of the second and third rows in \Cref{tab:lbs}. Next, we argue how to get rid of this latter requirement.


 %%% Local Variables:
--- a/single_p.tex
+++ b/single_p.tex
@@ -11,10 +11,11 @@ Fix $\prob\in (0,1)$. Then assuming \Cref{conj:graph} is true, any algorithm tha
 \end{Theorem}

 Note that \Cref{prop:expection-of-polynom} and \Cref{th:single-p-hard} above imply the hardness result in the first row of \Cref{tab:lbs}.
-The above shows the hardness for a very specific lineage polynomial but it is easy to convert this into a parameterized complexity result as follows.  One can come up with an infinite family of hard query polynomials by `embedding' $\rpoly_{G}^3$ into an infinite family of trivial lineage polynomials.
+%The above shows the hardness for a very specific lineage polynomial but it is easy to convert this into a parameterized complexity result as follows.  One can come up with an infinite family of hard query polynomials by `embedding' $\rpoly_{G}^3$ into an infinite family of trivial lineage polynomials.
+We note that \Cref{thm:k-match-hard} and \Cref{conj:known-algo-kmatch} (and the lower bounds in the second and third rows of Table~\ref{tab:lbs}) need $k$ to be large enough (in particular, we need a family of hard queries). In contrast, \Cref{th:single-p-hard} above (and the lower bound in the first row of Table~\ref{tab:lbs}) holds for $k=3$ (and hence for a fixed query).


 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "main"
-%%% End:
+%%% End: