Changes: \underbrace, citation for results on counting cliques, etc.

2022-02-23 11:59:51 -05:00 · fdea34f305
parent 98571dc16a
commit fdea34f305
4 changed files with 33 additions and 4 deletions
--- a/intro-rewrite-070921.tex
+++ b/intro-rewrite-070921.tex
@@ -132,7 +132,10 @@ Our question is whether or not it is always true that $\timeOf{}^*\inparen{\quer
 Specifically, depending on what hardness result/conjecture we assume, we get various emphatic versions of {\em no} as an answer to our question.  To make some sense of the other lower bounds in Table~\ref{tab:lbs}, we note that it is not too hard to show that $\timeOf{}^*(Q,\pdb) \le  \bigO{\inparen{\qruntime{\optquery{\query}, \tupset, \bound}}^k}$, where $k$ is the join width (our notion of join width follows from~\Cref{def:degree-of-poly} and~\Cref{fig:nxDBSemantics}) of the query $\query$ over all result tuples $\tup$ (and the parameter that defines our family of hard queries).

 What our lower bound in the third row says is that one cannot get more than a polynomial improvement over essentially the trivial algorithm for~\Cref{prob:expect-mult}.
- However, this result assumes a hardness conjecture that is not as well studied as those in the first two rows of the table (see \Cref{sec:hard} for more discussion on the hardness assumptions). Further, we note that existing results already imply the claimed lower bounds if we were to replace the $\qruntime{\optquery{\query}, \tupset, \bound}$ by just $\numvar$ (indeed these results follow from known lower bounds for deterministic query processing). Our contribution is to then identify a family of hard queries where deterministic query processing is `easy' but computing the expected multiplicities is hard. 
+ However, this result assumes a hardness conjecture that is not as well studied as those in the first two rows of the table (see \Cref{sec:hard} for more discussion on the hardness assumptions). Further, we note that existing results\footnote{
+ Consider the known results for the problem of counting $k$-cliques~\cite{10.5555/645413.652181,CHEN20061346}. For a query $\query$ over database $\tupset$ that counts the number of $k$-cliques, these results imply a runtime of $\omega_k\inparen{\numvar}$, and so our lower bounds would hold.
+ }
+  already imply the claimed lower bounds if we were to replace the $\qruntime{\optquery{\query}, \tupset, \bound}$ by just $\numvar$ (indeed these results follow from known lower bounds for deterministic query processing). Our contribution is to then identify a family of hard queries where deterministic query processing is `easy' but computing the expected multiplicities is hard. 

 \mypar{Our upper bound results} We introduce a $(1\pm \epsilon)$-approximation algorithm that solves~\Cref{prob:expect-mult} in time $O_\epsilon\inparen{\qruntime{\optquery{\query}, \tupset, \bound}}$.  This means that, when approximation is acceptable, we can solve~\Cref{prob:expect-mult} in time linear in the size of the deterministic query %$\timeOf{Approx}^*\inparen{\query, \pdb}\leq\qruntim{\optquery{\query},\tupset,\bound}$ (where $\timeOf{Approx}^*\inparen{\cdot}$ denotes runtime of approximation algorithm), 
 and bag \abbrPDB\xplural are deployable in practice.
--- a/main.bib
+++ b/main.bib
@@ -1,3 +1,29 @@
+@article{CHEN20061346,
+title = {Strong computational lower bounds via parameterized complexity},
+journal = {Journal of Computer and System Sciences},
+volume = {72},
+number = {8},
+pages = {1346-1367},
+year = {2006},
+issn = {0022-0000},
+doi = {https://doi.org/10.1016/j.jcss.2006.04.007},
+url = {https://www.sciencedirect.com/science/article/pii/S0022000006000675},
+author = {Jianer Chen and Xiuzhen Huang and Iyad A. Kanj and Ge Xia},
+keywords = {Parameterized computation, Computational complexity, Lower bound, Clique, Polynomial time approximation scheme},
+abstract = {We develop new techniques for deriving strong computational lower bounds for a class of well-known NP-hard problems. This class includes weighted satisfiability, dominating set, hitting set, set cover, clique, and independent set. For example, although a trivial enumeration can easily test in time O(n^k) if a given graph of n vertices has a clique of size k, we prove that unless an unlikely collapse occurs in parameterized complexity theory, the problem is not solvable in time f(k)n^{o(k)} for any function f, even if we restrict the parameter values to be bounded by an arbitrarily small function of n. Under the same assumption, we prove that even if we restrict the parameter values k to be of the order Θ(μ(n)) for any reasonable function μ, no algorithm of running time n^{o(k)} can test if a graph of n vertices has a clique of size k. Similar strong lower bounds on the computational complexity are also derived for other NP-hard problems in the above class. Our techniques can be further extended to derive computational lower bounds on polynomial time approximation schemes for NP-hard optimization problems. For example, we prove that the NP-hard distinguishing substring selection problem, for which a polynomial time approximation scheme has been recently developed, has no polynomial time approximation schemes of running time f(1/ϵ)n^{o(1/ϵ)} for any function f unless an unlikely collapse occurs in parameterized complexity theory.}
+}
+@inproceedings{10.5555/645413.652181,
+author = {Flum, J\"{o}rg and Grohe, Martin},
+title = {The Parameterized Complexity of Counting Problems},
+year = {2002},
+isbn = {0769518222},
+publisher = {IEEE Computer Society},
+address = {USA},
+abstract = {We develop a parameterized complexity theory for counting problems. As the basis of this theory, we introduce a hierarchy of parameterized counting complexity classes #W[t], for $t \geqslant 1$, that corresponds to Downey and Fellows's W-hierarchy [12] and show that a few central W-completeness results for decision problems translate to #W-completeness results for the corresponding counting problems. Counting complexity gets interesting with problems whose decision version is tractable, but whose counting version is hard. Our main result states that counting cycles and paths of length $k$ in both directed and undirected graphs, parameterized by $k$, is #W[1]-complete. This makes it highly unlikely that any of these problems is fixed-parameter tractable, even though their decision versions are fixed-parameter tractable. More explicitly, our result shows that most likely there is no $f(k) \cdot n^c$-algorithm for counting cycles or paths of length $k$ in a graph of size $n$ for any computable function $f: \mathbb{N} \to \mathbb{N}$ and constant $c$, even though there is a $2^{O(k)} \cdot n^{2.376}$ algorithm for finding a cycle or path of length $k$ [2].},
+booktitle = {Proceedings of the 43rd Symposium on Foundations of Computer Science},
+pages = {538},
+series = {FOCS '02}
+}
 @misc{pdbench,
 howpublished = {\url{http://pdbench.sourceforge.net/}},
 note = {Accessed: 2020-12-15},
--- a/mult_distinct_p.tex
+++ b/mult_distinct_p.tex
@@ -51,9 +51,9 @@ For any graph $G=(V,\edgeSet)$ and $\kElem\ge 1$, define
 SELECT 1 FROM T $t_1$, R r, T $t_2$
 WHERE $t_1$.city = r.city1 AND $t_2$.city = r.city2
 \end{lstlisting}
-as $R_i$ for each $i \in [k]$.  The query $\query^k$ then becomes
+as $R$.  The query $\query^k$ then becomes
 \begin{lstlisting}
-SELECT COUNT(*) FROM $R_1$ JOIN $R_2$ JOIN$\cdots$JOIN $R_k$
+SELECT COUNT(*) FROM $\underbrace{R\text{ JOIN }R\text{ JOIN}\cdots\text{JOIN }R}_{k\text{ times}}$
 \end{lstlisting}          
 \noindent Consider again the \abbrCTIDB instance $\pdb$ of~\Cref{fig:two-step} and, for our hard instance, let $\bound = 1$.  $\pdb$ generalizes to an instance compatible with~\Cref{def:qk} as follows. Relation $T$ has $n$ tuples, one for each vertex $i \in [n]$, each with probability $\prob$, and $R$ has one tuple for each edge in $\edgeSet$ (each with probability $1$).\footnote{Technically, $\poly_{G}^\kElem(\vct{X})$ should have variables corresponding to tuples in $R$ as well, but since these tuples are present with probability $1$, we drop those variables. Our argument also works when all the tuples in $R$ are present with probability $\prob$, but to simplify notation we assign probability $1$ to edges.}
 In other words, this instance $\tupset$ contains the set of $\numvar$ unary tuples in $T$ (which corresponds to $\vset$) and $\numedge$ binary tuples in $R$ (which corresponds to $\edgeSet$).
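The hard instance described in this hunk can be checked on tiny graphs with a brute-force sketch. It rests on several assumptions that the diff does not show: that the lineage polynomial of $\query^k$ is $\inparen{\sum_{(i,j)\in\edgeSet} X_i X_j}^k$, that the unconditioned JOIN acts as a cross product, that each edge is stored once in $R$, and that vertices are numbered from 0. The function name `expected_multiplicity` is hypothetical. The sketch enumerates all $2^n$ possible worlds, so it only illustrates the quantity being computed and is not the algorithm from the paper.

```python
from itertools import product


def expected_multiplicity(n, edges, k, p):
    """Brute-force E[(sum over edges (i, j) of X_i * X_j) ** k].

    Assumes each vertex tuple in T is present independently with
    probability p (bound = 1) and each edge tuple in R is present with
    probability 1. The polynomial form is an assumption of this sketch.
    """
    total = 0.0
    # Enumerate every possible world: world[i] == 1 iff vertex i is present.
    for world in product((0, 1), repeat=n):
        present = sum(world)
        # Probability of this world under independent vertex tuples.
        weight = (p ** present) * ((1 - p) ** (n - present))
        # Rows of the inner query: edges whose endpoints are both present.
        inner = sum(world[i] * world[j] for i, j in edges)
        # A k-fold cross product of the inner result has inner ** k rows.
        total += weight * inner ** k
    return total
```

For a triangle, `expected_multiplicity(3, [(0, 1), (1, 2), (0, 2)], 1, 0.5)` evaluates to 0.75, since each of the three edges has both endpoints present with probability 0.25. The exponential enumeration is deliberate: it keeps the sketch obviously correct for small inputs.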
--- a/prob-def.tex
+++ b/prob-def.tex
@@ -28,7 +28,7 @@ The circuits in \Cref{fig:two-step} encode their respective polynomials in colum
 Note that the circuit \circuit representing $AX$ and the circuit \circuit' representing $B\inparen{Y+Z}$ each encode a tree, with edges pointing towards the root.


-	\begin{wrapfigure}{l}{0.45\linewidth}
+	\begin{wrapfigure}{L}{0.45\linewidth}
 		\centering
 		\begin{tikzpicture}[thick]
 			\node[tree_node] (a1) at (0, 0) {$\boldsymbol{X}$};