Done with pass on (new) Sec 3.1

master
Atri Rudra 2020-12-13 13:41:42 -05:00
parent b4db64933c
commit a0ac4a4bfe
2 changed files with 16 additions and 5 deletions


@ -1,5 +1,6 @@
%root: main.tex
\section{$1 \pm \epsilon$ Approximation Algorithm}
\label{sec:algo}
Since computing the expected multiplicity of a compressed representation of a bag polynomial is hard, it is desirable to have an algorithm that approximates the multiplicity in linear time, which we describe next.
First, we introduce some useful definitions and notation. For illustrative purposes in the definitions below, consider the case $\poly(\vct{X}) = 2x^2 + 3xy - 2y^2$.


@ -31,15 +31,25 @@ There exists a constant $\eps_0>0$ such that given an undirected graph $G=(V,E)$
The so-called {\em Triangle detection hypothesis} (cf.~\cite{triang-hard}), which states that detecting whether $G$ has a triangle takes time $\Omega\inparen{|E|^{4/3}}$, implies that in Conjecture~\ref{conj:graph} we can take $\eps_0\ge \frac 13$.
\AR{Need to add something about 3-paths and 3-matchings as well.}
To prove our hardness result, consider a graph $G(V, E)$, where $|E| = \numedge$, $|V| = \numvar$, and $i, j \in [\numvar]$.
Both of our hardness results use a query polynomial that is based on a simple encoding of the edges of a graph.
To prove our hardness result, consider a graph $G(V, E)$, where $|E| = \numedge$, $|V| = \numvar$. Our query polynomial will have a variable $X_i$ for every $i \in [\numvar]$.
Now consider the query
\[\poly_{G}(\vct{X}) = \sum\limits_{(i, j) \in E} X_i \cdot X_j.\]
The hard query polynomial for our problem will be a suitable power $k\ge 3$ of the polynomial above, i.e.
\begin{Definition}
Let $G=([n],E)$ be a graph. Then for any $\kElem\ge 1$, define
\[\poly_{G}^\kElem(X_1,\dots,X_n) = \left(\sum\limits_{(i, j) \in E} X_i \cdot X_j\right)^\kElem.\]
\end{Definition}
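As a small illustration of the definition (not part of the hardness argument), take $G$ to be the triangle on $[3]$, i.e. $E=\{(1,2),(2,3),(1,3)\}$ with each edge listed once in the sum, and $\kElem=2$. Expanding term by term gives
\[\poly_{G}^2(X_1,X_2,X_3) = X_1^2X_2^2 + X_2^2X_3^2 + X_1^2X_3^2 + 2X_1X_2^2X_3 + 2X_1^2X_2X_3 + 2X_1X_2X_3^2.\]
The three squared monomials come from choosing the same edge twice, and the three cross monomials come from choosing two distinct edges, which in a triangle always share a vertex.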
Consider the query $\poly_{G}(\vct{X}) = q_E(X_1,\ldots, X_\numvar) = \sum\limits_{(i, j) \in E} X_i \cdot X_j$.
Our hardness results only need a TIDB instance; further, we consider the special case where all tuple probabilities have the same value.
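To illustrate the uniform-probability setting, here is a sketch under the assumption (made only for this example) that each $X_i$ is an independent $0/1$ random variable that equals $1$ with probability $p$, writing $\mathbb{E}$ for expectation. For the path graph with edges $(1,2)$ and $(2,3)$, each listed once, we have $\poly_G(\vct{X}) = X_1X_2 + X_2X_3$, so
\[\mathbb{E}\left[\poly_G(\vct{X})\right] = 2p^2.\]
For its square, since $X_i^2 = X_i$ for $0/1$ variables,
\[\mathbb{E}\left[\poly_G^2(\vct{X})\right] = \mathbb{E}\left[X_1X_2 + 2X_1X_2X_3 + X_2X_3\right] = 2p^2 + 2p^3.\]
In this small case the coefficient of $p^2$ is the number of edges, and the coefficient of $p^3$ is twice the number of pairs of distinct edges sharing a vertex.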
\AR{need discussion on the `tightness' of various params. First, this is for degree 6 poly-- while things are easy for say deg 2. Second this is for any fixed p. Finally, we only need project-join queries to get the hardness results. Also need to compare this with the generality of the approx upper bound results.}
Following up on the discussion around Example~\ref{ex:intro}, it is easy to see that $\poly_{G}^\kElem(\vct{X})$ is the query polynomial corresponding to the following query:
\[\poly:- R(A_1),E(A_1,B_1),R(B_1),\dots,R(A_\kElem),E(A_\kElem,B_\kElem),R(B_\kElem)\]
where, generalizing the PDB instance in Example~\ref{ex:intro}, relation $R$ has $n$ tuples, one for each vertex in $V=[n]$, each with probability $p$, and $E(A,B)$ has tuples corresponding to the edges in $E$ (each with probability $1$).\footnote{Technically, $\poly_{G}^\kElem(\vct{X})$ should have variables corresponding to tuples in $E$ as well, but since they are always present with probability $1$, we drop those. Our arguments also work when all the tuples in $E$ are present with probability $p$, but to keep the notation simpler, we make this simplification.}
Note that this implies that our hard query polynomial can be created from a join-project query; by contrast, our approximation algorithm in Section~\ref{sec:algo} can handle lineage polynomials generated by unions of select-project-join queries. % (i.e. we do not need union or select operator to derive our hardness result).
For the following discussion, set $\poly_{G}^\kElem(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^\kElem$.
%\AR{need discussion on the `tightness' of various params. First, this is for degree 6 poly-- while things are easy for say deg 2. Second this is for any fixed p. Finally, we only need project-join queries to get the hardness results. Also need to compare this with the generality of the approx upper bound results.}
\subsection{Multiple Distinct $\prob$ Values}
\label{sec:multiple-p}