
%root:main.tex

\section{Multiple Distinct $\prob$ Values}

We would like to argue that, for a compressed version of $\poly(\vct{w})$, the expectation $\expct_{\vct{w}}\pbox{\poly(\vct{w})}$ cannot in general be computed in linear time.
\AR{Added the hardness result below.}
Our hardness result is based on the following known result:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Theorem}[\cite{k-match}]
\label{thm:k-match-hard}
Given a positive integer $k$ and  an undirected graph $G$ with no self-loops or parallel edges, counting the number of $k$-matchings in $G$ is \sharpwonehard.
\end{Theorem}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The above result means that, under standard parameterized complexity assumptions, we cannot hope to count the number of $k$-matchings in $G=(V,E)$ in time $f(k)\cdot |V|^{O(1)}$ for any computable function $f$. In fact, all known algorithms to solve this problem take time $|V|^{\Omega(k)}$.

To prove our hardness result, consider a graph $G(V, E)$, where $|E| = \numedge$, $|V| = \numvar$, and the vertices are indexed by $[\numvar]$, so that $i, j \in [\numvar]$ for every edge $(i, j) \in E$.

Consider the query $\poly_{G}(\vct{X}) = q_E(X_1,\ldots, X_\numvar) = \sum\limits_{(i, j) \in E} X_i \cdot X_j$.
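
As a small illustration (the graph here is our own choice, made purely for exposition), for the path on four vertices with $E = \{(1, 2), (2, 3), (3, 4)\}$ the query is
\begin{equation*}
\poly_{G}(\vct{X}) = X_1X_2 + X_2X_3 + X_3X_4.
\end{equation*}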

\AR{need discussion on the `tightness' of various params. First, this is for degree 6 poly-- while things are easy for say deg 2. Second this is for any fixed p.  Finally, we only need project-join queries to get the hardness results. Also need to compare this with the generality of the approx upper bound results.}


For the following discussion, set $\poly_{G}^\kElem(\vct{X}) = \left(q_E(X_1,\ldots, X_\numvar)\right)^\kElem$.
\begin{Lemma}\label{lem:qEk-multi-p}
Let $\prob_0,\ldots, \prob_{2\kElem}$ be distinct values in $(0, 1]$.  Then given the values $\rpoly_{G}^\kElem(\prob_i,\ldots, \prob_i)$ for $0\leq i\leq 2\kElem$, the number of $\kElem$-matchings in $G$ can be computed in $\mathrm{poly}(\kElem)$ time.
\end{Lemma}

\begin{proof}[Proof of~\cref{lem:qEk-multi-p}]
%It is trivial to see that one can readily expand the exponential expression by performing the $n^\kElem$ product operations, yielding the polynomial in the sum of products form of the lemma statement.  By definition $\rpoly_{G}^\kElem$ reduces all variable exponents greater than $1$ to $1$.  Thus, a monomial such as $X_i^\kElem X_j^\kElem$ is $X_iX_j$ in $\rpoly_{G}^\kElem$, and the value after substitution is $p_i\cdot p_j = p^2$.  Further, that the number of terms in the sum is no greater than $2\kElem + 1$, can be easily justified by the fact that each edge has two endpoints, and the most endpoints occur when we have $\kElem$ distinct edges (such a subgraph is also known as a $\kElem$-matching), with non-intersecting points, a case equivalent to $p^{2\kElem}$.
We will show that $\rpoly_{G}^\kElem(\prob,\ldots, \prob) = \sum\limits_{i = 0}^{2\kElem} c_i \cdot \prob^i$.  First, since $\poly_G^\kElem(\vct{X})$ is a product of $\kElem$ factors, each of which is a sum of monomials of degree $2$, it follows that $\poly_G^\kElem(\vct{X})$ has degree $2\kElem$.  We can further write $\poly_{G}^{\kElem}(\vct{X})$ in its expanded sum-of-products (SOP) form,
\begin{equation*}
\sum_{\substack{(i_1, j_1),\\\cdots,\\(i_\kElem, j_\kElem) \in E}}X_{i_1}X_{j_1}\cdots X_{i_\kElem}X_{j_\kElem}
\end{equation*}
Since each of $(i_1, j_1),\ldots, (i_\kElem, j_\kElem)$ is from $E$, every monomial in this expansion is a product of $\kElem$ pairs $X_iX_j$ and therefore has degree $2\kElem$, with the number of distinct variables in an arbitrary monomial being $\leq 2\kElem$.  By definition, $\rpoly_{G}^{\kElem}(\vct{X})$ sets every exponent $e > 1$ to $e = 1$, thereby shrinking the degree of each monomial in the SOP form of $\poly_{G}^{\kElem}(\vct{X})$ to the exact number of distinct variables the monomial contains.  This implies that $\rpoly_{G}^\kElem$ is a polynomial of degree at most $2\kElem$ and hence $\rpoly_{G}^\kElem(\prob,\ldots, \prob)$ is a polynomial in $\prob$ of degree at most $2\kElem$.  Then it is the case that
\begin{equation*}
\rpoly_{G}^{\kElem}(\prob,\ldots, \prob) = \sum_{i = 0}^{2\kElem} c_i \prob^i
\end{equation*}
where $c_i$ denotes the number of monomials in the expansion of $\poly_{G}^{\kElem}(\vct{X})$ composed of exactly $i$ distinct variables, since each such monomial evaluates to $\prob^i$ once $\prob$ is substituted for each distinct variable.\footnote{Since $\rpoly_G^\kElem(\vct{X})$ does not have any monomial with degree $< 2$, it is the case that $c_0 = c_1 = 0$.}

Given the values $\rpoly_{G}^\kElem(\prob_i,\ldots, \prob_i)$ for $0\leq i\leq2\kElem$, we obtain $2\kElem + 1$ distinct rows of the form $(\prob_i^0, \ldots, \prob_i^{2\kElem})$, which form a matrix $M$.  We then have a linear system of the form $M \cdot \vct{c} = \vct{b}$, where $\vct{c}$ is the coefficient vector ($c_0,\ldots, c_{2\kElem}$), and $\vct{b}$ is the vector such that $\vct{b}[i] = \rpoly_{G}^\kElem(\prob_i,\ldots, \prob_i)$.  By construction, $M$ is a Vandermonde matrix; since the $\prob_i$ are distinct, $M$ has full rank, and we can solve the linear system in $O(\kElem^3)$ time to determine $\vct{c}$ exactly.
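
Written out explicitly (as a sketch of the system just described, using only the quantities defined above), the linear system reads
\begin{equation*}
\begin{pmatrix}
1 & \prob_0 & \prob_0^2 & \cdots & \prob_0^{2\kElem}\\
1 & \prob_1 & \prob_1^2 & \cdots & \prob_1^{2\kElem}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
1 & \prob_{2\kElem} & \prob_{2\kElem}^2 & \cdots & \prob_{2\kElem}^{2\kElem}
\end{pmatrix}
\begin{pmatrix}
c_0\\ c_1\\ \vdots\\ c_{2\kElem}
\end{pmatrix}
=
\begin{pmatrix}
\rpoly_{G}^\kElem(\prob_0,\ldots, \prob_0)\\
\rpoly_{G}^\kElem(\prob_1,\ldots, \prob_1)\\
\vdots\\
\rpoly_{G}^\kElem(\prob_{2\kElem},\ldots, \prob_{2\kElem})
\end{pmatrix},
\end{equation*}
and the standard Vandermonde determinant formula gives $\det(M) = \prod_{0 \leq i < j \leq 2\kElem} (\prob_j - \prob_i)$, which is nonzero exactly when the evaluation points are pairwise distinct.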

Denote the number of $\kElem$-matchings in $G$ as $\numocc{G}{\kmatch}$.  Note that $c_{2\kElem}$ is $\kElem! \cdot \numocc{G}{\kmatch}$.  This can be seen intuitively by looking at the original factorized representation $\poly_{G}^\kElem(\vct{X})$, where the $\kElem$ edges of an arbitrary $\kElem$-matching can be selected across the $\kElem$ factors in $\prod_{i = 1}^\kElem i = \kElem!$ different orders.  Note that each $\kElem$-matching $(i_1, j_1),\ldots, (i_\kElem, j_\kElem)$ in $G$ corresponds to the unique monomial $\prod_{\ell = 1}^\kElem X_{i_\ell}X_{j_\ell}$ in $\poly_{G}^\kElem(\vct{X})$, where each index is distinct.  Since each index is distinct, each variable has exponent $e = 1$ and this monomial survives in $\rpoly_{G}^{\kElem}(\vct{X})$.  Since $\rpoly$ contains only exponents $e \leq 1$, the only degree $2\kElem$ terms that can exist in $\rpoly_{G}^\kElem$ come from $\kElem$-matchings, since every other monomial in $\poly_{G}^\kElem(\vct{X})$ has strictly fewer than $2\kElem$ distinct variables, which, as stated earlier, implies that every non-$\kElem$-matching monomial in $\rpoly_{G}^\kElem(\vct{X})$ has degree $< 2\kElem$.
%It has already been established above that a $\kElem$-matching ($\kmatch$) has coefficient $c_{2\kElem}$.  As noted, a $\kElem$-matching occurs when there are $\kElem$ edges, $e_1, e_2,\ldots, e_\kElem$, such that all of them are disjoint, i.e., $e_1 \neq e_2 \neq \cdots \neq e_\kElem$.  In all $\kElem$ factors of $\poly_{G}^\kElem(\vct{X})$ there are $k$ choices from the first factor to select an edge for a given $\kElem$ matching, $\kElem - 1$ choices in the second factor, and so on throughout all the factors, yielding $\kElem!$ duplicate terms for each $\kElem$ matching in the expansion of $\poly_{G}^\kElem(\vct{X})$.

Then, since we have $\kElem!$ duplicates of each distinct $\kElem$-matching, and since $c_{2\kElem}$ counts all monomials with degree $2\kElem$, it follows that $c_{2\kElem} = \kElem!\cdot\numocc{G}{\kmatch}$.  This allows us to solve for $\numocc{G}{\kmatch}$ by simply dividing $c_{2\kElem}$ by $\kElem!$.
\end{proof}

\qed
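
To illustrate \cref{lem:qEk-multi-p}, consider the following small sanity check; the graph and the choice $\kElem = 2$ are ours and serve only as an example.  Let $G$ be the path on four vertices with $E = \{(1, 2), (2, 3), (3, 4)\}$.  Expanding the square gives
\begin{equation*}
\poly_{G}^{2}(\vct{X}) = X_1^2X_2^2 + X_2^2X_3^2 + X_3^2X_4^2 + 2X_1X_2^2X_3 + 2X_2X_3^2X_4 + 2X_1X_2X_3X_4,
\end{equation*}
and reducing every exponent $e > 1$ to $1$ and then substituting $\prob$ for every variable yields
\begin{equation*}
\rpoly_{G}^{2}(\prob,\ldots, \prob) = 3\prob^2 + 4\prob^3 + 2\prob^4.
\end{equation*}
Hence $c_2 = 3$, $c_3 = 4$, and $c_4 = 2$, so that $c_4 / 2! = 1$.  This agrees with the fact that the only $2$-matching in this path is $\{(1, 2), (3, 4)\}$.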

\begin{Corollary}\label{cor:mult-p-hard-result}
Computing $\rpoly(\vct{X})$ given multiple distinct $\prob$ values is \sharpwonehard.
\end{Corollary}
\begin{proof}[Proof of Corollary~\ref{cor:mult-p-hard-result}]
The proof follows from~\cref{thm:k-match-hard} and~\cref{lem:qEk-multi-p}.
\end{proof}

\qed


%\begin{Corollary}\label{cor:lem-qEk}
%One can compute $\numocc{G}{\kmatch}$ in $\query_{G}^\kElem(\vct{X})$ exactly.
%\end{Corollary}
%
%\begin{proof}[Proof for Corollary ~\ref{cor:lem-qEk}]
%By ~\cref{lem:qEk-multi-p}, the term $c_{2\kElem}$ can be exactly computed.  Additionally we know that $c_{2\kElem}$ can be broken into two factors, and by dividing $c_{2\kElem}$ by the factor $\kElem!$, it follows that the resulting value is indeed $\numocc{G}{\kmatch}$.
%\end{proof}
%
%\qed
%\begin{Corollary}\label{cor:tilde-q-hard}
%Computing $\rpoly(\vct{X})$ is $\#W[1]$-hard.
%\end{Corollary}
%
%\begin{proof}[Proof of Corollary ~\ref{cor:tilde-q-hard}]
%The proof follows by ~\cref{thm:k-match-hard}, ~\cref{lem:qEk-multi-p} and ~\cref{cor:lem-qEk}.
%\end{proof}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: