More changes from 062320 suggestions

Aaron Huber 2020-06-24 11:58:14 -04:00
parent bad5590f2f
commit 6e8a7e8027


@ -53,42 +53,35 @@ In steps \cref{p1-s1} and \cref{p1-s2}, by linearity of expectation, the expecat
If $\poly$ is given to us in sum of monomials form, the expectation of $\poly$ ($\ex{\poly}$) can be computed in $O(|\poly|)$ time, where $|\poly|$ denotes the total number of multiplication/addition operators.
\end{Corollary}
The corollary follows immediately by \cref{lem:exp-poly-rpoly}.\AR{It does not follow from the statement of \cref{lem:exp-poly-rpoly} but rather its proof. So this statement needs its proof as well.}
\begin{proof}
Note that \cref{lem:exp-poly-rpoly} shows that $\ex{\poly} = \rpoly(\prob_1,\ldots, \prob_\numTup)$. Therefore, if $\poly$ is already in sum of products form, one only needs to compute $\poly(\prob_1,\ldots, \prob_\numTup)$ ignoring exponent terms (note that such a polynomial is $\rpoly(\prob_1,\ldots, \prob_\numTup)$), which indeed requires $O(|\poly|)$ computations.\qed
\end{proof}
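As a small illustration (a hypothetical polynomial chosen here, not one taken from the text), let $\poly(X_1, X_2, X_3) = X_1^2X_2 + X_1X_3$. Dropping the exponents gives $\rpoly(X_1, X_2, X_3) = X_1X_2 + X_1X_3$, so by \cref{lem:exp-poly-rpoly}
\[\ex{\poly} = \rpoly(\prob_1, \prob_2, \prob_3) = \prob_1\prob_2 + \prob_1\prob_3,\]
which is obtained in a single pass over the two monomials of $\poly$.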
\subsection{When $\poly$ is not in sum of monomials form}
\AR{I made my pass on a printout when this section had a partial proof of Claim 1, so this section will not have many comments beyond that. However, my comments have implications for the rest of the section, so I will make another pass once the comments below have been propagated to the rest of the section.}
We would like to argue that, in the general case, the expectation cannot be computed in linear time.
To this end, consider the following graph $G(V, E)$, where $|E| = m$, $|V| = \numTup$, and $i, j \in [\numTup]$. Consider the query $q_E(\wElem_1,\ldots, \wElem_\numTup) = \sum\limits_{(i, j) \in E} \wElem_i \cdot \wElem_j$.\AR{Again the query polynomial should have $X_i$ as variables.}
To this end, consider the following graph $G(V, E)$, where $|E| = m$, $|V| = \numTup$, and $i, j \in [\numTup]$. Consider the query $q_E(X_1,\ldots, X_\numTup) = \sum\limits_{(i, j) \in E} X_i \cdot X_j$.
\AR{The two lemmas need to be re-written once notation for representing a query is finalized in Section 1.}
\begin{Lemma}\label{lem:gen-p}
If we can compute $\poly(\wElem_1,\ldots, \wElem_\numTup) = q_E(\wElem_1,\ldots, \wElem_\numTup)^3$ in T(m) time for fixed $\prob$,\AR{The statement so far technically does not make sense since the definition of $\poly(\wElem_1,\ldots, \wElem_\numTup)$ does not have $p$ anywhere in it. See my comment above.} then we can count the number of triangles in $G$ in T(m) + O(m) time.
\AR{ANY math notation e.g. T(m) should always be in math mode, like so $T(m)$.}
\AR{Also your job is {\bf not} to just TeX up what is in the hand-written notes-- you need to verify that the statement is correct and modify the statement as necessary. E.g. I think the final claim above should be for 3-matchings and not triangles.}
\end{Lemma}
\begin{Lemma}\label{lem:const-p}
If we can compute $\poly(\wElem_1,\ldots, \wElem_\numTup) = q_E(\wElem_1,\ldots, \wElem_\numTup)^3$ in $T(m)$ time for $\wElem_1 = \ldots = \wElem_\numTup = \prob$, then we can count the number of 3-matchings in $G$ in $T(m) + O(m)$ time.
\end{Lemma}
\begin{Lemma}\label{lem:gen-p}
If we can compute $\poly(\wElem_1,\ldots, \wElem_\numTup) = q_E(\wElem_1,\ldots, \wElem_\numTup)^3$ in $T(m)$ time for $O(1)$ distinct values of $\prob$, then we can count the number of triangles (and the number of 3-paths, the number of 3-matchings) in $G$ in $O(T(m) + m)$ time.
\end{Lemma}
\AR{This warmup should not be in the actual paper since it is not quite relevant but it is fine to keep for now.}
\AH{The warm-up below is fine for now, but will need to be removed for the final draft}
First, let us do a warm-up by computing $\rpoly(\wElem_1,\dots, \wElem_\numTup)$ when $\poly = q_E(\wElem_1,\ldots, \wElem_\numTup)$. Before doing so, we introduce some notation. Let $\numocc{H}$ denote the number of occurrences of $H$ in $G$. So, e.g., $\numocc{\ed}$ is the number of edges ($m$) in $G$.
\AR{Sorry I should have made this more explicit in the hand-written notes. The notation of $\twopath$ and $\twodis$ are {\bf not} standard notation and we should not use them like in the handwritten notes. There are two options: we could have explicit notation (like $H_{\text{triang}}$) or if you want the figure notation then the edge actually needs to look like an edge-- i.e. the nodes should show up as well-- i.e. the figures for the sub-graphs should look {\bf exactly} like in the hand-written notes. I have seen this done in other papers but I personally do not know how to do this in latex-- you'll need to figure this out on your own if you use this option. I personally am fine with either option (check if Oliver has a preference though).}
\AR{Also we should discuss if $\numocc{H}$ is the best notation. E.g. one could use $\#\textsc{triang}(G)$ to denote the number of triangles in $G$ and so on. This might help with the above comment as well.}
\AH{We need to make a decision on subgraph notation, and number of occurrences notation. Waiting to hear back from Oliver before making a decision.}
\begin{Claim}
\begin{enumerate}
\item $\rpoly_2(\prob,\ldots, \prob) = \numocc{\ed} \cdot \prob^2 + 2\cdot \numocc{\twopath}\cdot \prob^3 + 2\cdot \numocc{\twodis}\cdot \prob^4$
\item We can compute $\rpoly_2$ in O(m) time.
\end{enumerate}
We can compute $\rpoly_2$ in $O(m)$ time.
\end{Claim}
\AR{Note on latex use-- the begin\{claim\} and end\{claim\} should only be around the statement of the claim and not include the proof inside it as well.}
\AR{Also the claim statement should only include the 2nd part. The first part is only useful in proving the 2nd part so no need to explicitly state it in the claim statement itself.}
\begin{proof}
The proof basically follows by definition.
The proof basically follows by definition. When we expand $\poly$, set every exponent $e \geq 1$ to $1$, and substitute $\prob$ for every $\wElem_i$, we get $\rpoly_2(\prob,\ldots, \prob) = \numocc{\ed} \cdot \prob^2 + 2\cdot \numocc{\twopath}\cdot \prob^3 + 2\cdot \numocc{\twodis}\cdot \prob^4$.
\begin{enumerate}
\item First note that
\begin{align*}
@ -96,34 +89,39 @@ First, let us do a warm-up by computing $\rpoly(\wElem_1,\dots, \wElem_\numTup)$
&= \sum_{(i, j) \in E} (\wElem_i\wElem_j)^2 + \sum_{\substack{(i, j), (j, \ell) \in E\\s.t. i \neq \ell}}\wElem_i
\wElem_j^2\wElem_\ell + \sum_{\substack{(i, j), (k, \ell) \in E\\s.t. i \neq j \neq k \neq \ell}} \wElem_i\wElem_j\wElem_k\wElem_\ell\\
\end{align*}
By definition,
By definition of $\rpoly$,
\begin{equation}
\rpoly_2(\wVec) = \sum_{(i, j) \in E} \wElem_i\wElem_j + \sum_{\substack{(i, j), (j, \ell) \in E\\s.t. i \neq \ell}}\wElem_i\wElem_j\wElem_\ell + \sum_{\substack{(i, j), (k, \ell) \in E\\s.t. i \neq j \neq k \neq \ell}} \wElem_i\wElem_j\wElem_k\wElem_\ell\label{eq:part-1}
\end{equation}
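Notice that, after substituting $\prob$ for every $\wElem_i$ and counting each unordered pair of distinct edges in both orders, the first term is $\numocc{\ed}\cdot \prob^2$, the second $2\cdot\numocc{\twopath}\cdot \prob^3$, and the third $2\cdot\numocc{\twodis}\cdot \prob^4.$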
\item Note that
\AH{We need the correct formula for two-matchings below.}
\begin{align*}
&\numocc{\ed} = m,\\
&\numocc{\twopath} = \sum_{u \in V} \binom{d_u}{2} \text{where $d_u$ is the degree of vertex $u$}\\ &\numocc{\twodis} = \text{a correct formula}
&\numocc{\twopath} = \sum_{u \in V} \binom{d_u}{2} \text{where $d_u$ is the degree of vertex $u$}\\ &\numocc{\twodis} = \textbf{\textit{a correct formula}}
\end{align*}
\end{enumerate}
Thus, since each of these counts can be computed in $O(m)$ time, \cref{eq:part-1} implies that $\rpoly_2(\prob,\ldots, \prob)$ can be computed in $O(m)$ time.\qed
\end{proof}
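As a sketch of the count still marked as missing above (an observation added here rather than taken from the notes, and assuming $G$ is a simple graph): any two distinct edges of $G$ either share exactly one endpoint or are disjoint, so
\[\numocc{\twodis} = \binom{m}{2} - \sum_{u \in V} \binom{d_u}{2},\]
which can be evaluated in $O(m)$ time once the degrees $d_u$ are known.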
We are now ready to state the claim we need to prove \cref{lem:gen-p} and \cref{lem:const-p}.
\AH{END of the 'warm-up'}
We are now ready to state the claim we need to prove \cref{lem:const-p} and \cref{lem:gen-p}.
Let $\poly(\wVec) = q_E(\wVec)^3$.
\begin{Claim}
\begin{enumerate}
\item\label{claim:four-one} $\rpoly(\prob,\ldots, \prob) = \numocc{\ed}\prob^2 + 6\numocc{\twopath}\prob^3 + 6\numocc{\twodis}\prob^4 + 6\numocc{\tri}\prob^3 + 6\numocc{\oneint}\prob^4 + 6\numocc{\threepath}\prob^4 + 6\numocc{\twopathdis}\prob^5 + 6\numocc{\threedis}\prob^6.$
\item\label{claim:four-two} The following can be computed in O(m) time. $\numocc{\ed}$, $\numocc{\twopath}$, $\numocc{\twodis}$, $\numocc{\oneint}$, $\numocc{\twopathdis}$, $\numocc{\threedis}$.
\end{enumerate}
$\implies$ If one can compute $\rpoly_3(\prob,\ldots, \prob)$ in time T(m), then we can compute the following in O(T(m) + m):
\begin{Claim}\label{claim:four-two}
If one can compute $\rpoly_3(\prob,\ldots, \prob)$ in time $T(m)$, then we can compute the following in $O(T(m) + m)$ time:
\[\numocc{\tri} + \numocc{\threepath} \cdot \prob - \numocc{\threedis}\cdot(\prob^2 - \prob^3).\]
\end{Claim}
\AR{The claim statement should only include the implication in the 2nd part. The first part is only useful in proving the 2nd part so no need to explicitly state it in the claim statement itself. As a general note: the handwritten notes were written in haste-- you should not assume that notation/statements of claims are the final word. Think if they make sense and/or how you can improve them.}
\begin{proof}
\begin{proof}
When we expand $\poly$ out and assign all exponents $e \geq 1$ a value of $1$, we have the following,
\begin{align}
&\rpoly(\prob,\ldots, \prob) = \numocc{\ed}\prob^2 + 6\numocc{\twopath}\prob^3 + 6\numocc{\twodis}\prob^4 + 6\numocc{\tri}\prob^3 +\nonumber\\
&\qquad\qquad6\numocc{\oneint}\prob^4 + 6\numocc{\threepath}\prob^4 + 6\numocc{\twopathdis}\prob^5 + 6\numocc{\threedis}\prob^6.\label{claim:four-one}
\end{align}
The following subgraph counts have either been shown above, or are shown below, to be computable in $O(m)$ time:
\[\numocc{\ed}, \numocc{\twopath}, \numocc{\twodis}, \numocc{\oneint}, \numocc{\twopathdis} + \numocc{\threedis}.\]
By definition we have that
\[\poly_3(\wElem_1,\ldots, \wElem_\numTup) = \sum_{\substack{(i_1, j_1),\\ (i_2, j_2),\\ (i_3, j_3) \in E}} \prod_{\ell = 1}^{3}\wElem_{i_\ell}\wElem_{j_\ell}.\]
Rather than list all the expressions in full detail, let us make some observations regarding the sum. Let $e_1 = (i_1, j_1), e_2 = (i_2, j_2), e_3 = (i_3, j_3)$. Notice that each expression in the sum consists of a triple $(e_1, e_2, e_3)$. There are three forms the triple $(e_1, e_2, e_3)$ can take.
@ -140,19 +138,18 @@ $\numocc{\twopathdis} + \numocc{\threedis} = $ the number of occurrences of thre
\[\numocc{\twopathdis} + \numocc{\threedis} = \sum_{(u, v) \in E} \binom{m - d_u - d_v - 1}{2}\] The implication in \cref{claim:four-two} follows by the above and \cref{claim:four-one}.\qed
\end{proof}
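The algebra behind this last step can be sketched as follows (a reconstruction added here, assuming $\prob \neq 0$; the names $K(\prob)$ and $S$ are introduced only for this sketch). Let $K(\prob)$ be the sum of the terms in the expansion of $\rpoly_3(\prob,\ldots, \prob)$ that involve $\numocc{\ed}$, $\numocc{\twopath}$, $\numocc{\twodis}$, and $\numocc{\oneint}$, and let $S = \numocc{\twopathdis} + \numocc{\threedis}$. Then
\begin{align*}
\rpoly_3(\prob,\ldots, \prob) - K(\prob) &= 6\prob^3\left(\numocc{\tri} + \numocc{\threepath}\cdot\prob + \numocc{\twopathdis}\cdot\prob^2 + \numocc{\threedis}\cdot\prob^3\right),\\
\frac{\rpoly_3(\prob,\ldots, \prob) - K(\prob)}{6\prob^3} - S\cdot\prob^2 &= \numocc{\tri} + \numocc{\threepath}\cdot\prob - \numocc{\threedis}\cdot(\prob^2 - \prob^3),
\end{align*}
where the second line uses $\numocc{\twopathdis} = S - \numocc{\threedis}$.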
\begin{proof}
\underline{Lemma 2}
\AR{You should {\bf NEVER EVER} use hard-coded lemma numbers etc. Latex keeps track of numbering for you-- so {\bf ALWAYS} use the automatic numbering. {\bf Using hard coded numbering is very bad practice.}}
\AR{Also you can modify the text of \textsc{Proof} by using the following latex command \texttt{\\begin\{proof\}[Proof of Lemma 2]} and Latex will typeset this as \textsc{Proof of Lemma 2}, which is what you really want.}
\begin{proof}[Proof of \cref{lem:gen-p}]
\cref{claim:four-two} of Claim 4 implies that if we know $\rpoly_3(\prob,\ldots, \prob)$, then we can know in O(m) additional time
%\AR{Also you can modify the text of \textsc{Proof} by using the following latex command \texttt{\\begin\{proof\}[Proof of Lemma 2]} and Latex will typeset this as \textsc{Proof of Lemma 2}, which is what you really want.}
\cref{claim:four-two} says that if we know $\rpoly_3(\prob,\ldots, \prob)$, then we can compute, in $O(m)$ additional time,
\[\numocc{\tri} + \numocc{\threepath} \cdot \prob - \numocc{\threedis}\cdot(\prob^2 - \prob^3).\] Treat $\numocc{\tri}$, $\numocc{\threepath}$, and $\numocc{\threedis}$ as unknowns. Each value of $\prob$ then yields one linear equation in these unknowns, so three distinct values of $\prob$ suffice whenever the resulting three equations are linearly independent. In the worst case, without independence, 4 distinct values of $\prob$ suffice.\AR{Follows from the fact that the corresponding coefficient matrix is the so called Vandermonde matrix, which has full rank.}\qed
\end{proof}
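To spell out the linear-algebra step (a sketch added for illustration, where $a$, $b$, $c$ are hypothetical names for three distinct values of $\prob$): each value gives one equation in the unknowns $\numocc{\tri}$, $\numocc{\threepath}$, $\numocc{\threedis}$, with coefficient row $(1, \prob, \prob^3 - \prob^2)$. The determinant of the resulting $3 \times 3$ system is
\[
\det\begin{pmatrix} 1 & a & a^3 - a^2\\ 1 & b & b^3 - b^2\\ 1 & c & c^3 - c^2 \end{pmatrix} = (b - a)(c - a)(c - b)(a + b + c - 1),
\]
so three distinct values determine the unknowns unless $a + b + c = 1$. With four distinct values, the $4 \times 3$ coefficient matrix is the $4 \times 4$ Vandermonde matrix of those values multiplied by a fixed $4 \times 3$ matrix of rank $3$, and hence always has rank $3$.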
\AH{Below is only a transcription of the notes. The claims need to be verified and further worked out.}
\begin{proof}
\underline{Lemma 1}
\begin{proof}[Proof of \cref{lem:const-p}]
The argument for lemma 2 cannot be applied to lemma 1 since we have that $\prob$ is fixed. We have hope in the following: we assume that we can solve this problem for all graphs, and the hope would be to solve the problem for say $G_1, G_2, G_3$, where $G_1$ is arbitrary, and relate the values of $\numocc{H}$, where $H$ is a placeholder for the relevant edge combination. The hope is that these relations would result in three independent linear equations, and then we would be done.
The argument for \cref{lem:gen-p} cannot be applied to \cref{lem:const-p} since we have that $\prob$ is fixed. We have hope in the following: we assume that we can solve this problem for all graphs, and the hope would be to solve the problem for say $G_1, G_2, G_3$, where $G_1$ is arbitrary, and relate the values of $\numocc{H}$, where $H$ is a placeholder for the relevant edge combination. The hope is that these relations would result in three independent linear equations, and then we would be done.
The following is an option.
\begin{enumerate}