From 4b5edc29203bc15990f67a6ec7ce57e175c664c7 Mon Sep 17 00:00:00 2001
From: Atri Rudra
Date: Thu, 16 Apr 2020 22:57:03 -0400
Subject: [PATCH] Made pass till argument for (93)

---
 sop.tex | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/sop.tex b/sop.tex
index be65096..ad9bf6c 100644
--- a/sop.tex
+++ b/sop.tex
@@ -22,7 +22,7 @@ Let us show first that the expectation of the estimate does in fact yield the va
 = &\ex{\sum_{\substack{\wElem_1,\ldots, \wElem_{\prodsize}\\ \in \wSet_j}} \prod_{i = 1}^{\prodsize}\vect_i(\wElem_i)\prod_{i = 1}^{\prodsize}\sine(\wElem_i)}\\
 = &\sum_{\substack{\wElem_1,\ldots, \wElem_{\prodsize}\\ \in \wSet_j}} \prod_{i = 1}^{\prodsize}\vect_i(\wElem_i)\ex{\prod_{i = 1}^{\prodsize}\sine(\wElem_i)}
 \end{align*}
-Fix the variables $\wElem_1,\ldots, \wElem_{\prodsize}$. Define $\dist$ to be the number of distinct worlds in $\wElem_1,\ldots, \wElem_{\prodsize}$ and $e_l$ to be the number of repitions for the $l_{th}$ \AR{General typesetting comments. (1) You shoud laway use $\ell$ instead of $l$. (2) Typeset $l_{th}$ as $\ell^{\text{th}}$-- note that ``th" is in superscript and not in math mode.} distinct world value. For $\term_1^{\est_j} = \ex{\prod_{i = 1}^{\prodsize} \sine(\wElem_i)}$, \AR{Why are you defining the new notation $\term_1^{\est_j}$. You should always be wary of introducing new notation since it makes things hard to read.} we get
+Fix the variables $\wElem_1,\ldots, \wElem_{\prodsize}$. Define $\dist$ to be the number of distinct worlds in $\wElem_1,\ldots, \wElem_{\prodsize}$ and $e_l$ to be the number of repetitions for the $l_{th}$ \AR{General typesetting comments. (1) You should always use $\ell$ instead of $l$. (2) Typeset $l_{th}$ as $\ell^{\text{th}}$-- note that ``th" is in superscript and not in math mode.} distinct world value. For $\term_1^{\est_j} = \ex{\prod_{i = 1}^{\prodsize} \sine(\wElem_i)}$, \AR{Why are you defining the new notation $\term_1^{\est_j}$. You should always be wary of introducing new notation since it makes things hard to read.} we get
 \begin{align*}
 \term_1^{\est_j} = &\ex{\prod_{i = 1}^{\prodsize}\sine(\wElem_i)}\\
 = &\ex{\prod_{l = 1}^{\dist} \sine(\wElem_l)^{e_l}}\\
@@ -65,6 +65,8 @@ Recall that we started this section out by seeking to prove \cref{eq:var-to-prov
 One can see that \cref{eq:sigsq-jneqj} is composed of two addends. We now bound each of them separately.
 \subsection{Bounding $\sum_{j \neq j'}\cvar{j, j'}$}
+\AR{You need to re-write the stuff below. First in the 2nd equality suddenly the sum on $j\ne j'$ has vanished. Also I think you should first analyze $\lambda(j,j')$ for both $j=j'$ and $j\ne j'$ for as long as you can. Only when it is needed should you divide into the two cases-- do not do the division up front.}
+
 \begin{align*}
 \sum_{j \neq j'}\cvar{j, j'} &= \sum_{j \neq j'} \ex{\est_j \cdot \conj{\est_{j'}}} - \ex{\est_j}\cdot\ex{\conj{\est_{j'}}}\\
 &=\ex{\prod_{i = 1}^{\prodsize}\sum_{\wElem \in W}v_i(\wElem)s(\wElem)\ind{h(\wElem) = j}\cdot \prod_{i = 1}^{\prodsize}\sum_{\wElem' \in W}v_i(\wElem')\conj{s(\wElem')}\ind{h(\wElem') = j'}} - \ex{\prod_{i = 1}^{\prodsize}\sum_{\wElem \in W}v_i(\wElem)s(\wElem)\ind{h(\wElem) = j}}\cdot \ex{\prod_{i = 1}^{\prodsize}\sum_{\wElem' \in W}v_i(\wElem')\conj{s(\wElem')}\ind{h(\wElem') = j'}}\\
@@ -74,6 +76,7 @@ One can see that \cref{eq:sigsq-jneqj} is composed of two addends. We now bound
 &= \sum_{\substack{\wElem_1,\cdots,\wElem_\prodsize,\\\wElem'_1,\cdots,\wElem'_\prodsize\\\in W}}\prod_{i = 1}^{\prodsize}v_i(\wElem_i)v_i(\wElem'_i)\left(\ex{\prod_{i = 1}^{\prodsize}s(\wElem_i)\conj{s(\wElem'_i)}\ind{h(\wElem_i) = j}\ind{h(\wElem'_i) = j'}} - \ex{\prod_{i = 1}^{\prodsize}s(\wElem_i)\ind{h(\wElem_i) = j}}\cdot\ex{\prod_{i = 1}^{\prodsize}\conj{s(\wElem'_i)}\ind{h(\wElem_i') = j'}} \right).
 \end{align*}
 \AH{Perhaps a formal proof is necessary below.}
+\AR{Most definitely.}
 For $\term_1^{\cvar{j, j'}} = \ex{\prod_{i = 1}^{\prodsize}s(\wElem_i)s(\wElem'_i)\ind{h(\wElem_i) = j}\ind{h(\wElem'_i) = j'}}$, because hash function $h$ cannot bucket the same world to two different buckets, the only instance $\term_1^{\cvar{j, j'}} = 1$ occurs when there is no overlap between the $\wElem_i$ and $\wElem'_i$ variables. Given the condition of no overlap, $\term_1^{\cvar{j, j'}} = 1$ only with the further condition that $\forall i \in [\prodsize], \wElem_i = \wElem, \wElem'_i = \wElem', \wElem \neq \wElem'$. Notice, however, given the conditions, the product of the remaining expectations will cancel this out. Looking at the remaining two expectations $\term_2^{\cvar{j, j'}} = \ex{\prod_{i = 1}^{\prodsize}\sine(\wElem_i) \ind{\hfunc(\wElem_i) = j}} \cdot \ex{\prod_{i = 1}^{\prodsize}\conj{\sine(\wElem'_i)} \ind{\hfunc(\wElem'_i) = j'}}$, that $\term_2^{\cvar{j, j'}} = 1$ only when $\forall i \in [\prodsize], \wElem_i = \wElem, \wElem'_i = \wElem'$. Taken together, the constraints leave us with only one possible case for $\term_1^{\cvar{j, j'}} - \term_2^{\cvar{j, j'}} \neq 0$, when all variables are the same world. Thus,
 \begin{align}
 &\sum_{j \neq j'}\cvar{j, j'} = - \frac{1}{B^2}\sum_{\wElem \in W}\prod_{i = 1}^{\prodsize}v_i^2(\wElem)\label{eq:cvar-bound}.
@@ -97,6 +100,7 @@ We now move on to bound the variance of a $\prodsize$-way join.
 \end{align}
 Before proceeding, we introduce some notation and terminology that will aid in communicating the bounds we are about to establish.
 We refer to the leftmost expectation of \cref{eq:sig-j-last} in the following way:
+\AR{dangling eq ref}
 \[\term_1\left(\wElem_1,\ldots,\wElem_\prodsize, \wElem_1',\ldots, \wElem_\prodsize'\right) = \ex{\prod_{i = 1}^\prodsize s(w_i)\overline{s(w'_i)}\ind{h(w_i) = j}\ind{h(w'_i) = j}}.%\text{, and}
 \]
 %\[\term_2\left(\wElem_1,\ldots,\wElem_\prodsize, \wElem_1',\ldots, \wElem_\prodsize'\right) = \ex{\prod_{i = 1}^ks(w_i)\ind{h(w_i) = j}}\cdot \ex{\prod_{i = 1}^\prodsize\overline{s(w'_i)}\ind{h(w'_i) = j}}. \]
@@ -118,6 +122,7 @@ We next describe the nonzero terms of \cref{eq:sig-j-last}.
 Define and then fix a total ordering of the $\dist$ distinct world elements to follow the total order of the natural numbers in $[\dist]$, such that $\forall i, j \in [\dist], i < j \implies \dw_i < \dw_j, i.e. \wElem_1 \prec\ldots\prec\wElem_\prodsize$.
 %Given a fixed order $\wSet_{\order}: \left(\wSet, \wSet\right)\mapsto \mathbb{B}$ of possible worlds, define the lexographical order of distinct worlds $\wSet_\dist$ to be the ordering which complies to the identity mapping of elements in $[\prodsize]$ to elements in $[\dist]$ up to $\dist$, such that . In other worlds, $\forall \wElem, \wElem' \in \wSet_\dist, \dw < \wElem' \leftrightarrow \wSet_{\order}\left(\wElem, \wElem'\right) = T$.
 \end{Definition}
+\AR{NO. The ordering $\prec$ has nothing to do with $m$. It is just ordering all the worlds in $W$.}
 To help describe all possible world value matchings we introduce functions $f$ and $f'$.
 \begin{Definition}
 Functions f, f' are the set of surjective mappings from $\prodsize$ to $\dist$ elements: $f: [\prodsize] \rightarrow [\dist], f': [\prodsize] \rightarrow [\dist'].$
@@ -137,7 +142,9 @@ We rewrite equation \eqref{eq:sig-j-last} in terms of $\dist$ distinct worlds, w
 \sum_{\dist = 2}^{\prodsize}\sum_{\dist' = 2}^{\prodsize}\sum_{f, f'}\sum_{\substack{\dw_1, \ldots,\dw_\dist,\\ \dw'_{1},\ldots,\dw'_{\dist'}\\ \in W}}\prod_{i = 1}^{\prodsize}\vect_i(\dw_{_{f(i)}})\vect_i(\dw_{'_{f'(i)}})\cdot \term_1\left(\dw_{f(1)},\ldots,\dw_{f(\prodsize)}, \dw'_{f'(1)},\ldots, \dw'_{f'(\prodsize)}\right) \label{eq:sig-j-distinct}
 \end{equation}
-Observe that the cartesian product of world values assigned to $\wElem_1,\ldots,\wElem_\prodsize$ throughout the summation can be rearranged into groups of variables with distinct values, for each distinct element $\dist$ in the set $[\prodsize]$. For each $\dist \in [\prodsize]$, all possible combinations of $\dist$ world values can be equivalently modeled by taking the set of surjective functions $f:[\prodsize]\mapsto [\dist]$ and computing all world value combinations based on the total ordering of $\dw_{f(1)}\prec\cdots\prec\dw_{f(m)}$. For any $\dist$, all surjective mappings $f$ constitute all unique mappings with their symmetrical counterparts. Combining that with the total order over $\dw_{f(1)},\ldots,\dw_{f(\dist)}$ yields exactly the world value combinations containing $\dist$ distinct values which appear in the cartesian product of the sum, without double counting. What this all boils down to is a rearrangement of addends in the sum.
+\AR{Three comments on the above: (1) Why do the sums on $m$ and $m'$ start with $2$ and not $1$? (2) Also $\tilde{w}_1,\dots,\tilde{w}_m\in W$ should be replaced by $\tilde{w}_1\prec \cdots\prec \tilde{w}_m \in W$-- similarly for $\tilde{w'}_i$s as well. (3) Use $\widetilde{w_i}$ instead of $\tilde{w}_i$-- I had used the latter in my notes due to laziness.}
+Observe that the cartesian product of world values assigned to $\wElem_1,\ldots,\wElem_\prodsize$ throughout the summation can be rearranged into groups of world variables with distinct world values, for each distinct element $\dist$ in the set $[\prodsize]$. For each $\dist \in [\prodsize]$, all possible combinations of $\dist$ world values can be equivalently modeled by taking the set of surjective functions $f:[\prodsize]\mapsto [\dist]$ and computing all world value combinations based on the total ordering of $\dw_{f(1)}\prec\cdots\prec\dw_{f(m)}$.\AR{Again total ordering is on worlds in $W$-- $\dw_{f(1)}\prec\cdots\prec\dw_{f(m)}$ does not make sense since some of these world values could be the same.} For any $\dist$, all surjective mappings $f$ constitute all unique mappings with their symmetrical counterparts \AR{I do not see what the ``symmetrical counterparts" comment adds here. Just remove it}. Combining that with the total order over $\dw_{f(1)},\ldots,\dw_{f(\dist)}$ yields exactly the world value combinations containing $\dist$ distinct values which appear in the cartesian product of the sum, without double counting \AR{Again not sure the ``double counting" comment adds anything here}.
+\AR{Overall comments: (1) The main thing missing if explicitly stating that $(w_1,\dots,w_k)\mapsto (\dw_{f(1)},\ldots,\dw_{f(\dist)})$. (2) After stating the map you should argue in words why all distinct tuples with $m$ distinct world values are covered.} What this all boils down to is a rearrangement of addends in the sum.
 \begin{Definition}
 Functions $f:[\prodsize]\mapsto [\dist], f':[\prodsize]\mapsto [\dist']$ are said to be matching, denoted $\match{f}{f'}$, if and only if