Addressing a few comments.

master
Oliver Kennedy 2020-12-19 23:20:31 -05:00
parent 8936fe879d
commit b7454db8c7
Signed by: okennedy
GPG Key ID: 3E5F9B3ABD3FDB60
2 changed files with 12 additions and 7 deletions


@ -165,7 +165,8 @@ Given an expression tree $\etree$, define
\[\gamma(\etree)=\frac{\sum_{(\monom, \coef)\in \expandtree{\etree}} \abs{\coef}\cdot \indicator{\monom\mod{\mathcal{B}}\equiv 0}}{\abs{\etree}(1,\ldots, 1)}\]
\end{Definition}
%\AH{This....combined with \Cref{def:mod-set-polys} is \emph{really} nice notation!}
\AR{Need to make sure use of indicator variable $\onesymbol$ above is consistent with the rest of the paper.}
%\AR{Need to make sure use of indicator variable $\onesymbol$ above is consistent with the rest of the paper.}
%\OK{Done}
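As an illustrative sketch of this definition (assuming, as the notation above suggests, that $\expandtree{\etree}$ lists the (monomial, coefficient) pairs of the expansion of $\etree$ without combining like terms, that $\abs{\etree}(1,\ldots, 1)$ is the sum of the absolute values of those coefficients, and that $\monom\mod{\mathcal{B}}\equiv 0$ holds exactly when $\monom$ contains two distinct variables from the same block), consider a hypothetical $\etree$ encoding $(X_1+X_2)(X_1+X_2)$ with $X_1$ and $X_2$ in the same block. The expansion has four monomials, each with coefficient $1$, of which $X_1X_2$ and $X_2X_1$ are cancelled:
\[\gamma(\etree)=\frac{1+1}{1+1+1+1}=\frac{1}{2}.\]
If instead $X_1$ and $X_2$ belonged to different blocks, no monomial would be cancelled and $\gamma(\etree)=0$.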
We next present a couple of corollaries of~\Cref{lem:approx-alg}.
\begin{Corollary}
@ -177,10 +178,14 @@ In particular, if $p_0>0$ and $\gamma<1$ are absolute constants then the above r
The proof of~\Cref{cor:approx-algo-const-p} is given in~\Cref{sec:proofs-approx-alg}.
The restriction on $\gamma$ is satisfied by \ti (where $\gamma=0$) as well as by all queries of the PDBench \bi benchmark (see \Cref{app:subsec:experiment}).
\AH{I am thinking that perhaps the terminology and presentation of~\Cref{sec:experiments} may need word-smithing to clearly illustrate the $\bi$ benchmarks satisfied--although the substance is already written there.}
\AR{Yes! E.g. $\gamma$ is not used at all in~\Cref{sec:experiments}}
\AR{{\bf Boris/Oliver:} Is there a way to claim that all probabilities in practice are actually constants: i.e. they do not increase with the number of tuples?}
\OK{@Atri: This seems like a reasonable claim. It's too late for me to come up with a reasonable motivation (maybe something will come to me in the morning), but the intuition for me is that each tuple/block is independent... it would be hard for that to be the case if the probability were a function of the number of tuples.}
Note that (i) tuple presence is independent across blocks, so the corresponding probabilities (and hence $p_0$) are independent of the number of blocks, and (ii) \bis model uncertain attributes, so block size (and hence $\gamma$) is a function of the ``messiness'' of a dataset, rather than its size.
Thus, we expect the corollary to hold in general.
% \AH{I am thinking that perhaps the terminology and presentation of~\Cref{app:subsec:experiment} may need word-smithing to clearly illustrate the $\bi$ benchmarks satisfied--although the substance is already written there.}
% \AR{Yes! E.g. $\gamma$ is not used at all in~\Cref{app:subsec:experiment}}
% \AR{{\bf Boris/Oliver:} Is there a way to claim that all probabilities in practice are actually constants: i.e. they do not increase with the number of tuples?}
% \OK{@Atri: This seems like a reasonable claim. It's too late for me to come up with a reasonable motivation (maybe something will come to me in the morning), but the intuition for me is that each tuple/block is independent... it would be hard for that to be the case if the probability were a function of the number of tuples.}
\subsection{Approximating $\rpoly$}
The algorithm that proves~\Cref{lem:approx-alg} rests on the following observation. Given a query polynomial $\poly(\vct{X})=poly(\etree)$ for expression tree $\etree$ over $\bi$, we can exactly represent $\rpoly(\vct{X})$ as follows:


@ -53,12 +53,12 @@ Let $\semNX$ denote the set of polynomials over variables $\vct{X}$ with natural
Consider now the semiring $(\semNX, +, \cdot, 0, 1)$ whose domain is $\semNX$ and with the standard addition and multiplication of polynomials.
We will utilize $\semNX$-PDB $\pxdb$, defined as the tuple $(\db, \pd)$, where $\semNX$-database $\db$ is paired with probability distribution $\pd$.
We denote by $\polyForTuple$ the annotation of tuple $t$ in the result of $\query$ (i.e., $\polyForTuple = \query(\pxdb)(t)$) and as before, interpret it as a function $\polyForTuple: \{0,1\}^{|\vct X|} \rightarrow \semN$ from vectors of variable assignments to the corresponding value of the annotating polynomial.
$\semNX$-PDBs, a function $\rmod$, which takes an $\semNX$-PDB input and outputs an equivalent $\semN$-PDB are formally defined in \Cref{subsec:supp-mat-background}.
$\semNX$-PDBs and a function $\rmod$ that takes an $\semNX$-PDB input and outputs an equivalent $\semN$-PDB are formally defined in \Cref{subsec:supp-mat-background}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{Proposition}[Expectation of polynomials]\label{prop:expection-of-polynom}
Given an $\semN$-PDB $\pdb = (\idb,\pd)$ and $\semNX$-PDB $\pxdb = (\db,\pd')$ where $\rmod(\pxdb) = \pdb$:
Given an $\semN$-PDB $\pdb = (\idb,\pd)$ and $\semNX$-PDB $\pxdb = (\db',\pd')$ where $\rmod(\pxdb) = \pdb$:
\[ \expct_{\idb \sim \pd}[\query(\db)(t)] = \expct_{\vct{w} \sim \pd'}\pbox{\polyForTuple(\vct{w})} \]
\end{Proposition}
\noindent A formal proof of \Cref{prop:expection-of-polynom} is given in \Cref{subsec:expectation-of-polynom-proof}.
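As a small sanity check of \Cref{prop:expection-of-polynom} (a sketch under assumptions not stated above: a hypothetical annotation $\polyForTuple(X_1,X_2)=X_1X_2+X_1$ whose variables are independent and take value $1$ with probabilities $p_1$ and $p_2$ under $\pd'$), linearity of expectation and independence give
\[ \expct_{\vct{w} \sim \pd'}\pbox{\polyForTuple(\vct{w})} = \expct_{\vct{w} \sim \pd'}\pbox{w_1 w_2} + \expct_{\vct{w} \sim \pd'}\pbox{w_1} = p_1 p_2 + p_1. \]
This agrees with computing the expected multiplicity of $t$ directly over possible worlds: $t$ has multiplicity $2$ with probability $p_1 p_2$, multiplicity $1$ with probability $p_1(1-p_2)$, and multiplicity $0$ otherwise, so the expectation is $2p_1p_2 + p_1(1-p_2) = p_1p_2 + p_1$.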