diff --git a/intro-rewrite2.tex b/intro-rewrite2.tex index 5cbd5c9..a373d04 100644 --- a/intro-rewrite2.tex +++ b/intro-rewrite2.tex @@ -48,6 +48,9 @@ \usetikzlibrary{shapes.geometric}%for cylinder \usetikzlibrary{shapes.arrows}%for arrow shape \usetikzlibrary{shapes.misc} +%rid of vertical spacing for booktabs rules +\renewcommand{\aboverulesep}{0pt} +\renewcommand{\belowrulesep}{0pt} \begin{figure}[h!] \centering @@ -56,9 +59,7 @@ %pdb cylinder \node[cylinder, text width=0.28\textwidth, align=center, draw=black, text=blue, cylinder uses custom fill, cylinder body fill=blue!10, aspect=0.12, minimum height=5cm, minimum width=2.5cm, cylinder end fill=blue!50, shape border rotate=90] (cylinder) at (0, 0) { \tabcolsep=0.1cm - \renewcommand{\aboverulesep}{0pt} - \renewcommand{\belowrulesep}{0pt} - \begin{tabular}{>{\small}c >{\small}c >{\small}c} + \begin{tabular}{>{\small}c | >{\small}c | >{\small}c} \multicolumn{3}{c}{$\boldsymbol{OnTime}$}\\ \toprule City$_\ell$ & $\Phi$ & \textbf{p}\\ @@ -70,7 +71,7 @@ \end{tabular}\\ \tabcolsep=0.05cm %\captionof{table}{Route} - \begin{tabular}{>{\footnotesize}c >{\footnotesize}c >{\footnotesize}c >{\footnotesize}c} + \begin{tabular}{>{\footnotesize}c | >{\footnotesize}c | >{\footnotesize}c | >{\footnotesize}c} \multicolumn{4}{c}{$\boldsymbol{Route$}}\\ \toprule $\text{City}_1$ & $\text{City}_2$ & $\Phi$ & \textbf{p} \\ @@ -91,7 +92,7 @@ \node[rectangle, right=0.175 of arrow1, draw=black, text=purple, fill=purple!15, minimum height=4.5cm, minimum width=2cm](rect) { \tabcolsep=0.075cm %\captionof{table}{Q} - \begin{tabular}{>{\normalsize}c >{\normalsize}c >{\centering\arraybackslash\small}m{1.95cm}} + \begin{tabular}{>{\normalsize}c | >{\normalsize}c | >{\centering\arraybackslash\small}m{1.95cm}} \multicolumn{3}{c}{$\boldsymbol{\query(\pdb)}$}\\ \toprule City & $\Phi$ & Circuit\\% & $\expct_{\idb \sim \probDist}[\query(\db)(t)]$ \\ \hline @@ -155,7 +156,7 @@ \node[rectangle, right=0.25 of arrow2, rounded corners, draw=black, 
fill=red!15, text=red, minimum height=4.5cm, minimum width=2cm](rrect) { \tabcolsep=0.09cm %\captionof{table}{Q} - \begin{tabular}{>{\small}c >{\centering\arraybackslash\small}m{1.95cm}} + \begin{tabular}{>{\small}c | >{\centering\arraybackslash\small}m{1.95cm}} \multicolumn{2}{c}{$\expct\pbox{\poly(\vct{X})}$}\\ \toprule City & $\mathbb{E}[\poly(\vct{X})]$\\ @@ -338,7 +339,7 @@ With $\Phi^2$ as an example, we have: \end{align*} It can be verified that the reduced polynomial parameterized with each variable's respective marginal probability is a closed form of the expected count (i.e., $\expct\pbox{\Phi^2} = \widetilde{\Phi^2}(\probOf\pbox{L_a=1},$ $\probOf\pbox{L_b=1}, \probOf\pbox{L_c=1}), \probOf\pbox{L_d=1})$). In fact, we show in \Cref{lem:exp-poly-rpoly} that this equivalence holds for {\em all} $\raPlus$ queries over TIDB/BIDB. -To prove our hardness result we show that for the same $Q$ considered in the running example, the query $Q^k$ is able to encode various hard graph-counting problems. We do so by analyzing how the coefficients in the (univariate) polynomial $\widetilde{\Phi}\left(p,\dots,p\right)$ relate to counts of various sub-graphs on $k$ edges in an arbitrary graph $G$ (which is used to define the relations in $Q$). \AH{What is meant by the following sentence?}For the upper bound it is easy to check that if all the probabilties are constant then ${\Phi}\left(\probOf\pbox{X_1=1},\dots, \probOf\pbox{X_n=1}\right)$ (i.e. evaluating the original lineage polynomial over the probability values) is a constant factor approximation. \AH{Why do we say `approximation'? This is a linear {\emph exact} computation.} To get an $(1\pm \epsilon)$-multiplicative approximation we sample monomials from $\Phi$ and `adjust' their contribution to $\widetilde{\Phi}\left(\cdot\right)$. +To prove our hardness result we show that for the same $Q$ considered in the running example, the query $Q^k$ is able to encode various hard graph-counting problems. 
We do so by analyzing how the coefficients in the (univariate) polynomial $\widetilde{\Phi}\left(p,\dots,p\right)$ relate to counts of various sub-graphs on $k$ edges in an arbitrary graph $G$ (which is used to define the relations in $Q$). For an upper bound on approximating the expected count, it is easy to check that if all the probabilities are constant then ${\Phi}\left(\probOf\pbox{X_1=1},\dots, \probOf\pbox{X_n=1}\right)$ (i.e., evaluating the original lineage polynomial over the probability values) is a constant factor approximation. To get a $(1\pm \epsilon)$-multiplicative approximation we sample monomials from $\Phi$ and `adjust' their contribution to $\widetilde{\Phi}\left(\cdot\right)$. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \mypar{Paper Organization} We present relevant background and notation in \Cref{sec:background}. We then prove our main hardness results in \Cref{sec:hard} and present our approximation algorithm in \Cref{sec:algo}. We present some (easy) generalizations of our results in \Cref{sec:gen} and also discuss extensions from computing expectations of polynomials to the expected result multiplicity problem (\Cref{def:the-expected-multipl})\AH{Aren't they the same?}. Finally, we discuss related work in \Cref{sec:related-work} and conclude in \Cref{sec:concl-future-work}.