approx algo
parent
67b3f750be
commit
91dac1d104
@ -10,10 +10,11 @@ The following approximation algorithm applies to \abbrBIDB lineage polynomials (o
\subsection{Preliminaries and some more notation}

We now introduce definitions and notation related to circuits and polynomials that we will need to state our upper bound results. First, we introduce the expansion $\expansion{\circuit}$ of a circuit $\circuit$, which % encodes the reduced polynomial for $\polyf\inparen{\circuit}$ and is the basis
is used in our algorithm for sampling monomials (part of our approximation algorithm).

\begin{Definition}[$\expansion{\circuit}$]\label{def:expand-circuit}
For a circuit $\circuit$, we define $\expansion{\circuit}$ as a list of tuples $(\monom, \coef)$, where $\monom$ is a set of variables and $\coef \in \domN$.
$\expansion{\circuit}$ has the following recursive definition ($\circ$ is list concatenation).
$\expansion{\circuit} =
\begin{cases}
@ -54,8 +55,8 @@ Next, we use the following notation for the complexity of multiplying integers:
In a RAM model with a word size of $W$ bits, $\multc{M}{W}$ denotes the complexity of multiplying two integers represented with $M$ bits. (We will assume that for an input of size $N$, $W=O(\log{N})$.)
\end{Definition}

Finally, to get linear runtime results, we will need to define another parameter modeling the (weighted) number of monomials in %$\poly\inparen{\vct{X}}$
$\expansion{\circuit}$
that need to be `canceled' when monomials with dependent variables are removed (\Cref{def:reduced-bi-poly}). %def:hen it is modded with $\mathcal{B}$ (\Cref{def:mod-set-polys}).
Let $\isInd{\cdot}$ be a boolean function returning true if monomial $\encMon$ is composed of independent variables and false otherwise; further, let $\indicator{\theta}$ also be a boolean function returning true if $\theta$ evaluates to true.
\begin{Definition}[Parameter $\gamma$]\label{def:param-gamma}
@ -69,9 +70,9 @@ Given a \abbrBIDB circuit $\circuit$ define
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\mypar{Algorithm Idea}
%We prove \Cref{lem:approx-alg} by developing an
Our approximation algorithm (\approxq, with pseudocode in \Cref{sec:proof-lem-approx-alg})
%with the desired runtime. This algorithm
is based on the following observation.
% The algorithm (\approxq detailed in \Cref{alg:mon-sam}) to prove \Cref{lem:approx-alg} follows from the following observation.
Given a lineage polynomial $\poly(\vct{X})=\polyf(\circuit)$ for circuit \circuit over $\bi$, we have: % can exactly represent $\rpoly(\vct{X})$ as follows:
@ -87,7 +88,7 @@ Given a lineage polynomial $\poly(\vct{X})=\polyf(\circuit)$ for circuit \circui
Given the above, the algorithm is a sampling-based algorithm for the above sum: we sample (via \sampmon) $(\monom,\coef)\in \expansion{\circuit}$ with probability proportional
to $\abs{\coef}$ and compute $\vari{Y}=\indicator{\isInd{\encMon}}
\cdot \prod_{X_i\in \monom} p_i$. %Taking $\ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$ samples
Repeating the sampling an appropriate number of times
and computing the average of $\vari{Y}$ gives us our final estimate. \onepass is used to compute the sampling probabilities needed in \sampmon (details are in \Cref{sec:proofs-approx-alg}).
%%%%%%%%%%%%%%%%%%%%%%%
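
As a small illustration of a single sampling step (the variables, block structure, and coefficients here are hypothetical and chosen only for this example), suppose $X_1$ and $X_2$ belong to the same block, $X_3$ belongs to a different block, and
\[
\expansion{\circuit} = \left[\left(\{X_1, X_3\}, 2\right),\ \left(\{X_1, X_2\}, 1\right),\ \left(\{X_3\}, 1\right)\right].
\]
The absolute coefficients sum to $4$, so the three tuples are sampled with probabilities $\frac{1}{2}$, $\frac{1}{4}$, and $\frac{1}{4}$, respectively. Assuming that variables from the same block are dependent (so that the indicator is false for $\{X_1, X_2\}$), the corresponding values of $\vari{Y}$ are $p_1 p_3$, $0$, and $p_3$, and a single sample therefore has expectation
\[
\frac{1}{2}\cdot p_1 p_3 + \frac{1}{4}\cdot 0 + \frac{1}{4}\cdot p_3 .
\]
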
@ -95,7 +96,7 @@ and computing the average of $\vari{Y}$ gives us our final estimate. \onepass is
%The following results assume input circuit \circuit computed from an arbitrary $\raPlus$ query $\query$ and arbitrary \abbrBIDB $\pdb$. We refer to \circuit as a \abbrBIDB circuit.
%\AH{Verify that the proof for \Cref{lem:approx-alg} doesn't rely on properties of $\raPlus$ or \abbrBIDB.}
%\begin{Theorem}\label{lem:approx-alg}
%Let \circuit be an arbitrary \abbrBIDB circuit %for a UCQ over \bi
%and define $\poly(\vct{X})=\polyf(\circuit)$ and let $k=\degree(\circuit)$.
%Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in time
%{\small
@ -113,11 +114,11 @@ and computing the average of $\vari{Y}$ gives us our final estimate. \onepass is
% We next present a few corollaries of \Cref{lem:approx-alg}.
\begin{Theorem}
\label{cor:approx-algo-const-p}
Let \circuit be an arbitrary \abbrBIDB circuit %for a UCQ over \bi
and define $\poly(\vct{X})=\polyf(\circuit)$ and let $k=\degree(\circuit)$.
%Let $\poly(\vct{X})$ be as in \Cref{lem:approx-alg} and
Let $\gamma=\gamma(\circuit)$. Further, assume that $\prob_i\ge \prob_0$ for all $i\in[\numvar]$. Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$
satisfying
\begin{equation}
\label{eq:approx-algo-bound-main}
\probOf\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error' \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf
@ -130,7 +131,7 @@ O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot k\cdot \log{k} \
In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants, then the above runtime simplifies to $O_k\left(\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)$.
\end{Theorem}

The restriction on $\gamma$ is satisfied by any \ti (where $\gamma=0$) as well as by all three queries of the PDBench \bi benchmark (see \Cref{app:subsec:experiment} for experimental results).

We briefly connect the runtime in \Cref{eq:approx-algo-runtime} to the algorithm outlined earlier (where we ignore the dependence on $\multc{\cdot}{\cdot}$, which is needed to handle the cost of arithmetic operations over integers). The $\size(\circuit)$ term comes from the time taken to run \onepass once (\onepass essentially computes $\abs{\circuit}(1,\ldots, 1)$ using the natural circuit evaluation algorithm on $\circuit$). We make $\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}$ many calls to \sampmon (each of which essentially traces $O(k)$ random sink-to-source paths in $\circuit$, all of which by definition have length at most $\depth(\circuit)$).
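
To get a feel for the number of calls to \sampmon, consider the following hypothetical values, chosen only for illustration: $\error'=0.1$, $\conf=0.05$, $\gamma=0$, $\prob_0=\frac{1}{2}$, and $k=2$. Ignoring constant factors and taking the logarithm to be natural (both are assumptions of this example), we get
\[
\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}
= \frac{\ln 20}{0.01\cdot 1\cdot \frac{1}{16}}
\approx \frac{2.996}{0.000625}
\approx 4793,
\]
that is, a few thousand samples. Note that this count does not depend on $\size(\circuit)$.
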
@ -154,7 +155,7 @@ Finally, note that by \Cref{prop:circuit-depth} and \Cref{lem:circ-model-runtime
\label{cor:approx-algo-punchline}
Let $\query$ be an $\raPlus$ query and $\pdb$ be an \abbrBIDB where $p_0>0$ and $\gamma<1$ (with $p_0,\gamma$ as in \Cref{cor:approx-algo-const-p}) are absolute constants. Let $\poly(\vct{X})=\apolyqdt$ for any result tuple $\tup$ with $\deg(\poly)=k$. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time $O_{k,|Q|,\error',\conf}\inparen{\qruntime{\query, \dbbase}}$ (given $\query,\dbbase$ and $p_i$ for each $i\in [n]$ that define $\pd$).
%Let $\poly(\vct{X})$ be a \abbrBIDB-lineage polynomial corresponding to an \abbrBIDB circuit $\circuit$ that satisfies the specific conditions in \Cref{lem:val-ub}. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time
% $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$. % for the case when $\circuit$ satisfies the specific conditions in \Cref{lem:val-ub}.
\end{Corollary}
If we want to approximate the expected multiplicities of all $Z=O(n^k)$ result tuples $\tup$ simultaneously, we simply apply the above result with $\conf$ replaced by $\frac \conf Z$. Note that this increases the runtime by only a logarithmic factor.
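
A short sketch of the reasoning behind this remark (a standard union-bound argument, spelled out here only as an illustration): if each per-tuple estimate fails with probability at most $\frac{\conf}{Z}$, then the probability that any of the $Z$ estimates fails is at most $Z\cdot\frac{\conf}{Z}=\conf$. Since the runtime depends on the confidence parameter through $\log{\frac{1}{\conf}}$, this factor becomes
\[
\log{\frac{Z}{\conf}} = \log{\frac{1}{\conf}} + \log{Z} = \log{\frac{1}{\conf}} + O(k\log{n}).
\]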