Done with my pass
parent
7351a432ad
commit
d0b2c75d63
|
@ -2,6 +2,23 @@
|
|||
|
||||
%\input{app_approx-alg-pseudo-code}
|
||||
|
||||
The following results assume an input circuit \circuit computed from an arbitrary $\raPlus$ query $\query$ and an arbitrary \abbrBIDB $\pdb$. We refer to \circuit as a \abbrBIDB circuit.
|
||||
\AH{Verify that the proof for \Cref{lem:approx-alg} doesn't rely on properties of $\raPlus$ or \abbrBIDB.}
|
||||
\begin{Theorem}\label{lem:approx-alg}
|
||||
Let \circuit be an arbitrary \abbrBIDB circuit %for a UCQ over \bi
|
||||
and define $\poly(\vct{X})=\polyf(\circuit)$ and let $k=\degree(\circuit)$.
|
||||
Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in time
|
||||
{\small
|
||||
\[O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\circuit}^2(1,\ldots, 1)\cdot k\cdot \log{k} \cdot \depth(\circuit)}{\inparen{\error}^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)\]
|
||||
}
|
||||
such that
|
||||
\begin{equation}
|
||||
\label{eq:approx-algo-bound}
|
||||
\probOf\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf.
|
||||
\end{equation}
|
||||
\end{Theorem}
|
||||
\AR{\textbf{Aaron:}Just copied over from S4. The above text might need smoothening into the appendix.}
|
||||
|
||||
\subsection{Proof of Theorem \ref{lem:approx-alg}}\label{sec:proof-lem-approx-alg}
|
||||
\input{app_approx-alg-pseudo-code}
|
||||
In order to prove \Cref{lem:approx-alg}, we will need to argue the correctness of \approxq, which relies on the correctness of auxiliary algorithms \onepass and \sampmon.
|
||||
|
|
132
approx_alg.tex
|
@ -4,13 +4,13 @@
|
|||
\section{$1 \pm \epsilon$ Approximation Algorithm}\label{sec:algo}
|
||||
|
||||
In \Cref{sec:hard}, we showed that the answer to \Cref{prob:intro-stmt} is no.
|
||||
With this result, we now design an approximation algorithm for our problem that runs in $\bigO{\abs{\circuit}}$.\footnote{For a very broad class of circuits: please see the discussion after \Cref{lem:val-ub} for more.}
|
||||
The following approximation algorithm applies to \abbrBIDB lineage polynomials (over $\raPlus$ queries), though our bounds are more meaningful for a non-trivial subclass of queries over \bis that contains all queries on \tis, as well as the queries of the PDBench benchmark~\cite{pdbench}. As before, all proofs and pseudocode can be found in \Cref{sec:proofs-approx-alg}.
|
||||
With this result, we now design an approximation algorithm for our problem that runs in $\bigO{\abs{\circuit}}$ for a very broad class of circuits (see the discussion after \Cref{lem:val-ub} for more).
|
||||
The following approximation algorithm applies to \abbrBIDB lineage polynomials (over $\raPlus$ queries), though our bounds are more meaningful for a non-trivial subclass of queries over \bis that contains all queries on \tis, as well as the queries of the PDBench benchmark~\cite{pdbench}. All proofs and pseudocode can be found in \Cref{sec:proofs-approx-alg}.
|
||||
%it is then desirable to have an algorithm to approximate the multiplicity in linear time, which is what we describe next.
|
||||
|
||||
\subsection{Preliminaries and some more notation}
|
||||
|
||||
We now introduce useful definitions and notation related to circuits and polynomials.
|
||||
We now introduce definitions and notation related to circuits and polynomials that we will need to state our upper bound results.
|
||||
|
||||
\begin{Definition}[$\expansion{\circuit}$]\label{def:expand-circuit}
|
||||
For a circuit $\circuit$, we define $\expansion{\circuit}$ as a list of tuples $(\monom, \coef)$, where $\monom$ is a set of variables and $\coef \in \domN$.
|
||||
|
@ -49,29 +49,12 @@ $\degree(\circuit)$ is defined recursively as follows:
|
|||
\]
|
||||
\end{Definition}
|
||||
|
||||
Finally, we use the following notation for the complexity of multiplying integers:
|
||||
Next, we use the following notation for the complexity of multiplying integers:
|
||||
\begin{Definition}[$\multc{\cdot}{\cdot}$]\footnote{We note that when doing arithmetic operations on the RAM model for input of size $N$, we have that $\multc{O(\log{N})}{O(\log{N})}=O(1)$. More generally we have $\multc{N}{O(\log{N})}=O(N\log{N}\log\log{N})$.}
|
||||
In a RAM model of word size of $W$-bits, $\multc{M}{W}$ denotes the complexity of multiplying two integers represented with $M$-bits. (We will assume that for input of size $N$, $W=O(\log{N})$.
|
||||
In a RAM model of word size of $W$-bits, $\multc{M}{W}$ denotes the complexity of multiplying two integers represented with $M$-bits. (We will assume that for input of size $N$, $W=O(\log{N})$.)
|
||||
\end{Definition}
|
||||
|
||||
\subsection{Our main result}\label{sec:algo:sub:main-result}
|
||||
The following results assume input circuit \circuit computed from an arbitrary $\raPlus$ query $\query$ and arbitrary \abbrBIDB $\pdb$. We refer to \circuit as a \abbrBIDB circuit.
|
||||
\AH{Verify that the proof for \Cref{lem:approx-alg} doesn't rely on properties of $\raPlus$ or \abbrBIDB.}
|
||||
\begin{Theorem}\label{lem:approx-alg}
|
||||
Let \circuit be an arbitrary \abbrBIDB circuit %for a UCQ over \bi
|
||||
and define $\poly(\vct{X})=\polyf(\circuit)$ and let $k=\degree(\circuit)$.
|
||||
Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in time
|
||||
{\small
|
||||
\[O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\circuit}^2(1,\ldots, 1)\cdot k\cdot \log{k} \cdot \depth(\circuit))}{\inparen{\error}^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)\]
|
||||
}
|
||||
such that
|
||||
\begin{equation}
|
||||
\label{eq:approx-algo-bound}
|
||||
\probOf\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf.
|
||||
\end{equation}
|
||||
\end{Theorem}
|
||||
|
||||
To get linear runtime results from \Cref{lem:approx-alg}, we will need to define another parameter modeling the (weighted) number of monomials in %$\poly\inparen{\vct{X}}$
|
||||
Finally, to get linear runtime results from \Cref{lem:approx-alg}, we will need to define another parameter modeling the (weighted) number of monomials in %$\poly\inparen{\vct{X}}$
|
||||
$\expansion{\circuit}$
|
||||
that need to be `canceled' when monomials with dependent variables are removed (\Cref{def:reduced-bi-poly}). %def:hen it is modded with $\mathcal{B}$ (\Cref{def:mod-set-polys}).
|
||||
Let $\isInd{\cdot}$ be a boolean function returning true if monomial $\encMon$ is composed of independent variables and false otherwise; further, let $\indicator{\theta}$ also be a boolean function returning true if $\theta$ evaluates to true.
|
||||
|
@ -82,36 +65,14 @@ Given a \abbrBIDB circuit $\circuit$ define
|
|||
\[\gamma(\circuit)=\frac{\sum_{(\monom, \coef)\in \expansion{\circuit}} \abs{\coef}\cdot \indicator{\neg\isInd{\encMon}} }%\encMon\mod{\mathcal{B}}\equiv 0}}
|
||||
{\abs{\circuit}(1,\ldots, 1)}.\]
|
||||
\end{Definition}
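As a small illustration of the definition of $\gamma$, consider the hypothetical circuit $\circuit = (X_1+X_2)\cdot(X_1+X_2)$. This is only a sketch under assumptions not stated in this excerpt: $X_1$ and $X_2$ are taken to belong to the same block, so a monomial containing both is not independent, and $\expansion{\circuit}$ is taken to list one tuple per term of the fully multiplied-out circuit.

```latex
\begin{align*}
\expansion{\circuit} &= \left[(\{X_1\},1),\ (\{X_1,X_2\},1),\ (\{X_1,X_2\},1),\ (\{X_2\},1)\right],\\
\abs{\circuit}(1,\ldots,1) &= (1+1)\cdot(1+1) = 4,\\
\gamma(\circuit) &= \frac{1+1}{4} = \frac{1}{2}.
\end{align*}
```

In this example half of the (weighted) monomials are canceled. For a \ti no monomial is canceled, which gives $\gamma=0$.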
|
||||
|
||||
\noindent We next present a few corollaries of \Cref{lem:approx-alg}.
|
||||
\begin{Corollary}
|
||||
\label{cor:approx-algo-const-p}
|
||||
Let $\poly(\vct{X})$ be as in \Cref{lem:approx-alg} and let $\gamma=\gamma(\circuit)$ for \abbrBIDB circuit \circuit. Further let it be the case that $\prob_i\ge \prob_0$ for all $i\in[\numvar]$. Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ satisfying \Cref{eq:approx-algo-bound} can be computed in time
|
||||
\[O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot k\cdot \log{k} \cdot \depth(\circuit))}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)\]
|
||||
In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the above runtime simplifies to $O_k\left(\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)$.
|
||||
\end{Corollary}
|
||||
|
||||
The restriction on $\gamma$ is satisfied by any \ti (where $\gamma=0$) as well as for all three queries of the PDBench \bi benchmark (see \Cref{app:subsec:experiment} for experimental results).
|
||||
|
||||
Finally, we address the $\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}$ term in the runtime. %In \Cref{susec:proof-val-up}, we show the following:
|
||||
\begin{Lemma}
|
||||
\label{lem:val-ub}
|
||||
For any \abbrBIDB circuit $\circuit$ with $\degree(\circuit)=k$, we have
|
||||
$\abs{\circuit}(1,\ldots, 1)\le 2^{2^k\cdot \size(\circuit)}.$
|
||||
Further, under either of the following conditions:
|
||||
\begin{enumerate}
|
||||
\item $\circuit$ is a tree,
|
||||
\item $\circuit$ encodes the run of the algorithm on a FAQ~\cite{DBLP:conf/pods/KhamisNR16}\AH{AJAR citation.} query,
|
||||
\end{enumerate}
|
||||
we have $\abs{\circuit}(1,\ldots, 1)\le \size(\circuit)^{O(k)}.$
|
||||
\end{Lemma}
|
||||
|
||||
Note that the above implies that, under the assumption from \Cref{cor:approx-algo-const-p} that $\prob_0>0$ and $\gamma<1$ are absolute constants, the runtime there simplifies to $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)^2\cdot \log{\frac{1}{\conf}}\right)$ for general circuits $\circuit$ and to $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$ for the case when $\circuit$ satisfies the specific conditions in \Cref{lem:val-ub}. In \Cref{app:proof-lem-val-ub} we argue that these conditions are very general and encompass many interesting scenarios, including query evaluation under \raPlus or FAQ.
|
||||
\AH{AJAR reference.}
|
||||
\subsection{Our main result}\label{sec:algo:sub:main-result}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Approximating $\rpoly$}
|
||||
We prove \Cref{lem:approx-alg} by developing an approximation algorithm (\approxq pseudo code in \Cref{sec:proof-lem-approx-alg}) with the desired runtime. This algorithm is based on the following observation.
|
||||
\mypar{Algorithm Idea}
|
||||
%We prove \Cref{lem:approx-alg} by developing an
|
||||
Our approximation algorithm (\approxq pseudo code in \Cref{sec:proof-lem-approx-alg})
|
||||
%with the desired runtime. This algorithm
|
||||
is based on the following observation.
|
||||
% The algorithm (\approxq detailed in \Cref{alg:mon-sam}) to prove \Cref{lem:approx-alg} follows from the following observation.
|
||||
Given a lineage polynomial $\poly(\vct{X})=\polyf(\circuit)$ for circuit \circuit over $\bi$, we have: % can exactly represent $\rpoly(\vct{X})$ as follows:
|
||||
|
||||
|
@ -126,9 +87,76 @@ Given a lineage polynomial $\poly(\vct{X})=\polyf(\circuit)$ for circuit \circui
|
|||
|
||||
Given the above, the algorithm is a sampling based algorithm for the above sum: we sample (via \sampmon) $(\monom,\coef)\in \expansion{\circuit}$ with probability proportional
|
||||
to $\abs{\coef}$ and compute $\vari{Y}=\indicator{\isInd{\encMon}}
|
||||
\cdot \prod_{X_i\in \monom} p_i$. Taking $\ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$ samples and computing the average of $\vari{Y}$ gives us our final estimate. \onepass is used to compute the sampling probabilities needed in \sampmon (details are in \Cref{sec:proofs-approx-alg}).
|
||||
\cdot \prod_{X_i\in \monom} p_i$. %Taking $\ceil{\frac{2 \log{\frac{2}{\conf}}}{\error^2}}$ samples
|
||||
Repeating the sampling an appropriate number of times
|
||||
and computing the average of $\vari{Y}$ gives us our final estimate. \onepass is used to compute the sampling probabilities needed in \sampmon (details are in \Cref{sec:proofs-approx-alg}).
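To see why averaging $\vari{Y}$ estimates the sum, consider a hypothetical circuit $\circuit=(X_1+X_2)\cdot(X_1+X_2)$ whose two variables lie in the same block. The following is a sketch under assumptions made for illustration only: all coefficients are non-negative, and the final estimate rescales the average by $\abs{\circuit}(1,\ldots,1)$, a step not spelled out in this excerpt. Each of the four terms of the expansion ($X_1X_1$, $X_1X_2$, $X_2X_1$, $X_2X_2$) is sampled with probability $\frac{1}{4}$, and the two terms containing both variables contribute $0$:

```latex
\begin{align*}
\mathbb{E}[\vari{Y}] &= \tfrac{1}{4}\cdot p_1 + \tfrac{1}{4}\cdot 0 + \tfrac{1}{4}\cdot 0 + \tfrac{1}{4}\cdot p_2 = \frac{p_1+p_2}{4},\\
\abs{\circuit}(1,\ldots,1)\cdot\mathbb{E}[\vari{Y}] &= 4\cdot\frac{p_1+p_2}{4} = p_1+p_2 .
\end{align*}
```

The result $p_1+p_2$ matches the value of $\rpoly(p_1,p_2)$ for this circuit, obtained by dropping exponents and removing the dependent monomial $X_1X_2$.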
|
||||
%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
||||
%The following results assume input circuit \circuit computed from an arbitrary $\raPlus$ query $\query$ and arbitrary \abbrBIDB $\pdb$. We refer to \circuit as a \abbrBIDB circuit.
|
||||
%\AH{Verify that the proof for \Cref{lem:approx-alg} doesn't rely on properties of $\raPlus$ or \abbrBIDB.}
|
||||
%\begin{Theorem}\label{lem:approx-alg}
|
||||
%Let \circuit be an arbitrary \abbrBIDB circuit %for a UCQ over \bi
|
||||
%and define $\poly(\vct{X})=\polyf(\circuit)$ and let $k=\degree(\circuit)$.
|
||||
%Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$ can be computed in time
|
||||
%{\small
|
||||
%\[O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot \abs{\circuit}^2(1,\ldots, 1)\cdot k\cdot \log{k} \cdot \depth(\circuit))}{\inparen{\error}^2\cdot\rpoly^2(\prob_1,\ldots, \prob_\numvar)}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)\]
|
||||
%}
|
||||
%such that
|
||||
%\begin{equation}
|
||||
%\label{eq:approx-algo-bound}
|
||||
%\probOf\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf.
|
||||
%\end{equation}
|
||||
%\end{Theorem}
|
||||
|
||||
|
||||
\mypar{Runtime analysis} We can argue the following runtime for the algorithm outlined above:
|
||||
% We next present a few corollaries of \Cref{lem:approx-alg}.
|
||||
\begin{Theorem}
|
||||
\label{cor:approx-algo-const-p}
|
||||
Let \circuit be an arbitrary \abbrBIDB circuit %for a UCQ over \bi
|
||||
and define $\poly(\vct{X})=\polyf(\circuit)$ and let $k=\degree(\circuit)$.
|
||||
%Let $\poly(\vct{X})$ be as in \Cref{lem:approx-alg} and
|
||||
Let $\gamma=\gamma(\circuit)$ for \abbrBIDB circuit \circuit. Further let it be the case that $\prob_i\ge \prob_0$ for all $i\in[\numvar]$. Then an estimate $\mathcal{E}$ of $\rpoly(\prob_1,\ldots, \prob_\numvar)$
|
||||
satisfying
|
||||
\begin{equation}
|
||||
\label{eq:approx-algo-bound-main}
|
||||
\probOf\left(\left|\mathcal{E} - \rpoly(\prob_1,\dots,\prob_\numvar)\right|> \error' \cdot \rpoly(\prob_1,\dots,\prob_\numvar)\right) \leq \conf
|
||||
\end{equation}
|
||||
can be computed in time
|
||||
\begin{equation}
|
||||
\label{eq:approx-algo-runtime}
|
||||
O\left(\left(\size(\circuit) + \frac{\log{\frac{1}{\conf}}\cdot k\cdot \log{k} \cdot \depth(\circuit)}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right).
|
||||
\end{equation}
|
||||
In particular, if $\prob_0>0$ and $\gamma<1$ are absolute constants then the above runtime simplifies to $O_k\left(\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)\cdot\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}\right)$.
|
||||
\end{Theorem}
|
||||
|
||||
The restriction on $\gamma$ is satisfied by any \ti (where $\gamma=0$) as well as for all three queries of the PDBench \bi benchmark (see \Cref{app:subsec:experiment} for experimental results).
|
||||
|
||||
We briefly connect the runtime in \Cref{eq:approx-algo-runtime} to the algorithm outlined earlier (where we ignore the dependence on $\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}$). The $\size(\circuit)$ term comes from the time taken to run \onepass once (\onepass essentially computes $\abs{\circuit}(1,\ldots, 1)$ using the natural circuit evaluation algorithm on $\circuit$). We make $\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}$ many calls to \sampmon, each of which essentially traces $O(k)$ random sink-to-source paths in $\circuit$, all of which by definition have length at most $\depth(\circuit)$.
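For a rough sense of scale, the number of \sampmon calls can be instantiated with illustrative values (chosen here, not taken from the paper): $\error'=0.1$, $\conf=0.01$, $\gamma=0$, $\prob_0=\frac{1}{2}$, and $k=2$. The sketch reads $\log$ as the natural logarithm and ignores the constant hidden by the $O(\cdot)$.

```latex
\[
\frac{\log{\frac{1}{\conf}}}{\inparen{\error'}^2\cdot(1-\gamma)^2\cdot \prob_0^{2k}}
= \frac{\ln 100}{(0.1)^2\cdot 1^2\cdot \left(\frac{1}{2}\right)^{4}}
= 1600\cdot\ln 100 \approx 7368 .
\]
```

The count does not depend on $\size(\circuit)$, but it grows as $\prob_0^{-2k}$, which is why the simplified bound treats $\prob_0$ and $k$ as constants.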
|
||||
|
||||
Finally, we address the $\multc{\log\left(\abs{\circuit}(1,\ldots, 1)\right)}{\log\left(\size(\circuit)\right)}$ term in the runtime. %In \Cref{susec:proof-val-up}, we show the following:
|
||||
\begin{Lemma}
|
||||
\label{lem:val-ub}
|
||||
For any \abbrBIDB circuit $\circuit$ with $\degree(\circuit)=k$, we have
|
||||
$\abs{\circuit}(1,\ldots, 1)\le 2^{2^k\cdot \size(\circuit)}.$
|
||||
Further, under either of the following conditions:
|
||||
\begin{enumerate}
|
||||
\item $\circuit$ is a tree,
|
||||
\item $\circuit$ encodes the run of the algorithm on a FAQ~\cite{DBLP:conf/pods/KhamisNR16}/AJAR~\cite{ajar} query,
|
||||
\end{enumerate}
|
||||
we have $\abs{\circuit}(1,\ldots, 1)\le \size(\circuit)^{O(k)}.$
|
||||
\end{Lemma}
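The second bound is what removes the multiplication term from the runtime. A short derivation, assuming $k$ is treated as a constant and using the RAM-model fact from the footnote to the definition of $\multc{\cdot}{\cdot}$ with $N=\size(\circuit)$:

```latex
\begin{align*}
\abs{\circuit}(1,\ldots,1)\le \size(\circuit)^{O(k)}
&\implies \log\left(\abs{\circuit}(1,\ldots,1)\right) = O\left(k\cdot\log\left(\size(\circuit)\right)\right)\\
&\implies \multc{\log\left(\abs{\circuit}(1,\ldots,1)\right)}{\log\left(\size(\circuit)\right)} = \multc{O(\log N)}{O(\log N)} = O(1).
\end{align*}
```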
|
||||
|
||||
Note that the above implies that, under the assumption from \Cref{cor:approx-algo-const-p} that $\prob_0>0$ and $\gamma<1$ are absolute constants, the runtime there simplifies to $O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)^2\cdot \log{\frac{1}{\conf}}\right)$ for general circuits $\circuit$.
|
||||
\begin{Corollary}
|
||||
Let $\poly(\vct{X})$ be a \abbrBIDB-lineage polynomial corresponding to an \abbrBIDB circuit $\circuit$ that satisfies the specific conditions in \Cref{lem:val-ub}. Then one can compute an approximation satisfying \Cref{eq:approx-algo-bound-main} in time
|
||||
$O_k\left(\frac 1{\inparen{\error'}^2}\cdot\size(\circuit)\cdot \log{\frac{1}{\conf}}\right)$. % for the case when $\circuit$ satisfies the specific conditions in \Cref{lem:val-ub}.
|
||||
\end{Corollary}
|
||||
\AR{The above Corollary needs to be improved/generalized. This is a place-holder for now.}
|
||||
In \Cref{app:proof-lem-val-ub} we argue that these conditions are very general and encompass many interesting scenarios, including query evaluation under FAQ/AJAR setup.
|
||||
%\AH{AJAR reference.}
|
||||
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "main"
|
||||
|
|
|
@ -1,17 +1,18 @@
|
|||
%!TEX root=./main.tex
|
||||
\section{Conclusions and Future Work}\label{sec:concl-future-work}
|
||||
|
||||
We have studied the problem of calculating the expectation of lineage polynomials over BIDBs. %random integer variables.
|
||||
This problem has a practical application in probabilistic databases over multisets, where it corresponds to calculating the expected multiplicity of a query result tuple.
|
||||
We have studied the problem of calculating the expected multiplicity of a query result tuple, %expectation of lineage polynomials over BIDBs. %random integer variables.
|
||||
a problem that has a practical application in probabilistic databases over multisets. %, where it corresponds to calculating the expected multiplicity of a query result tuple.
|
||||
% It has been studied extensively for sets (lineage formulas), but the bag settings has not received much attention.
|
||||
While the expectation of a polynomial can be calculated in linear time for % in the size of
|
||||
polynomials % that are
|
||||
in SOP form, the problem is \sharpwonehard for factorized polynomials (proven through a reduction from the problem of counting k-matchings).
|
||||
%While the expectation of a polynomial can be calculated in linear time for % in the size of
|
||||
% polynomials % that are
|
||||
%in SOP form, the problem is \sharpwonehard for factorized polynomials (proven through a reduction from the problem of counting k-matchings).
|
||||
%We have proven this claim through a reduction from the problem of counting k-matchings.
|
||||
We show that, under various parameterized complexity hardness results/conjectures, computing the expected multiplicities exactly is not possible in time linear in the corresponding deterministic query processing time.
|
||||
We prove that it is possible to approximate the expectation of a lineage polynomial in linear time
|
||||
% When only considering polynomials for result tuples of
|
||||
for UCQs over TIDBs and BIDBs (assuming that there are few cancellations).
|
||||
Interesting directions for future work include development of a dichotomy for bag \abbrPDB\xplural. While we handle higher moments in \Cref{app:sec-cicuits}, more general approximations are an interesting area for exploration, including those for more general data models. % beyond what we consider in this paper.
|
||||
in the deterministic query processing over TIDBs and BIDBs (assuming that there are few cancellations).
|
||||
Interesting directions for future work include the development of a dichotomy for bag \abbrPDB\xplural. While we can handle higher moments (this follows fairly easily from our existing results; see \Cref{app:sec-cicuits}), more general approximations are an interesting area for exploration, including those for more general data models. % beyond what we consider in this paper.
|
||||
% Furthermore, it would be interesting to see whether our approximation algorithm can be extended to support queries with negations, perhaps using circuits with monus as a representation system.
|
||||
|
||||
% \BG{I am not sure what interesting future work is here. Some wild guesses, if anybody agrees I'll try to flesh them out:
|
||||
|
|
|
@ -17,14 +17,14 @@ Dalvi et al.~\cite{DS12} and Olteanu et al.~\cite{FO16} proved dichotomies for U
|
|||
% Olteanu et al.~\cite{FO16} presented dichotomies for two classes of queries with negation.
|
||||
% R\'e et al~\cite{RS09b} present a trichotomy for HAVING queries.
|
||||
Amarilli et al. investigated tractable classes of databases for more complex queries~\cite{AB15}. %,AB15c
|
||||
Another line of work, studies which structural properties of lineage formulas lead to tractable cases~\cite{kenig-13-nclexpdc,roy-11-f,sen-10-ronfqevpd}.
|
||||
Another line of work studies which structural properties of lineage formulas lead to tractable cases~\cite{kenig-13-nclexpdc,roy-11-f,sen-10-ronfqevpd}.
|
||||
In this paper we focus on intensional query evaluation with polynomials.
|
||||
|
||||
Many data models have been proposed for encoding PDBs more compactly than as sets of possible worlds.
|
||||
These include tuple-independent databases~\cite{VS17} (\tis), block-independent databases (\bis)~\cite{RS07}, and \emph{PC-tables}~\cite{GT06}.
|
||||
%
|
||||
Fink et al.~\cite{FH12} study aggregate queries over a probabilistic version of the extension of K-relations for aggregate queries proposed in~\cite{AD11d} (\emph{pvc-tables}) that supports bags, and has runtime complexity linear in the size of the lineage.
|
||||
However, this lineage is encoded as a tree; the size (and thus the runtime) are still superlinear in $\qruntime(\query, \dbbase)$.
|
||||
However, this lineage is encoded as a tree; the size (and thus the runtime) are still superlinear in $\qruntime{\query, \dbbase}$.
|
||||
The runtime bound is also limited to a specific class of (hierarchical) queries, suggesting the possibility of a generalization of \cite{DS12}'s dichotomy result to \abbrBPDB\xplural.
|
||||
% Probabilities are computed using a decomposition approach~\cite{DBLP:conf/icde/OlteanuHK10}.
|
||||
% over the symbolic expressions that are used as tuple annotations and values in pvc-tables.
|
||||
|
|