%!TEX root=./main.tex
\section{Conclusions and Future Work}\label{sec:concl-future-work}
We have studied the problem of calculating the expectation of query polynomials over BIDBs. %random integer variables.
This problem has a practical application in probabilistic databases over multisets, where it corresponds to calculating the expected multiplicity of a query result tuple.
It has been studied extensively for sets (lineage formulas), but the bag setting has not received much attention.
While the expectation of a polynomial in SOP form can be calculated in time linear in the size of the polynomial, the problem is \sharpwonehard for factorized polynomials.
We have proven this claim through a reduction from the problem of counting $k$-matchings.
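As a minimal illustration of the gap between the two representations, assume a TIDB whose variables $X_1, X_2, X_3$ are independent and $\{0,1\}$-valued with $P(X_i = 1) = p_i$ (the polynomial, the symbols $p_i$, and the notation $\mathbb{E}[\cdot]$ for expectation are chosen here purely for illustration). Expanding the factorized polynomial $(X_1 + X_2)(X_1 + X_3)$ into SOP form and applying linearity of expectation monomial by monomial gives
\[
\mathbb{E}[X_1^2 + X_1 X_3 + X_1 X_2 + X_2 X_3] = p_1 + p_1 p_3 + p_1 p_2 + p_2 p_3,
\]
where $\mathbb{E}[X_1^2] = p_1$ because $X_1^2 = X_1$ for $\{0,1\}$-valued variables.
In contrast, the product $\mathbb{E}[X_1 + X_2] \cdot \mathbb{E}[X_1 + X_3]$ would contribute $p_1^2$ instead of $p_1$, so the expectation cannot simply be pushed through the factors, and the SOP expansion can be considerably larger than the factorized form.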
When only considering polynomials for result tuples of UCQs over TIDBs and BIDBs (under the assumption that there are few cancellations), we prove that it is still possible to approximate the expectation of a polynomial in linear time.
Interesting directions for future work include the development of a dichotomy for queries over bag PDBs and approximations for data models beyond those we consider in this paper.
% Furthermore, it would be interesting to see whether our approximation algorithm can be extended to support queries with negations, perhaps using circuits with monus as a representation system.
\BG{I am not sure what interesting future work is here. Some wild guesses, if anybody agrees I'll try to flesh them out:
\textbullet{More queries: what happens with negation? Can circuits with monus be used?}
\textbullet{More databases: can we push beyond BIDBs? E.g., C-tables / aggregate semimodules or just TIDBs where each input tuple is a random variable over $\mathbb{N}$?}
\textbullet{Other results: can we extend the work to approximate $P(R(t) = n)$?}
}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: