master
Boris Glavic 2018-10-02 12:48:30 -05:00
parent 651f2f95ad
commit a6c66f5978
2 changed files with 6 additions and 6 deletions

@ -11,7 +11,7 @@ However, the heuristic steps in a data curation workflow frequently admit altern
%
\abbrUADBs{} have the potential for great practical impact since they combine the practicality and performance of \termBGQP with the rigor of certain answers.
Our proposed techniques can significantly improve many real world use cases which currently make decisions based on uncertain data with severe negative impact.
In addition to the potential of the proposed research itself, this grant will support three Ph.D. students and s one postdoctoral researcher.
In addition to the potential of the proposed research itself, this grant will support three Ph.D. students and one postdoctoral researcher.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\mypar{Integration of Research and Education}

@ -2,7 +2,7 @@
\section{Research Thrust II - Design and Implementation of \sysname}
\label{sec:research-thrust-ii}
Thrust II addresses the challenge of translating principles developed in Thrust I into practice through a prototype \abbrUADB called \textit{\sysname} (\textit{Uncertainty for You}).
Thrust II addresses the challenge of translating the principles developed in Thrust I into practice through a prototype \abbrUADB called \textit{\sysname} (\textit{Uncertainty for You}).
We will address logistical challenges involved in managing labeled data and then realize \sysname in three stages.
In stage 1, we will realize a purely rewrite-based implementation based on our preliminary work~\cite{FH18} for incomplete bag semantics databases.
Then in stage 2 we will explore how augmenting the database through new data structures, algorithms, and cost-based optimization strategies can improve performance and accuracy.
@ -62,7 +62,7 @@ We will develop techniques for rewriting queries with uncertainty annotations, e
\end{sectionsummary}
In \cite{FH18} and \cite{DBLP:journals/corr/NandiYKGFLG16}, we developed query rewriting middleware that implemented a bag semantics \abbrUADB with tuple-level uncertainty using a classical relational database.
In \cite{FH18} and \cite{DBLP:journals/corr/NandiYKGFLG16}, we developed a query rewriting middleware that implemented a bag semantics \abbrUADB with tuple-level uncertainty using a classical relational database.
As the first stage of realizing \sysname{}, we will extend this middleware with support for: (1) Attribute-level uncertainty, (2) Aggregation, and if time permits (3) semirings other than bags.
The key challenge we will focus on at this stage is ensuring performance, while retaining backwards compatibility by not requiring any fundamental changes to the underlying database.
@ -129,7 +129,7 @@ For such attributes, we will explore gracefully degrading to simpler boolean (i.
\mypar{Supporting Aggregation}
Aggregate queries introduce an additional layer of complexity.
For example, take the query: \lstinline{SELECT SUM(guests) AS total FROM PARTICIPANTS}.
The aggregation function result (the value of \lstinline{total}) can be uncertain in multiple ways: (1) if the existence of at least one input tuple is uncertain, then the aggregation function result cannot be the same in worlds that include this tuple and worlds that do not include this tuple (unless that \lstinline{guests} value of this tuple is $0$ in every possible world and (2) even if the existence of all input tuples is certain, the aggregation function result may still be uncertain if the \lstinline!guests! attribute of one of these tuples differs across possible worlds (is uncertain).
The aggregation function result (the value of \lstinline{total}) can be uncertain in multiple ways: (1) if the existence of at least one input tuple is uncertain, then the aggregation function result cannot be the same in worlds that include this tuple and worlds that do not include this tuple (unless the \lstinline{guests} value of this tuple is $0$ in every possible world); and (2) even if the existence of all input tuples is certain, the aggregation function result may still be uncertain if the \lstinline!guests! attribute of one of these tuples differs across possible worlds (is uncertain).
% Uncertainty in the result attribute (e.g., the value of \lstinline{total}) requires one of the source rows to be different across possible worlds.
% \lstinline{total} being uncertain requires that (1) there must exist a certain tuple in \lstinline{PARTICIPANTS} with an uncertain value of \lstinline{guests}; or
% (2) there must exist a possible, but not certain, tuple in \lstinline{PARTICIPANTS}.
@ -175,11 +175,11 @@ Conversely, PI Glavic's GProM~\cite{DBLP:journals/debu/ArabFGLNZ17} encodes prov
\subsection{II-c: Database Engine Specializations for \abbrUADBs}
\begin{sectionsummary}
We will identify, realize, and evaluate new data-structures, algorithms, optimizer passes, and other internal improvements that will make \abbrUADBs{} more efficient and expressive than they could be made through query rewriting alone.
We will identify, realize, and evaluate new data-structures, algorithms, optimization techniques, and other internal improvements that will make \abbrUADBs{} more efficient and expressive than they could be made through query rewriting alone.
\end{sectionsummary}
Our first goal aims at supporting uncertainty-aware data management within existing relational databases.
Our second goal aims at supporting uncertainty-aware data management within existing relational databases.
However, ensuring full backwards-compatibility precludes many opportunities for optimization.
The third goal of this thrust is to consider architectural changes including new algorithms, data structures, and optimization techniques for improving \abbrUADB{} performance and utility.