%!TEX root = ../main.tex
\begin{abstract}
The database community has developed a plethora of tools and techniques for data curation and exploration, ranging from declarative languages to specialized techniques for data repair.
Yet, there is currently no consensus on how to best expose these powerful tools to an analyst in a simple, intuitive, and above all, flexible way.
Thus, analysts continue to rely on tools such as spreadsheets, imperative languages, and notebook-style programming environments like Jupyter for data curation.
In this work, we explore the %intersection
integration of spreadsheets, notebooks, and relational databases.
We focus on a key advantage that both spreadsheets and imperative notebook environments have over classical relational databases: ease of exception.
By relying on set-at-a-time operations, relational databases sacrifice the ability to easily define singleton operations: exceptions to a normal data processing workflow that affect query processing for a fixed set of explicitly targeted records.
In comparison, a spreadsheet user can easily change the formula for just one cell, while a notebook user can add an imperative operation to her notebook that alters an output ``view''.
We believe that enabling such idiosyncratic manual transformations in a classical relational database is critical for curation, as curation operations that are easy to declare for individual values can often be extremely challenging to generalize.
We explore the challenges of enabling singletons in relational databases, propose a hybrid spreadsheet/relational notebook environment for data curation, and present our vision of \sysname, a system that exposes data curation through such an interface.
\end{abstract}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "../main"
%%% End: