%!TEX root = ../main.tex
\begin{abstract}
The database community has developed a plethora of tools and techniques for data curation and exploration, ranging from declarative languages to specialized techniques for data repair.
Yet, there is currently no consensus on how to best expose these powerful tools to an analyst in a simple, intuitive, and above all, flexible way.
Thus, analysts continue to rely on tools such as spreadsheets, imperative languages, and notebook-style programming environments like Jupyter for data curation.
In this work, we explore the %intersection
integration of spreadsheets, notebooks, and relational databases.
We focus on a key advantage that both spreadsheets and imperative notebook environments have over classical relational databases: ease of exception.
By relying on set-at-a-time operations, relational databases sacrifice the ability to easily define singleton operations: exceptions to a normal data processing workflow that affect query processing for a fixed set of explicitly targeted records.
In comparison, a spreadsheet user can easily change the formula for just one cell, while a notebook user can add an imperative operation to her notebook that alters an output ``view''.
We believe that enabling such idiosyncratic manual transformations in a classical relational database is critical for curation, as curation operations that are easy to declare for individual values can often be extremely challenging to generalize.
We explore the challenges of enabling singletons in relational databases, propose a hybrid spreadsheet/relational notebook environment for data curation, and present our vision of \sysname, a system that exposes data curation through such an interface.
\end{abstract}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "../main"
%%% End: