Cleaning up abstract.

main
Oliver Kennedy 2023-03-30 20:35:06 -04:00
parent cbfc0553fd
commit 55f6df55d6
Signed by: okennedy
GPG Key ID: 3E5F9B3ABD3FDB60
1 changed file with 6 additions and 15 deletions

@ -122,23 +122,14 @@
%% The abstract is a short summary of the work to be presented in the
%% article.
\begin{abstract}
Spreadsheets provide a convenient % , friendly
direct manipulation interface to datasets.
Efforts to scale spreadsheets % have taken two approaches: A
either follow a `virtual' strategy that imposes a spreadsheet interface over an existing database engine or a `materialized' strategy based on re-engineering the spreadsheet engine % using % around
%standard database optimizations. % like indexes.
Because database engines are not optimized for spreadsheet access patterns,
% typically optimized for bulk query processing over interactive latencies,
the materialized approach has better performance.
However, the virtual approach offers several advantages that cannot be easily replicated in the materialized approach, including % notably
the ability to re-apply user interactions to an updated dataset. % version of the same dataset.
We propose a hybrid % the materialized and virtual
approach, where patterns of user updates are indexed (as in the materialized approach) and overlaid on an existing dataset (as in the virtual approach).
Spreadsheets provide a convenient, friendly direct manipulation interface to datasets.
Efforts to scale spreadsheets either follow a `virtual' strategy that imposes a spreadsheet interface over an existing database engine or a `materialized' strategy based on re-engineering the spreadsheet engine.
Because database engines are not optimized for spreadsheet access patterns, the materialized approach has better performance.
However, the virtual approach offers several advantages that cannot be easily replicated in the materialized approach, including the ability to re-apply user interactions to an updated dataset.
We propose a hybrid approach, where patterns of user updates are indexed (as in the materialized approach) and overlaid on an existing dataset (as in the virtual approach).
We introduce the overlay update model, and outline strategies for efficiently accessing an overlay spreadsheet.
A key feature of our approach is storing updates generated by bulk operations (e.g., copy/paste) as ``patterns'' that can be leveraged to reduce execution costs.
We implement an overlay spreadsheet over Apache Spark and demonstrate that, compared to DataSpread, it can significantly reduce execution costs. % popular
% materialized spreadsheet.
% Our preliminary results show that overlay spreadsheets can significantly reduce execution costs.
We implement an overlay spreadsheet over Apache Spark and demonstrate that, compared to DataSpread, it can significantly reduce execution costs.
\end{abstract}
%%