Overlay Spreadsheets

Oliver Kennedy
okennedy@buffalo.edu
University at Buffalo
Buffalo
USA

Boris Glavic
bglavic@iit.edu
Illinois Institute of Technology
Illinois
USA

Michael Brachmann
mbrachmann@breadcrumb-analytics.com
Breadcrumb Analytics
Buffalo
USA

\renewcommand{\shortauthors}{Kennedy et al.}

\begin{abstract}
Spreadsheets provide a convenient, friendly direct manipulation interface to datasets.
Efforts to scale spreadsheets either follow a `virtual' strategy that layers a spreadsheet interface on top of an existing database engine or a `materialized' strategy based on re-engineering a spreadsheet engine.
Because databases are not optimized for spreadsheet access patterns, the materialized approach has better performance.
However, the virtual approach offers several advantages that cannot be easily replicated in the materialized approach, including the ability to re-apply user interactions to an updated input dataset.
We propose the overlay update model, a hybrid approach that overlays user updates on an existing dataset (as in the virtual approach) and indexes user updates (as in the materialized approach).
% We propose a hybrid approach, where patterns of user updates are indexed (as in the materialized approach) and overlaid on an existing dataset (as in the virtual approach).
% We introduce the overlay update model, and outline strategies for efficiently accessing an overlay spreadsheet.
A key feature of our approach is storing updates generated by bulk operations (e.g., copy/paste) as compact ``patterns'' that can be leveraged to reduce execution costs.