More trims

main
Oliver Kennedy 2023-03-30 16:36:33 -04:00
parent 021852f294
commit d4735436f6
Signed by: okennedy
GPG Key ID: 3E5F9B3ABD3FDB60
1 changed files with 15 additions and 18 deletions

@ -204,18 +204,17 @@ $
\subsection{Overlay Updates}
\label{sec:overlay-updates}
An Overlay Update describes a set of changes to a target spreadsheet (or dataset).
Changes may include cell updates as already discussed, or the insertion, deletion, or reordering of rows or columns.
As we discuss in \Cref{sec:system-presentation}, column operations are purely cosmetic in our model, and so we focus on cell and row updates.
An Overlay Update describes a set of changes to a spreadsheet (or dataset).
As we discuss in \Cref{sec:system-presentation}, column operations are purely cosmetic in our model, and we focus on cell and row updates.
Concretely, a spreadsheet overlay $\overlay = \aol$ is a reference frame transformation $\rtrans$ and a set of pattern updates $\oup$, terms we now define.
% We now define these terms, and discuss their semantics.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\partitle{Reference Frame Transformations}
Recall that the spreadsheet's positional row references are translated into the native record format of the source dataset through a mapping function called a reference frame.
To insert, delete, or move rows in the spreadsheet, it is sufficient to simply modify the reference frame.
Formally, a reference frame transformation $\rtrans$ is an injective mapping $\mathbb{Z} \to \mathbb{Z} \cup \errorval$ from an initial set of row positions to a new set of row positions, or the value $\errorval$ to indicate a deleted row.
The new reference frame for the spreadsheet overlay after applying $\overlay$ is $\rframe' = \rtrans \circ \mathcal F$, where $\circ$ denotes function composition.
Recall that a reference frame maps the spreadsheet's positional row references to native record identifiers.
Thus, to insert, delete, or move rows in the spreadsheet, it is sufficient to modify the reference frame.
Formally, a reference frame transformation $\rtrans$ is an injective mapping $\mathbb{Z} \to \mathbb{Z} \cup \errorval$ from an initial set of row positions to a new set of row positions, or the value $\errorval$ for a deleted row.
The new reference frame after applying $\overlay$ is $\rframe' = \rtrans \circ \rframe$, where $\circ$ denotes function composition.
As an example, consider deleting the second row of the spreadsheet from \Cref{fig:example-spreadsheet-and-a}. The positions of rows $3$ and $4$ are decreased by one, while row $1$ retains its position:
$$\rtrans(x) = \begin{cases}
x & \textbf{if } x < 2\\
@ -224,15 +223,14 @@ $$\rtrans(x) = \begin{cases}
\end{cases}$$
Row insertions and movement are handled analogously.
Note that row insertions, deletions, and movement are each expressible in constant size, independent of the size of the data.
Note that row insertions, deletions, and movement are expressible in constant size, independent of the size of the data.
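As an illustrative sketch of the analogous cases (the names $\rtrans_{ins}$ and $\rtrans_{mov}$ are introduced here only for illustration), inserting a new row at position $2$ could shift every row at or after position $2$ down by one, leaving position $2$ without a preimage, while moving row $4$ to position $2$ could permute positions $2$ through $4$:
\[
\rtrans_{ins}(x) = \begin{cases}
x & \textbf{if } x < 2\\
x + 1 & \textbf{otherwise}
\end{cases}
\qquad
\rtrans_{mov}(x) = \begin{cases}
2 & \textbf{if } x = 4\\
x + 1 & \textbf{if } 2 \leq x < 4\\
x & \textbf{otherwise}
\end{cases}
\]
Both mappings are injective, and each is described by a constant number of cases.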
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\partitle{Pattern Updates}
Spreadsheets allow users to prototype a formula in one cell and then generalize the formula by copying and pasting it into a range of cells.
Spreadsheets allow a formula from one cell to be pasted across a range of cells.
%\footnote{``Relative" column and row references are updated to be relative to each cell the formula is pasted into.}
Such bulk interactions pose a challenge for state models that maintain an expression for each cell.
For example, a user might paste a formula into an entire column, creating one expression for each row of the dataset.
In lieu of this, an overlay update groups together the set of pasted cells into a single \emph{pattern}.
In a classical spreadsheet, bulk interactions like this modify each cell's expression individually.
Overlay spreadsheets avoid the high cost that individual modifications can entail by grouping together the set of pasted cells into a single \emph{pattern}.
A \emph{range} $\rangeOf{\columnRange}{\rowRange}$ is the Cartesian product $\columnRange \times \rowRange$ of a set of columns ($\columnRange \subseteq \columnDomain$) and an interval of row positions ($\rowRange = [l,h] \subset \mathbb{Z}$).
%
@ -240,7 +238,7 @@ A pattern update $\oup$ is a set of pairs $\{ (\rangeOf{C_i}{R_i}, \pattern_i) \
Ranges $\rangeOf{C_i}{R_i}$ must be pairwise disjoint.
A pattern update $(\rangeOf{C_i}{R_i}, \pattern_i)$ assigns an expression to every cell $(\column, \row)$ in $\rangeOf{C_i}{R_i}$ by replacing any relative references of the form $(\column, \delta)$ in $\pattern_i$ with $(\column, \row + \delta)$. We use $\pattern_i(\cell)$ to denote the instantiation of pattern $\pattern_i$ for cell $\cell$.
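As a small worked instance under assumed names (the pattern $\pattern_{ex}$ and its range are hypothetical and not taken from the running example), suppose $\pattern_{ex} = (A,-1) + (B,+0)$ is assigned to the range $\rangeOf{E}{2-4}$. Instantiating it for the cell $(E,3)$ replaces each offset $\delta$ with $3 + \delta$:
\[
\pattern_{ex}(E,3) = (A, 3-1) + (B, 3+0) = (A,2) + (B,3)
\]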
For instance, to store a running sum of the values in column \emph{C} as the cell values in column \emph{D} for the spreadsheet from \Cref{fig:example-spreadsheet-and-a}:\\[-2mm]
For instance, to store a running sum of the values in column \emph{C} into column \emph{D} (for the spreadsheet from \Cref{fig:example-spreadsheet-and-a}):\\[-2mm]
%
\[
\oup_{running} = (\rangeOf{D}{1}, (C,+0)), (\rangeOf{D}{2-4}, (C,+0) + (D,-1))
@ -302,19 +300,18 @@ An overlay update $\overlay$ applied to a spreadsheet $\spreadsheet$ defines th
\begin{example}
\label{ex:recursive-running-sum}
Consider applying our example update ($\overlay_{running} = (\rtrans_{id},\oup_{running})$ where $\rframe_{id}(x) = x$) to our running example spreadsheet.
The result is shown in \Cref{fig:example-overlay-update}. The column $D$ now computes the running sum of column $C$.
Consider applying our example update ($\overlay_{running} = (\rtrans_{id},\oup_{running})$ where $\rtrans_{id}(x) = x$) to our running example spreadsheet.
\Cref{fig:example-overlay-update} shows the result.
\end{example}
Several remarks are in order. First, note that overlays can be used to encode common spreadsheet update operations in constant space (per update), including bulk updates via copy/paste.
Second, \cite{tang-23-efcsfg} uses similar ideas to compress the dependencies in a spreadsheet using ranges and patterns, but focuses exclusively on the dependency graph and not on compacting the spreadsheet itself.
Several remarks are in order. First, overlays can be used to encode common spreadsheet update operations in constant space (per update), including bulk updates via copy/paste.
Second, \cite{tang-23-efcsfg} uses similar ideas to compress the dependencies in a spreadsheet using ranges and patterns, but focuses exclusively on the dependency graph rather than expressions.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Replacing Source Data}
\label{sec:updating-datasets}
A major advantage of modeling spreadsheets as overlays is that source data may be updated.
An overlay designed for source data $(\ds, \rframe)$ may be applied to a dataset $(\ds', \rframe')$ as long as, for each $\row\in \rowDomain_{\ds}$ that corresponds to some $\row' \in \rowDomain_{\ds'}$, we have $\rframe'(\rframe^{-1}(\row)) = \row'$.
This is possible if, for example, $\rowDomain_{\ds}= \rowDomain_{\ds'}$ is a semantic key for the dataset.