Initial commit

master
Oliver Kennedy 2016-04-08 20:17:12 -04:00
commit 2a9398d649
7 changed files with 2204 additions and 0 deletions

17
Makefile Normal file

@ -0,0 +1,17 @@
TARGET=main
TEX_FILES=$(TARGET).tex $(wildcard *.bib) $(wildcard sections/*)
all: $(TARGET).pdf
	@if [ `uname` = "Darwin" ] ; then open $(TARGET).pdf; fi
$(TARGET).pdf: $(TEX_FILES)
	latexmk -pdf $(TARGET).tex
open: $(TARGET).pdf todo
	open $<
clean:
	latexmk -CA -bibtex
.PHONY: todo plot clean open collabs graphs split

221
acmcopyright.sty Normal file

@ -0,0 +1,221 @@
%%
%% This is file `acmcopyright.sty',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% acmcopyright.dtx (with options: `style')
%%
%% IMPORTANT NOTICE:
%%
%% For the copyright see the source file.
%%
%% Any modified versions of this file must be renamed
%% with new filenames distinct from acmcopyright.sty.
%%
%% For distribution of the original source see the terms
%% for copying and modification in the file acmcopyright.dtx.
%%
%% This generated file may be distributed as long as the
%% original source files, as listed above, are part of the
%% same distribution. (The sources need not necessarily be
%% in the same archive or directory.)
%% \CharacterTable
%% {Upper-case \A\B\C\D\E\F\G\H\I\J\K\L\M\N\O\P\Q\R\S\T\U\V\W\X\Y\Z
%% Lower-case \a\b\c\d\e\f\g\h\i\j\k\l\m\n\o\p\q\r\s\t\u\v\w\x\y\z
%% Digits \0\1\2\3\4\5\6\7\8\9
%% Exclamation \! Double quote \" Hash (number) \#
%% Dollar \$ Percent \% Ampersand \&
%% Acute accent \' Left paren \( Right paren \)
%% Asterisk \* Plus \+ Comma \,
%% Minus \- Point \. Solidus \/
%% Colon \: Semicolon \; Less than \<
%% Equals \= Greater than \> Question mark \?
%% Commercial at \@ Left bracket \[ Backslash \\
%% Right bracket \] Circumflex \^ Underscore \_
%% Grave accent \` Left brace \{ Vertical bar \|
%% Right brace \} Tilde \~}
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{acmcopyright}
[2014/06/29 v1.2 Copyright statemens for ACM classes]
\newif\if@printcopyright
\@printcopyrighttrue
\newif\if@printpermission
\@printpermissiontrue
\newif\if@acmowned
\@acmownedtrue
\RequirePackage{xkeyval}
\define@choicekey*{ACM@}{acmcopyrightmode}[%
\acm@copyrightinput\acm@copyrightmode]{none,acmcopyright,acmlicensed,%
rightsretained,usgov,usgovmixed,cagov,cagovmixed,%
licensedusgovmixed,licensedcagovmixed,othergov,licensedothergov}{%
\@printpermissiontrue
\@printcopyrighttrue
\@acmownedtrue
\ifnum\acm@copyrightmode=0\relax % none
\@printpermissionfalse
\@printcopyrightfalse
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=2\relax % acmlicensed
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=3\relax % rightsretained
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=4\relax % usgov
\@printpermissiontrue
\@printcopyrightfalse
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=6\relax % cagov
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=8\relax % licensedusgovmixed
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=9\relax % licensedcagovmixed
\@acmownedfalse
\fi
\ifnum\acm@copyrightmode=10\relax % othergov
\@acmownedtrue
\fi
\ifnum\acm@copyrightmode=11\relax % licensedothergov
\@acmownedfalse
\@printcopyrightfalse
\fi}
\def\setcopyright#1{\setkeys{ACM@}{acmcopyrightmode=#1}}
\setcopyright{acmcopyright}
\def\@copyrightowner{%
\ifcase\acm@copyrightmode\relax % none
\or % acmcopyright
ACM.
\or % acmlicensed
Copyright held by the owner/author(s). Publication rights licensed to
ACM.
\or % rightsretained
Copyright held by the owner/author(s).
\or % usgov
\or % usgovmixed
ACM.
\or % cagov
Crown in Right of Canada.
\or %cagovmixed
ACM.
\or %licensedusgovmixed
Copyright held by the owner/author(s). Publication rights licensed to
ACM.
\or %licensedcagovmixed
Copyright held by the owner/author(s). Publication rights licensed to
ACM.
\or % othergov
ACM.
\or % licensedothergov
\fi}
\def\@copyrightpermission{%
\ifcase\acm@copyrightmode\relax % none
\or % acmcopyright
Permission to make digital or hard copies of all or part of this
work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on
the first page. Copyrights for components of this work owned by
others than ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, or republish, to post on servers or to
redistribute to lists, requires prior specific permission
and\hspace*{.5pt}/or a fee. Request permissions from
permissions@acm.org.
\or % acmlicensed
Permission to make digital or hard copies of all or part of this
work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on
the first page. Copyrights for components of this work owned by
others than the author(s) must be honored. Abstracting with credit
is permitted. To copy otherwise, or republish, to post on servers
or to redistribute to lists, requires prior specific permission
and\hspace*{.5pt}/or a fee. Request permissions from
permissions@acm.org.
\or % rightsretained
Permission to make digital or hard copies of part or all of this work
for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage
and that copies bear this notice and the full citation on the first
page. Copyrights for third-party components of this work must be
honored. For all other uses, contact the
owner\hspace*{.5pt}/author(s).
\or % usgov
This paper is authored by an employee(s) of the United States
Government and is in the public domain. Non-exclusive copying or
redistribution is allowed, provided that the article citation is
given and the authors and agency are clearly identified as its
source.
\or % usgovmixed
ACM acknowledges that this contribution was authored or co-authored
by an employee, or contractor of the national government. As such,
the Government retains a nonexclusive, royalty-free right to
publish or reproduce this article, or to allow others to do so, for
Government purposes only. Permission to make digital or hard copies
for personal or classroom use is granted. Copies must bear this
notice and the full citation on the first page. Copyrights for
components of this work owned by others than ACM must be
honored. To copy otherwise, distribute, republish, or post,
requires prior specific permission and\hspace*{.5pt}/or a
fee. Request permissions from permissions@acm.org.
\or % cagov
This article was authored by employees of the Government of Canada.
As such, the Canadian government retains all interest in the
copyright to this work and grants to ACM a nonexclusive,
royalty-free right to publish or reproduce this article, or to allow
others to do so, provided that clear attribution is given both to
the authors and the Canadian government agency employing them.
Permission to make digital or hard copies for personal or classroom
use is granted. Copies must bear this notice and the full citation
on the first page. Copyrights for components of this work owned by
others than the Canadain Government must be honored. To copy
otherwise, distribute, republish, or post, requires prior specific
permission and\hspace*{.5pt}/or a fee. Request permissions from
permissions@acm.org.
\or % cagovmixed
ACM acknowledges that this contribution was co-authored by an
affiliate of the national government of Canada. As such, the Crown
in Right of Canada retains an equal interest in the copyright.
Reprints must include clear attribution to ACM and the author's
government agency affiliation. Permission to make digital or hard
copies for personal or classroom use is granted. Copies must bear
this notice and the full citation on the first page. Copyrights for
components of this work owned by others than ACM must be honored.
To copy otherwise, distribute, republish, or post, requires prior
specific permission and\hspace*{.5pt}/or a fee. Request permissions
from permissions@acm.org.
\or % licensedusgovmixed
Publication rights licensed to ACM. ACM acknowledges that this
contribution was authored or co-authored by an employee, contractor
or affiliate of the United States government. As such, the
Government retains a nonexclusive, royalty-free right to publish or
reproduce this article, or to allow others to do so, for Government
purposes only.
\or % licensedcagovmixed
Publication rights licensed to ACM. ACM acknowledges that this
contribution was authored or co-authored by an employee, contractor
or affiliate of the national government of Canada. As such, the
Government retains a nonexclusive, royalty-free right to publish or
reproduce this article, or to allow others to do so, for Government
purposes only.
\or % othergov
ACM acknowledges that this contribution was authored or co-authored
by an employee, contractor or affiliate of a national government. As
such, the Government retains a nonexclusive, royalty-free right to
publish or reproduce this article, or to allow others to do so, for
Government purposes only.
\or % licensedothergov
Publication rights licensed to ACM. ACM acknowledges that this
contribution was authored or co-authored by an employee, contractor
or affiliate of a national government. As such, the Government
retains a nonexclusive, royalty-free right to publish or reproduce
this article, or to allow others to do so, for Government purposes
only.
\fi}
\endinput
%%
%% End of file `acmcopyright.sty'.

34
main.tex Normal file

@ -0,0 +1,34 @@
\documentclass{sig-alternate-05-2015}
\usepackage[utf8]{inputenc}
\usepackage{natbib}
\usepackage{graphicx}
\usepackage{url}
\title{The Exception that Improves the Rule}
\begin{document}
\maketitle
\section{Introduction}
\label{sec:introduction}
\input{sections/introduction}
\section{Interface}
\label{sec:interface}
What is a spreadsheet?
\section{Language}
\label{sec:language}
\section{Generalizing Singletons}
\label{sec:generalizing}
\section{Related Work}
\label{sec:related}
\input{sections/related}
\bibliographystyle{plain}
\bibliography{references}
\end{document}

8
references.bib Normal file

@ -0,0 +1,8 @@
@book{adams1995hitchhiker,
title={The Hitchhiker's Guide to the Galaxy},
author={Adams, D.},
isbn={9781417642595},
url={http://books.google.com/books?id=W-xMPgAACAAJ},
year={1995},
publisher={San Val}
}

20
sections/introduction.tex Normal file

@ -0,0 +1,20 @@
Spreadsheets are a ubiquitous data processing tool. Their simplicity, generality, and adaptability make them ideal for ``playing'' with data through predominantly visual programming metaphors. In particular, spreadsheets provide powerful, yet entirely visual, metaphors for programming both data transformations and visualizations. In this paper, we explore how similar visual metaphors can be adapted for use with relational databases and discuss how this exploration informs the design of our prototype data exploration tool, called Vizier. Vizier's user interface combines elements of spreadsheets, so-called notebook interfaces, and classical relational queries, enabling easy data manipulation, summarization, and visualization.
One especially powerful feature of the spreadsheet user interface is that it makes it easy to define both bulk, set-at-a-time operations and exceptional, singleton data operations. The former class, already a strength of relational databases, is crucial for analyzing a dataset of any significant size. However, the latter class, singleton data operations, is also important for facilitating data exploration: (1) hypothetical what-if scenarios require users to make arbitrary fine-grained adjustments to data, (2) outliers in the data may require special-case treatment, and (3) users may wish to develop transformations on small example data before generalizing. Although these tasks are easy to accomplish in spreadsheets, special cases like these are not handled gracefully by existing mechanisms for interacting with relational DBMSs.
To enable singleton transformations within the framework of a classical relational database, we propose a new mechanism for data exploration called interactive views. An interactive view begins life as a classical database view, presented to the user in tabular form. In contrast to a classical view, however, an interactive view can be edited much like a spreadsheet. Users can modify fields, add new rows and columns, use a spreadsheet-style equation editor to define derived values, and more. As the user applies edits, the user's activities are seamlessly transformed into a program of relational(-ish) data transformation operators that derive the new, edited view. This program provides two major benefits. First, it serves as a form of history, allowing the user to revisit and revise earlier edits, even out of order. Second, the program defines a workflow, albeit one highly specialized to a specific dataset. Even this is sufficient to provide classical benefits of workflow provenance such as auditability and explainability for derived data. Moreover, once an interactive view is developed for one dataset, it can more readily be adapted to new data or made to react to changes in its inputs. Recasting the user's actions programmatically allows us to leverage existing work on algebraic equivalences and program rewriting, first to obtain different interpretations of sequences of user actions, and then to extrapolate from them.
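As a purely illustrative sketch of what one step of such a program might look like (the operator notation and the \texttt{ROWID} attribute are assumptions made for this example, not a definition of Vizier's operator set), overwriting attribute $A$ of the row identified by $r$ with the value $v$ in a view $V$ could be written as a generalized projection:
\[
V' = \pi_{\ldots,\; A \leftarrow \mathbf{if}\ \mathrm{ROWID} = r\ \mathbf{then}\ v\ \mathbf{else}\ A,\; \ldots}(V)
\]
Replacing the condition $\mathrm{ROWID} = r$ with a broader predicate turns the same operator into a set-at-a-time update, which hints at how a singleton edit might later be generalized.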
In this paper, we outline the core technical challenges of implementing interactive views and sketch our proposed solutions. The first challenge is to resolve the mismatch between a spreadsheet's positional frame of reference and the more qualitative frame of reference used in the relational model. An edit to a particular cell might be associated with a unique ID for that cell, but in the context of a particular query we have to interpret cells within a particular reference system. Certain operations, such as copy-paste, inserting a new row, or changing the sort order, affect this system of references. We need to ensure that changes to the reference system do not affect references to individual cells.
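One way to picture this separation, offered only as a sketch under assumed notation: let each cell carry a stable identifier $c$, and let a reference system $\rho$ map identifiers to grid positions. An edit is recorded against $c$, never against a position. A row-reordering operation such as a sort then replaces $\rho$ with a new map $\rho'$ and leaves $c$ itself untouched:
\[
\rho(c) = (i, j) \quad\longrightarrow\quad \rho'(c) = (i', j)
\]
The recorded edit is displayed at $\rho(c)$ before the sort and at $\rho'(c)$ after it, so positions change while the reference to the cell does not.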
% * Interface / Language
% * Provenance: defining a consistent ROWID
% * this is like switching between different systems of reference, the CID
% (cell ID) gives us a unique ID column, ...
% * one way to address this (at least conceptually) may be to store positions
% of a cell within the current reference system and model changes to the
% system of reference as updates affecting the cell positions
% * Implicit Windowing
% * A Readability-Optimizing S2S Compiler
% * Generalizing Singletons

11
sections/related.tex Normal file

@ -0,0 +1,11 @@
\begin{itemize}
\item Query by Example
\item Query by Explanation \url{http://arxiv.org/abs/1602.03819}
\item Trifacta/Wrangler/Potter's Wheel
\item Similar, but forces upfront generalization.
\item Provenance for edits
\item View maintenance?
\item Workflow systems (VisTrails)
\item Reenactment
\end{itemize}

1893
sig-alternate-05-2015.cls Normal file

File diff suppressed because it is too large