break urls
parent 9c92b54fcd
commit b35782ac75
@@ -47,7 +47,6 @@
 
 \documentclass{sig-alternate}
 
-\usepackage{cleveref}
 \usepackage{listings}
 \usepackage{todonotes}
 \usepackage{xspace}
@@ -75,6 +74,12 @@
 \newcommand{\trimfigurespacing}{\vspace*{-5mm}}
 \newcommand{\hide}{}
 
+\usepackage{url}
+\def\UrlBreaks{\do\/\do-}
+\usepackage{breakurl}
+\usepackage[bookmarks=false,breaklinks]{hyperref}
+\usepackage{cleveref}
+
 \begin{document}
 
 % Copyright
@@ -106,7 +111,7 @@
 % --- End of Author Metadata ---
 
 % \title{What needs to be REPLaced in notebooks}
-\title{Your notebook is not crumby enough, REPLace it.}
+\title{Your notebook is not crumby enough, REPLace it}
 %
 % You need the command \numberofauthors to handle the 'placement
 % and alignment' of the authors beneath the title.

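The preamble change in this commit can be tried in isolation. The following is a minimal sketch, not the paper's actual preamble: it assumes the standard `article` class in place of `sig-alternate`, and it leaves out `breakurl` on the assumption that the document is compiled with pdfLaTeX, where the `breaklinks` option of `hyperref` is usually sufficient. The sample URL is taken from the bibliography entries touched by the same commit.

```latex
% Minimal sketch (assumption: article class stands in for sig-alternate).
\documentclass[twocolumn]{article}

\usepackage{url}
% Allow line breaks after "/" and "-" inside \url{...}.
\def\UrlBreaks{\do\/\do-}
% breaklinks lets hyperref split a link across lines.
\usepackage[bookmarks=false,breaklinks]{hyperref}
% cleveref is loaded after hyperref, matching the order used in the commit.
\usepackage{cleveref}

\begin{document}
A long link in a narrow column:
\url{https://multithreaded.stitchfix.com/blog/2017/07/26/nodebook/}
\end{document}
```

Note that redefining `\UrlBreaks` this way replaces the default list of break characters, so breaks are then offered only after a slash or a hyphen (plus whatever `\UrlBigBreaks` still allows).
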
@@ -1,26 +1,26 @@
 % Optional fields: author, title, howpublished, month, year, note
 @MISC{vanderplass:2017:reproducibility,
-howpublished = {https://twitter.com/jakevdp/status/935178916490223616},
+howpublished = {\url{https://twitter.com/jakevdp/status/935178916490223616}},
 title = {Idea: Jupyter notebooks could have a "reproducibility mode"},
 author = {Jake VanderPlas}
 }
 
 % Optional fields: author, title, howpublished, month, year, note
 @MISC{zelnicki:2017:nodebook,
-howpublished = {https://multithreaded.stitchfix.com/blog/2017/07/26/nodebook/},
+howpublished = {\url{https://multithreaded.stitchfix.com/blog/2017/07/26/nodebook/}},
 author = {Kevin Zielnicki},
 title = {Nodebook}
 }
 
 % Optional fields: author, title, howpublished, month, year, note
 @MISC{jobevers:2018:jupyterOrderOfExec,
-howpublished = {https://github.com/jupyter/notebook/issues/3229},
+howpublished = {\url{https://github.com/jupyter/notebook/issues/3229}},
 author = {Job Evers-Meltzer},
 title = {Enforce a top-down order of execution}
 }
 
 @MISC{nyt:wrangling,
-howpublished = {http://nyti.ms/1Aqif2X},
+howpublished = {\url{http://nyti.ms/1Aqif2X}},
 author = {S. Lohr},
 title = {For big-data scientists, ‘janitor work’ is key hurdle to insights.},
 year = {2014}

@@ -1,35 +1,22 @@
 
 % -*- root: ../paper.tex -*-
-Notebook and spreadsheet systems are currently the de-facto standard for data collection, preparation, and analysis.
-However, these systems have been criticized for their lack of
-reproducibility, versioning, and support for sharing.
-%
-These shortcomings are particularly detrimental for
-data curation where data scientists iteratively
-build workflows to clean up and integrate data as a prerequisite for
-analysis.
-% \hide{ JF: here, there is a disconnect, since in the prev parag we
-% talk about spreadsheets too. Also, we get into details that may not
-% be clear for readers without giving some background first-- I
-% suggest we remove the sentence below. %
-% A key reason for these shortcomings is an impedence mismatch between
-% the notebook user interface (as a sequence of steps) and the
-% underlying implementation of most notebooks (as a library of code
-% snippets).}
-%
-We present Vizier, an open-source tool that helps analysts to
-build and refine data pipelines. Vizier combines the flexibility
-of notebooks with the easy-to-use data manipulation
-interface of spreadsheets.
-%a publicly available, open-source workflow-based notebook system aimed at helping analysts to iteratively build and refine data pipelines.
-%We highlight two features of Vizier: A spreadsheet interface for
-%simultaneous exploration and direct manipulation of data, and caveats,
-%an advanced approach for tracking potential data errors.
-Combined with advanced provenance tracking for both data
-and computational steps this enables reproducibility, versioning, and
-streamlined data exploration.
-% caveats
-Unique to Vizier is that it exposes potential issues with data, no matter whether they already exist in the input or are introduced by the operations of a notebook. We refer to such potential errors as \emph{data caveats}. Caveats are propagated alongside data using principled techniques from uncertain data management. Vizier provides extensive user interface support for caveats, e.g., exposing them as summaries in a dedicated error view and highlighting cells with caveats in spreadsheets.
+Notebook and spreadsheet systems are currently the de-facto standard for data
+collection, preparation, and analysis. However, these systems have been
+criticized for their lack of reproducibility, versioning, and support for
+sharing. These shortcomings are particularly detrimental for data curation where
+data scientists iteratively build workflows to clean up and integrate data as a
+prerequisite for analysis. We present Vizier, an open-source tool that helps
+analysts to build and refine data pipelines. Vizier combines the flexibility of
+notebooks with the easy-to-use data manipulation interface of spreadsheets.
+Combined with advanced provenance tracking for both data and computational steps
+this enables reproducibility, versioning, and streamlined data exploration.
+Unique to Vizier is that it exposes potential issues with data, no matter
+whether they already exist in the input or are introduced by the operations of a
+notebook. We refer to such potential errors as \emph{data caveats}. Caveats are
+propagated alongside data using principled techniques from uncertain data
+management. Vizier provides extensive user interface support for caveats, e.g.,
+exposing them as summaries in a dedicated error view and highlighting cells with
+caveats in spreadsheets.
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "../paper"