intro

2022-03-29 12:22:43 -05:00 · 2022-03-29 12:22:43 -05:00 · 5c93f62d7f
parent 1f9fdfecad
commit 5c93f62d7f
1 changed files with 8 additions and 3 deletions
--- a/sections/introduction.tex
+++ b/sections/introduction.tex
@@ -16,10 +16,15 @@ In this paper, we present a novel \emph{coarse-grained} dataflow provenance mode
 We outline the implementation of this provenance model into an existing workflow system named Vizier~\cite{brachmann:2020:cidr:your,brachmann:2019:sigmod:data}, and address several of the challenges that arise when parallelizing notebooks.

 \subsection{Potential for Improvement}
-To assess the potential for improvement, we conducted a preliminary survey on an archive of Jupyter notebooks scraped from Github by Pimentel et. al.~\cite{DBLP:journals/ese/PimentelMBF21}.  
+To assess the potential for improvement, we conducted a preliminary survey on an archive of Jupyter notebooks scraped from GitHub by Pimentel et al.~\cite{DBLP:journals/ese/PimentelMBF21}.
 Our survey included only notebooks that use a Python kernel and are known to execute successfully; a total of 800\OK{fill in the exact number} notebooks met these criteria.
 We used the Python \texttt{ast} module to construct an inter-cell dataflow graph (e.g., using the methodology of \OK{citations}).
-As a proxy measure for potential speedup, we considered the depth of this graph in relation to the total number of python cells in the notebook.  
-\Cref{fig:parallelismSurvey} relates these measures in a XXX. 
+As a proxy measure for potential speedup, we considered the depth of this graph in relation to the total number of Python cells in the notebook.
+\Cref{fig:parallelismSurvey} relates these measures in a XXX.
 Although XXX percent of the notebooks do require sequential execution, as many as XXX percent can XXX.

+
+%%% Local Variables:
+%%% mode: latex
+%%% TeX-master: "../main"
+%%% End:
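
Note on the survey methodology described in this hunk: the text does not spell out how the inter-cell dataflow graph is built, so the following is only a minimal sketch of one plausible approach, not the authors' implementation. It assumes each cell is available as a Python source string, treats a cell as depending on the most recent earlier cell that bound any name it reads, and ignores anti-dependencies, attribute mutation, and scoping. The function names `cell_reads_writes` and `dataflow_depth` are hypothetical.

```python
import ast


def cell_reads_writes(source):
    """Return (reads, writes): names loaded and names bound in one cell.

    Coarse approximation: scoping is ignored, so a name that is both
    bound and read inside the same cell still counts as a read.
    """
    reads, writes = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                reads.add(node.id)
            else:  # Store or Del context
                writes.add(node.id)
        elif isinstance(node, ast.AugAssign) and isinstance(node.target, ast.Name):
            # `x += 1` reads x as well as writing it
            reads.add(node.target.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            writes.add(node.name)
        elif isinstance(node, ast.alias):
            # `import a.b` binds `a`; `import a as b` binds `b`
            writes.add((node.asname or node.name).split(".")[0])
    return reads, writes


def dataflow_depth(cells):
    """Length (in cells) of the longest dependency chain across the cells."""
    last_writer = {}  # name -> index of the latest cell that bound it
    depth = []        # depth[i] = longest chain ending at cell i
    for i, src in enumerate(cells):
        reads, writes = cell_reads_writes(src)
        deps = {last_writer[n] for n in reads if n in last_writer}
        depth.append(1 + max((depth[j] for j in deps), default=0))
        for n in writes:
            last_writer[n] = i
    return max(depth, default=0)


cells = ["a = 1", "b = 2", "c = a + b", "print(c)"]
print(dataflow_depth(cells), "of", len(cells), "cells on the critical path")
```

Under these assumptions, a notebook whose depth equals its cell count must run sequentially, while a depth well below the cell count suggests cells that could run in parallel.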