%!TEX root=../main.tex
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure*}[t]
\newcommand{\plotminusvspace}{-5mm}
\begin{subfigure}[b]{.3\textwidth}
\label{fig:gantt}
\trimfigurespacing
\end{figure*}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
As a proof of concept, we implemented the static analysis approach described in \Cref{sec:import} and a simple, provenance-aware parallel scheduler (\Cref{sec:scheduler}) within the Vizier notebook system~\cite{brachmann:2020:cidr:your}.
Parallelizing cell execution requires an ICE architecture, which comes at the cost of increased communication overhead relative to monolithic kernel notebooks.
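The core scheduling idea can be sketched as follows; this is a minimal stand-alone illustration, not Vizier's actual implementation, and the cell names and read/write sets are hypothetical. A cell becomes runnable once every artifact it reads has been produced, and all runnable cells execute concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cells with statically inferred read/write sets
# (in Vizier these would come from the static analysis of Section 4).
cells = {
    "make_data": {"reads": set(), "writes": {"df"}},
    "reader_1":  {"reads": {"df"}, "writes": {"r1"}},
    "reader_2":  {"reads": {"df"}, "writes": {"r2"}},
}

def run_cell(name):
    # Stand-in for actually executing the cell's code in a kernel.
    return name

def schedule(cells):
    done_artifacts, finished = set(), []
    pending = dict(cells)
    with ThreadPoolExecutor() as pool:
        while pending:
            # Every cell whose inputs are all available may run in parallel.
            ready = [n for n, c in pending.items()
                     if c["reads"] <= done_artifacts]
            for name in ready:
                del pending[name]
            futures = [(n, pool.submit(run_cell, n)) for n in ready]
            for name, fut in futures:
                fut.result()
                done_artifacts |= cells[name]["writes"]
                finished.append(name)
    return finished

order = schedule(cells)
```

With this dependency structure, the generator cell runs first and the two readers then execute concurrently.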
All experiments were run on Ubuntu 20.04 on a server with 2 x AMD Opteron 4238.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\tinysection{Overview}
As a preliminary experiment, we ran a synthetic workload consisting of one cell that randomly generates a 100k-row, two-integer-column Pandas dataframe and exports it, and 10 reader cells that each read the dataset and perform a compute-intensive task: computing pairwise distances for a 10k-row subset of the source dataset.
\Cref{fig:gantt} shows execution traces for the workload in Vizier with its default (serial) scheduler and Vizier with its new (parallel) scheduler.
The experiments show that parallel execution is $\sim 4$ times faster than serial execution. However, each individual reader takes longer to finish under parallel execution, possibly as a result of contention on the dataset.
Nonetheless, this preliminary result, together with the analysis shown in \Cref{fig:parallelismSurvey}, demonstrates the potential for parallel execution of notebooks.
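The workload itself can be sketched in stand-alone form as below; this is a simplified stand-in (in Vizier, the generated dataframe is exported and re-imported through the notebook's artifact store rather than passed in memory, and the function names here are illustrative only). To keep the example cheap, the pairwise-distance step operates on a small slice rather than the full 10k-row subset, whose dense distance matrix would be prohibitively large in memory.

```python
import numpy as np
import pandas as pd

def generator_cell(rows=100_000, seed=0):
    # One cell randomly generates a 100k-row, two-integer-column dataframe.
    rng = np.random.default_rng(seed)
    return pd.DataFrame({"a": rng.integers(0, 1000, rows),
                         "b": rng.integers(0, 1000, rows)})

def reader_cell(df, subset=10_000, slice_size=500):
    # Each reader cell reads the dataset and performs a compute-intensive
    # task: pairwise distances over a subset of the source data.
    pts = df.head(subset).to_numpy(dtype=float)
    s = pts[:slice_size]                     # small slice for this sketch
    diff = s[:, None, :] - s[None, :, :]     # (n, n, 2) coordinate deltas
    return np.sqrt((diff ** 2).sum(axis=-1)) # (n, n) distance matrix

df = generator_cell()
d = reader_cell(df)
```

In the benchmark, ten such reader cells consume the single exported dataframe, so the readers are mutually independent and only depend on the generator cell.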
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\tinysection{Scaling}
In this benchmark, we specifically measure the ICE cost of reading data relative to data size. We ran the experiment with both cold and hot caches. \Cref{fig:scalability} shows the results of this experiment. Note that Vizier scales linearly for larger dataset sizes.
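The shape of this measurement can be sketched in a stand-alone setting as follows. This is only an illustration of the methodology, not Vizier's instrumentation: a file round-trip through pandas' pickle I/O stands in for the inter-kernel dataset transfer, and the first read after export plays the role of the cold read while a repeated read is typically served from the OS page cache (the hot case).

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd

def export_cost(df, path):
    # Stand-in for exporting a dataset to the artifact store.
    t0 = time.perf_counter()
    df.to_pickle(path)
    return time.perf_counter() - t0

def import_cost(path):
    # Stand-in for importing a dataset; cold on the first call,
    # typically hot (page-cached) on subsequent calls.
    t0 = time.perf_counter()
    pd.read_pickle(path)
    return time.perf_counter() - t0

rows = 100_000
df = pd.DataFrame({"a": np.arange(rows), "b": np.arange(rows)})
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df.pkl")
    t_export = export_cost(df, path)
    t_cold = import_cost(path)   # first read: cold
    t_hot = import_cost(path)    # repeated read: hot
```

Varying `rows` over several orders of magnitude and plotting the three timings against data size yields the kind of scaling curve reported in \Cref{fig:scalability}.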