---
template: templates/talk_slides_v1.erb
title: "Microkernel Notebooks"
---
<section>
<h2>μKernel Notebooks</h2>
<h4>Oliver Kennedy</h4>
<h5>University at Buffalo</h5>
</section>
<section>
<svg data-src="graphics/2022-06-20/NotebookOverview.svg" height="400px" style="margin-left: -100px"/>
</section>
<section>
<div style="display: inline-block; width: 45%;">
<img src="graphics/2022-06-20/Pimentel.png" height="400px">
<p style="font-size: 70%;"><a href="https://ieeexplore.ieee.org/document/8816763">Pimentel et al.</a>: "4.03% of notebooks on GitHub are reproducible"</p>
</div>
<div style="display: inline-block; width: 45%;" class="fragment">
<img src="graphics/2022-06-20/Grus.png">
<p style="font-size: 70%;"><a href="https://www.youtube.com/watch?v=7jiPeIFXb6U">Joel Grus</a>: "For beginners, with dozens of cells and more complex code [the ability to run code snippets out of order] is utterly confusing."</p>
</div>
</section>
<section>
<svg data-src="graphics/2022-06-20/Checkpointing.svg" width="800px"/>
</section>
<section>
<h3><a href="https://github.com/stitchfix/nodebook">Nodebook</a></h3>
<img src="graphics/2022-06-20/Nodebook.png" height="300px">
<attribution><a href="https://github.com/stitchfix/nodebook">https://github.com/stitchfix/nodebook</a></attribution>
</section>
<section>
<svg data-src="graphics/2022-06-20/MonokernelCheckpoints.svg" height="400px" />
<attribution><a href="https://openclipart.org">https://openclipart.org</a></attribution>
</section>
<section>
<img src="graphics/2022-06-20/NoCheckpointing.png" height="400px">
</section>
<section>
<p>A modest proposal...</p>
</section>
<section>
<img src="graphics/2022-06-20/MicrokernelCheckpoints.svg" height="400px">
<attribution><a href="https://openclipart.org">https://openclipart.org</a></attribution>
</section>
<section>
<p>So now...</p>
</section>
<section>
<img src="graphics/2022-06-20/MicrokernelPyV2Checkpoints.svg" height="400px">
</section>
<section>
<img src="graphics/2022-06-20/MicrokernelPyScalaCheckpoints.svg" height="400px">
</section>
<section>
<p>and...</p>
</section>
<section>
<svg data-src="graphics/2022-06-20/Parallelism.svg" width="800px"/>
</section>
<section>
<h3>... and more</h3>
<ul>
<li>Better error-handling</li>
<li>Easily inspect inter-cell state</li>
<li>Automatically re-run stale cells</li>
</ul>
</section>
<section>
<h3>... not all smiles and sunshine</h3>
<ul>
<li class="fragment highlight-grey" data-fragment-index="1">Dependency Analysis</li>
<li class="fragment highlight-grey" data-fragment-index="1">Scheduling Cell Execution</li>
<li class="fragment highlight-grey" data-fragment-index="1">Python Startup Costs</li>
<li>Migrating state between kernels</li>
</ul><br/>
</section>
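<section>
<p style="font-size: 70%;">A minimal sketch of what dependency analysis and re-running stale cells could look like, assuming cells are plain Python source strings executed top to bottom. The helper names <code>cell_reads_and_writes</code> and <code>stale_after_edit</code> are hypothetical, and the analysis is a coarse static approximation (it ignores in-place mutation, scoping inside functions, and dynamic features), not a description of any particular system.</p>

```python
import ast


def cell_reads_and_writes(source):
    """Return (reads, writes): names a cell loads and names it binds.

    Coarse approximation: every Name node anywhere in the cell is counted,
    so builtins such as `print` also show up as reads.
    """
    reads, writes = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                writes.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                reads.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                writes.add((alias.asname or alias.name).split(".")[0])
    return reads, writes


def stale_after_edit(cells, edited):
    """Indices of later cells that (transitively) read a name the edited cell writes."""
    dirty = set(cell_reads_and_writes(cells[edited])[1])
    stale = []
    for i in range(edited + 1, len(cells)):
        reads, writes = cell_reads_and_writes(cells[i])
        if reads & dirty:
            stale.append(i)
            dirty |= writes   # outputs of a stale cell are stale too
        else:
            dirty -= writes   # an up-to-date cell rebinds the name cleanly
    return stale


cells = ["x = 3", "y = x + 1", "z = 10", "print(y, z)"]
stale = stale_after_edit(cells, 0)  # cells 1 and 3 depend on x
```

<p style="font-size: 70%;">Intersecting each cell's reads with the writes of earlier cells filters out builtins, since no cell binds them.</p>
</section>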
<section>
<pre><code>
x = 3
</code></pre>
</section>
<section>
<pre><code>
from foo import x
</code></pre>
</section>
<section>
<pre><code>
import pandas
x = pandas.read_csv("foo.csv")
</code></pre>
</section>
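<section>
<p style="font-size: 70%;">The three snippets above bind <code>x</code> to different kinds of values, which is part of what makes migrating state between kernels hard. A minimal sketch, assuming each kernel's state is a plain dictionary and using <code>pickle</code> purely as a stand-in serialization format; <code>export_state</code> and <code>import_state</code> are hypothetical helpers, not an API from the talk.</p>

```python
import importlib
import math
import pickle
import types


def export_state(namespace, names):
    """Serialize selected variables from one kernel's namespace."""
    exported = {}
    for name in names:
        value = namespace[name]
        if isinstance(value, types.ModuleType):
            # Modules cannot be pickled; record the module name and
            # re-import it on the receiving side instead.
            exported[name] = ("module", value.__name__)
        else:
            exported[name] = ("pickle", pickle.dumps(value))
    return exported


def import_state(exported):
    """Rebuild a namespace in the receiving kernel."""
    namespace = {}
    for name, (kind, payload) in exported.items():
        if kind == "module":
            namespace[name] = importlib.import_module(payload)
        else:
            namespace[name] = pickle.loads(payload)
    return namespace


# Simulate two kernels with two dictionaries.
source_kernel = {"x": 3, "rows": [{"a": 1}, {"a": 2}], "math": math}
target_kernel = import_state(export_state(source_kernel, ["x", "rows", "math"]))
```

<p style="font-size: 70%;">Values copied this way are independent snapshots: mutating <code>rows</code> in one kernel is not visible in the other.</p>
</section>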
<section>
<img src="graphics/2022-06-20/arrow.png" height="300px">
</section>
<section>
<img src="graphics/2022-06-20/Vizier-System-Diag.svg" height="300px">
</section>
<section>
<img src="graphics/2022-06-20/Vizier-Polyglot.png" height="500px">
</section>
<section>
<img src="graphics/2022-06-20/Vizier-Load.png" height="500px">
</section>
<section>
<img src="graphics/2022-06-20/Vizier-Spreadsheet.png" height="500px">
</section>
<section>
<img src="graphics/2022-06-20/Vizier-New.png" height="400px">
</section>
<section>
<a href="https://vizierdb.info">
<img src="graphics/2022-06-20/vizier.svg" height="200px">
<p style="margin-top: -20px;">https://vizierdb.info</p>
</a>
<p style="font-size: 65%"><b>Mike Brachmann, Boris Glavic, Nachiket Deo</b>, Juliana Freire, Heiko Mueller, Sonia Castello, Munaf Arshad Qazi, William Spoth, Poonam Kumari, Soham Patel, and more...</p>
</section>