<!doctype html>
|
|
<html lang="en">
|
|
|
|
<head>
|
|
<meta charset="utf-8">
|
|
|
|
<title>How to Start Collaborating with CSE to Solve Global Health Problems</title>
|
|
|
|
<meta name="description" content="How to Start Collaborating with CSE to Solve Global Health Problems">
|
|
<meta name="author" content="Oliver Kennedy">
|
|
|
|
<meta name="apple-mobile-web-app-capable" content="yes" />
|
|
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
|
|
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
|
|
|
|
<link rel="stylesheet" href="../../reveal.js-3.7.0/css/reveal.css">
|
|
<link rel="stylesheet" href="ubodin.css" id="theme">
|
|
|
|
<!-- Code syntax highlighting -->
|
|
<link rel="stylesheet" href="../../reveal.js-3.7.0/lib/css/zenburn.css">
|
|
|
|
|
|
<style type="text/css">
|
|
.reveal .slides section .fragment.growbig {
|
|
opacity: 1;
|
|
visibility: inherit; }
|
|
.reveal .slides section .fragment.growbig.visible {
|
|
-webkit-transform: scale(7);
|
|
transform: scale(7); }
|
|
</style>
|
|
|
|
<!-- Printing and PDF exports -->
|
|
<script>
|
|
var link = document.createElement( 'link' );
|
|
link.rel = 'stylesheet';
|
|
link.type = 'text/css';
|
|
link.href = window.location.search.match( /print-pdf/gi ) ? '../../reveal.js-3.7.0/css/print/pdf.css' : '../../reveal.js-3.7.0/css/print/paper.css';
|
|
document.getElementsByTagName( 'head' )[0].appendChild( link );
|
|
</script>
|
|
|
|
<!--[if lt IE 9]>
|
|
<script src="../reveal.js-3.5.0/lib/js/html5shiv.js"></script>
|
|
<![endif]-->
|
|
</head>
|
|
|
|
<body>
|
|
|
|
<div class="reveal">
|
|
|
|
<div class="header">
|
|
<!-- Any Talk-Specific Header Content Goes Here -->
|
|
<center>
|
|
<a href="http://www.buffalo.edu" target="_blank">
|
|
<img src="../graphics/logos/ub-1line-ro-white.png" height="20"/>
|
|
</a>
|
|
</center>
|
|
</div>
|
|
<div class="footer">
|
|
<!-- Any Talk-Specific Footer Content Goes Here -->
|
|
<div style="float: left; margin-top: 15px; ">
|
|
Exploring <u><b>O</b></u>nline <u><b>D</b></u>ata <u><b>In</b></u>teractions
|
|
</div>
|
|
<a href="https://odin.cse.buffalo.edu" target="_blank">
|
|
<img src="../graphics/logos/odin-1line-white.png" height="40" style="float: right;"/>
|
|
</a>
|
|
</div>
|
|
|
|
<div class="slides">
|
|
<!-- Any section element inside of this container is displayed as a slide -->
|
|
|
|
<section>
|
|
<h3>How to Start Collaborating with CSE to Solve Global Health Problems</h3>
|
|
<h4>Oliver Kennedy</h4>
|
|
</section>
|
|
|
|
<section>
|
|
<ol>
|
|
<li>CSE: Research vs Implementation?</li>
|
|
<li>Research Highlights.</li>
|
|
<li>Behind the Buzzwords.</li>
|
|
<li>CSE Resources for Your Benefit.</li>
|
|
</ol>
|
|
</section>
|
|
|
|
<section>
|
|
<!--
|
|
<section>
|
|
<p style="font-family: serif">Computer science is not about machines, in the same way that astronomy is not about telescopes.</p>
|
|
<p>Michael R. Fellows (often attributed to Edsger Dijkstra)</p>
|
|
</section>
|
|
-->
|
|
<section>
|
|
<h3>Computer Science & Engineering</h3>
|
|
<ul>
|
|
<li>How hard is a particular type of problem to solve?</li>
|
|
<li>Can a particular solution be scaled to bigger problems?</li>
|
|
<li>How do I solve problems that fit a particular pattern?</li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Abstraction</h3>
|
|
<p>CSE research is about creating <u><b>general</b></u> solutions</p>
|
|
<p class="fragment">(motivated by specific problems)</p>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>CSE Publication</h3>
|
|
<p>How does the new solution differ? Is it...</p>
|
|
<ul>
|
|
<li>... more general?</li>
|
|
<li>... more efficient/reliable?</li>
|
|
<li>... more scalable?</li>
|
|
<li>... easier to use?</li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<p>Do you need <u>research</u> or <u>implementation</u>?</p>
|
|
<p class="fragment" style="margin-top: 50px;">UB-CSE can help with both, <br/>but each takes a different approach.</p>
|
|
<p class="fragment">Have clear, well-defined parameters for the problem.</p>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Typical Implementation Problems</h3>
|
|
<ul>
|
|
<li>Organizing data into a database.</li>
|
|
<li>A mobile app to display information.</li>
|
|
<li>Making an R script run faster.</li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Implementation Resources</h3>
|
|
<p>CSE-611</p>
|
|
<p>Undergraduate Research</p>
|
|
<p>Invenst</p>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>But...</h3>
|
|
|
|
<p class="fragment">Implementation often inspires research topics.<br/>(Ailamaki's 7-month rule)</p>
|
|
<p class="fragment">Implementation may synergize with existing research.<br/>(e.g., Motivating use cases)</p>
|
|
|
|
<p class="fragment">(so talking to a friendly CSE faculty member can still be useful)</p>
|
|
</section>
|
|
|
|
<!--
|
|
<section>
|
|
<h3>Automating Repetitive Tasks</h3>
|
|
|
|
<p>It's more likely to be research when...</p>
|
|
|
|
<ul>
|
|
<li>... the task requires at least some human intuition.</li>
|
|
<li>... the task is "hard" for computers (images, video, audio, prose).</li>
|
|
<li>... the task requires a lot (10+ TB) of data.</li>
|
|
<li>... the task is hard to describe precisely (outlier detection).</li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Solving Bigger Problems Faster</h3>
|
|
|
|
<p>It's more likely to be research when...</p>
|
|
|
|
<ul>
|
|
<li>... the task requires a lot (10+ TB) of data.</li>
|
|
<li>... the task has extremely tight time constraints.</li>
|
|
<li>... your code works well for smaller problems (fewer variables, less data)</li>
|
|
<li>... the task is naturally computationally complex.</li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Solved Problems on Different Tech.</h3>
|
|
|
|
<p>It's more likely to be research when...</p>
|
|
|
|
<ul>
|
|
<li>... the new technology is resource constrained (battery, cpu, memory).</li>
|
|
<li>... the new technology uses a different interface.</li>
|
|
<li>... the new technology violates assumptions made by existing solutions.</li>
|
|
<li>... the new technology makes something an existing solution does easier.</li>
|
|
</ul>
|
|
</section>
|
|
-->
|
|
</section>
|
|
|
|
<section>
|
|
<section>
|
|
<h3>Research Highlights</h3>
|
|
|
|
<ul>
|
|
<li>Reproducible Datasets (Oliver Kennedy)</li>
|
|
<li>Wireless Sensor Networks (Chang Wen Chen)</li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<h2>Reproducible Datasets</h2>
|
|
<h3>Oliver Kennedy</h3>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Data Errors Suck</h3>
|
|
<img src="images/data_error.png">
|
|
<attribution><a href="https://xkcd.com/2239/">https://xkcd.com/2239/</a></attribution>
|
|
</section>
|
|
|
|
<section>
|
|
|
|
<p>
|
|
<span class="fragment">
|
|
<img src="images/female-computer-user.svg" height="70px" style="vertical-align: middle;"/>
|
|
<span style="vertical-align: middle; padding-left: 70px; padding-right: 70px">→</span>
|
|
</span>
|
|
<img src="images/db.svg" height="70px" style="vertical-align: middle;"/>
|
|
<span style="vertical-align: middle; padding-left: 70px; padding-right: 70px">→</span>
|
|
<img src="images/male-computer-user.png" height="70px" style="vertical-align: middle;"/>
|
|
</p>
|
|
|
|
<p class="fragment">
|
|
<span style="margin-right: 250px; vertical-align: middle;">↓</span>
|
|
<span style="margin-left: 250px; vertical-align: middle;">↓</span>
|
|
<br/>
|
|
<span style="margin-right: 100px; vertical-align: middle;">Assumption</span>
|
|
<span style="font-size: 300%; vertical-align: middle;" class="fragment">≠</span>
|
|
<span style="margin-left: 100px; vertical-align: middle;">Assumption</span>
|
|
</p>
|
|
|
|
<attribution>freesvg.org</attribution>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>Assumptions?</h3>
|
|
<ol style="font-size: 70%">
|
|
<li>"This outlier is actually a data error"</li>
|
|
<li>"There will always be six values in this column"</li>
|
|
<li>"The correct fix is to delete erroneous records"</li>
|
|
<li>"Unparseable values should be treated as NULL"</li>
|
|
<li>"Nobody will analyze this portion of the dataset"</li>
|
|
<li>"These subjective field observations are correct"</li>
|
|
</ol>
|
|
<p class="fragment">Alice needs to document each and every assumption.</p>
|
|
<p class="fragment">Bob needs to understand the implications<br/>on every part of his analysis.</p>
|
|
</section>
|
|
|
|
<section>
|
|
<img src="images/montoya.jpeg" height="400px" />
|
|
|
|
<attribution>© 20th Century Fox</attribution>
|
|
</section>
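<section data-markdown>
<script type="text/template">
### Writing an Assumption Down

A minimal sketch of how Alice might record the "unparseable values are NULL" assumption while she cleans a column, so Bob can see exactly which rows it touched. `CleaningLog` and `parse_counts` are hypothetical names made up for this slide; they are not part of Vizier or any library.

```python
from dataclasses import dataclass, field


@dataclass
class CleaningLog:
    """Collects every assumption applied while cleaning a column."""
    notes: list = field(default_factory=list)

    def assume(self, row, reason):
        self.notes.append((row, reason))


def parse_counts(raw_values, log):
    """Parse strings to ints; unparseable values become None (NULL)."""
    cleaned = []
    for row, raw in enumerate(raw_values):
        try:
            cleaned.append(int(raw))
        except ValueError:
            # Assumption: an unparseable value is treated as NULL.
            log.assume(row, f"treated {raw!r} as NULL")
            cleaned.append(None)
    return cleaned


log = CleaningLog()
values = parse_counts(["12", "n/a", "7", ""], log)
print(values)     # [12, None, 7, None]
print(log.notes)  # which rows were affected, and why
```
</script>
</section>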
|
|
|
|
<section>
|
|
<h3>The Vizier Notebook</h3>
|
|
<img src="images/1.1.StagedExecution.png" height="400px">
|
|
</section>
|
|
|
|
<section>
|
|
<p>If you're using Python, R, SQL, or Jupyter, we can help...</p>
|
|
<ul>
|
|
<li>...improve your dataset documentation</li>
|
|
<li>...make your workflows more reproducible</li>
|
|
<li>...make your code faster</li>
|
|
</ul>
|
|
<p class="fragment">talk to me afterwards!</p>
|
|
</section>
|
|
</section>
|
|
|
|
|
|
<section>
|
|
<section>
|
|
<h2>Wireless Sensor Networks</h2>
|
|
<h3>Chang Wen Chen</h3>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>IoT Devices Need To Communicate</h3>
|
|
|
|
<dl>
|
|
<dt>Cellular or LoRa Networks</dt>
|
|
<dd>Each device talks to a tower <br>(reliable, but requires infrastructure)</dd>
|
|
|
|
<dt>Store and Collect</dt>
|
|
<dd>Each device stores its data until someone retrieves it <br>(no infrastructure, but requires physical visits)</dd>
|
|
|
|
<dt>Mesh Networks</dt>
|
|
<dd>Each device communicates via nearby devices <br>(bandwidth/power limited)</dd>
|
|
</dl>
|
|
</section>
|
|
|
|
<section>
|
|
<p>UB-CSE is at the forefront of wireless research</p>
|
|
</section>
|
|
|
|
<section>
|
|
<p><b>Application: </b>Monitoring excessive antibiotic discharge into the Missouri River from agricultural runoff caused by antibiotic overuse</p>
|
|
|
|
<p><b>Problem: </b>A 1D "mesh" is even more limited</p>
|
|
<img src="images/changwen_1d_mesh.png">
|
|
</section>
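<section data-markdown>
<script type="text/template">
### Why a 1D Chain Hurts

One way to see the limit, as a rough sketch rather than the model used in the actual project: assume sensors sit in a line along the river, each produces one reading per round, and every reading hops sensor to sensor toward a single base station at one end. The function name and these assumptions are illustrative only.

```python
def chain_relay_load(num_sensors):
    """Packets each sensor transmits per round in a 1D chain.

    Index 0 is the sensor next to the base station. Every reading
    travels hop by hop toward it, so closer sensors forward more.
    """
    return [num_sensors - i for i in range(num_sensors)]


load = chain_relay_load(5)
print(load)       # [5, 4, 3, 2, 1]
print(sum(load))  # 15 transmissions to deliver 5 readings
```

Under these assumptions the sensor nearest the base station carries everyone's traffic, so it drains its battery and bandwidth first.
</script>
</section>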
|
|
</section>
|
|
|
|
|
|
<section>
|
|
<section>
|
|
<h2>Buzzwords</h2>
|
|
</section>
|
|
|
|
<!--
|
|
<section>
|
|
<p>
|
|
Accountants use ledgers for keeping track of accounts, inventory status, etc...
|
|
Ledgers have basic physical limitations (hard to erase pen marks, hard to insert new entries, only one person can write at a time).
|
|
These physical limitations go away when the ledger is on the computer.
|
|
Need to trust that everyone participating keeps "track changes" on.
|
|
</p>
|
|
</section>
|
|
<section>
|
|
<p>
|
|
Blockchain is a collection of techniques that enforce similar limitations in a digital setting <i>without requiring trust</i>.
|
|
If you trust whoever's running the computer to play by the rules (e.g., through personal trust, legal force, or crescent wrenches), you probably don't need a blockchain.
|
|
</p>
|
|
</section>
|
|
<section>
|
|
<p>As an aside, these techniques are extremely compute-intensive (and intentionally so). Estimates put the power use of Bitcoin alone at 30-60 TWh per year (back of the napkin: roughly the annual electricity use of 3-6 million US homes)</p>
|
|
</section>
|
|
-->
|
|
|
|
<section>
|
|
<h3>Neural Networks / Deep Learning</h3>
|
|
|
|
<div style="font-size: 80% ">
|
|
<p class="fragment">Linear Regression</p>
|
|
<p class="fragment">↓<br>Spline Fitting</p>
|
|
<p class="fragment">↓<br>Graphical Models</p>
|
|
<p class="fragment">↓<br>Neural Networks</p>
|
|
</div>
|
|
<p class="fragment">$y=f(x)$ where $f$ has hundreds or thousands (or more) of degrees of freedom</p>
|
|
</section>
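<section data-markdown>
<script type="text/template">
### Counting Degrees of Freedom

A quick sketch of where the degrees of freedom come from, assuming a plain fully connected network with one bias per output. The function name and the layer widths are arbitrary choices for illustration.

```python
def dense_parameter_count(layer_sizes):
    """Number of weights and biases in a fully connected network.

    layer_sizes lists the width of every layer, inputs first.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


print(dense_parameter_count([1, 1]))           # 2: slope and intercept
print(dense_parameter_count([10, 64, 64, 1]))  # 4929
```

With one input and one output this is ordinary linear regression; two modest hidden layers already give thousands of parameters to fit.
</script>
</section>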
|
|
|
|
<section>
|
|
<dl>
|
|
<dt>The Good</dt>
|
|
<dd>• Feasible to fit very complex functions (e.g., "is this a face?")</dd>
|
|
<dd>• Minimal knowledge of problem structure required.</dd>
|
|
<dt>The Bad</dt>
|
|
<dd>• Need <b>huge</b> training data</dd>
|
|
<dd>• Very easy to overfit</dd>
|
|
<dd>• Not explainable (yet)</dd>
|
|
</dl>
|
|
</section>
|
|
|
|
<!--
|
|
<section>
|
|
<dl style="font-size: 70%">
|
|
<dt>Layer</dt>
|
|
<dd>A function that regresses N variables to predict M variables (typically N=M). </dd>
|
|
|
|
<dt>Neural Network</dt>
|
|
<dd>
|
|
A stack of 2 or more layers, with each layer predicting its outputs from the (latent/hidden) variables produced by the previous layer
|
|
</dd>
|
|
|
|
<dt>Recurrent Neural Network (RNN)</dt>
|
|
<dd>A Neural Network for timeseries data (NN:RNN :: Markovian Variable:Markov Chain)</dd>
|
|
|
|
<dt>Convolutional Neural Network (CNN)</dt>
|
|
<dd>A Neural Network for image data designed to take advantage of the fact that a feature (like a face) is usually invariant to size, position, and/or rotation (you can make it bigger, move it around, or rotate it and it's still a face).</dd>
|
|
|
|
<dt>Generative Adversarial Network (GAN)</dt>
|
|
<dd>Black wizardry that can generate an image/audio in one "style" while retaining the key features of another. (e.g., Transform a video of a horse into one of a zebra)</dd>
|
|
</dl>
|
|
</section>
|
|
<section>
|
|
<h3>Cloud Computing</h3>
|
|
|
|
<p>
|
|
Rent a server from Amazon, Microsoft, or Google (or UB).
|
|
Basically, pay someone to do the managerial work of running/security/etc...
|
|
Also benefit by not having infrastructure (e.g., access to 100s of servers for precisely as long as it takes you to get your answer)
|
|
</p>
|
|
</section>
|
|
-->
|
|
|
|
<section>
|
|
<h3>Differential Privacy</h3>
|
|
|
|
<p>
|
|
A way to mathematically bound how much personally identifiable information (PII) could leak when an aggregate dataset is released.
|
|
</p>
|
|
|
|
<p class="fragment">
|
|
"Can I create a statistically significant effect on the dataset by removing N individuals?"
|
|
</p>
|
|
</section>
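<section data-markdown>
<script type="text/template">
### A Differentially Private Count

A minimal sketch of one standard technique, the Laplace mechanism, applied to a counting query. The data, the function name, and the choice of epsilon are made up for illustration, and this is not a vetted implementation for releasing real data.

```python
import random


def private_count(records, predicate, epsilon, rng):
    """Counting query released with the Laplace mechanism.

    Adding or removing one person changes a count by at most 1, so
    Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    # Difference of two exponentials with rate epsilon
    # is Laplace-distributed with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
ages = [34, 51, 29, 62, 45, 70]
print(private_count(ages, lambda a: a >= 50, epsilon=0.5, rng=rng))
```

Smaller epsilon means more noise: stronger privacy, less accurate answers.
</script>
</section>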
|
|
|
|
<section>
|
|
<h3>Blockchain</h3>
|
|
<img src="images/blockchain.png">
|
|
<attribution><a href="https://xkcd.com/2267/">https://xkcd.com/2267/</a></attribution>
|
|
</section>
|
|
<section>
|
|
<h3>Oliver's Blockchain PSA</h3>
|
|
|
|
<ul>
|
|
<li>It is statistically unlikely that you need a blockchain.<br/>
|
|
<span class="fragment" style="font-size: 70%">(I've yet to see a use case apart from cryptocurrency that needs a blockchain)</span></li>
|
|
<li>Proof of work is a huge power sink.<br>
|
|
<span class="fragment" style="font-size: 70%">(Bitcoin alone is <a href="https://digiconomist.net/bitcoin-energy-consumption">estimated</a> at 77 TWh this year ≈ the electricity consumption of Chile)</span></li>
|
|
</ul>
|
|
|
|
<p class="fragment">Please consult a Doctor (of Philosophy in CS)<br> before starting a blockchain project</p>
|
|
</section>
|
|
|
|
<!--
|
|
<section>
|
|
<h3>Internet of Things</h3>
|
|
|
|
<p>
|
|
Cellphones are cheap. Cellular plans are cheap. Put cellphone radios (modems) in everything.
|
|
Cellphone + Sensors = tons of data about anything and everything that you want to measure.
|
|
</p>
|
|
</section>
|
|
-->
|
|
</section>
|
|
|
|
<section>
|
|
<section>
|
|
<h3>Resources</h3>
|
|
|
|
<ul style="font-size: 80%">
|
|
<li>CSE 611: <a href="https://invenst.cse.buffalo.edu/viewideabank.php">https://invenst.cse.buffalo.edu/viewideabank.php</a></li>
|
|
<li>Undergraduates: Any CSE Faculty</li>
|
|
<li>Invenst: Alan Hunt</li>
|
|
<li>Centers <ul>
|
|
<li>CUBS/CEDAR (image/video data)</li>
|
|
<li>CARA (structured data)</li>
|
|
<li>CMIF (multisource fusion)</li>
|
|
</ul></li>
|
|
</ul>
|
|
</section>
|
|
|
|
<section>
|
|
<h3>NSF CS+X Programs</h3>
|
|
<ul>
|
|
<li><b>CSSI</b>: Cyberinfrastructure for Sustained Scientific Innovation</li>
|
|
<li><b>SCH</b>: Smart and Connected Health</li>
|
|
<li><b>SCC</b>: Smart and Connected Communities</li>
|
|
</ul>
|
|
</section>
|
|
</section>
|
|
|
|
</div></div>
|
|
|
|
<script src="../reveal.js-3.5.0/lib/js/head.min.js"></script>
|
|
<script src="../reveal.js-3.5.0/js/reveal.js"></script>
|
|
|
|
<script>
|
|
|
|
// Full list of configuration options available at:
|
|
// https://github.com/hakimel/reveal.js#configuration
|
|
Reveal.initialize({
|
|
controls: false,
|
|
progress: true,
|
|
history: true,
|
|
center: true,
|
|
slideNumber: true,
|
|
|
|
transition: 'fade', // none/fade/slide/convex/concave/zoom
|
|
|
|
// Optional reveal.js plugins
|
|
dependencies: [
|
|
{ src: '../../reveal.js-3.7.0/plugin/svginline/data-src-svg.js' },
|
|
{ src: '../reveal.js-3.5.0/lib/js/classList.js', condition: function() { return !document.body.classList; } },
|
|
{ src: '../reveal.js-3.5.0/plugin/math/math.js',
|
|
condition: function() { return true; },
|
|
mathjax: '../reveal.js-3.5.0/js/MathJax.js'
|
|
},
|
|
{ src: '../reveal.js-3.5.0/plugin/markdown/marked.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
|
|
{ src: '../reveal.js-3.5.0/plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
|
|
//{ src: '../reveal.js-3.5.0/plugin/highlight/highlight.js', async: true, condition: function() { return !!document.querySelector( 'tt code' ); }, callback: function() { hljs.initHighlightingOnLoad(); } },
|
|
{ src: '../../reveal.js-3.7.0/plugin/highlight/highlight-9.16.2.js', async: true,
|
|
callback: function() { hljs.initHighlightingOnLoad(); } },
|
|
{ src: '../reveal.js-3.5.0/plugin/zoom-js/zoom.js', async: true },
|
|
{ src: '../reveal.js-3.5.0/plugin/notes/notes.js', async: true }
|
|
]
|
|
});
|
|
|
|
</script>
|
|
|
|
<script>document.write('<script src="http://' + (location.host || 'localhost').split(':')[0] + ':35729/livereload.js?snipver=1"></' + 'script>')</script>
|
|
|
|
</body>
|
|
</html>
|