2018-12-06 22:45:45 -05:00
<h2>Quality-Aware Machine Learning</h2>
<p>Oliver Kennedy</p>
<p>Jaroslaw Zola</p>
<p>Matthew Knepley</p>
<svg data-src="ml-pipeline.svg" />
<h3>Fixing Data is Expensive</h3>
<p class="fragment">(or impossible)</p>
<h3>Re-using Already-Fixed Data is Dangerous</h3>
<p class="fragment">(the "right" fix depends on use case)</p>
<h3>Idea: Track Errors</h3>
<p>Incomplete databases store <i>possibilities</i>, not just certainties.</p>
<svg data-src="tag-pipeline.svg" />
<li class="fragment" style="margin-top: 20px;">Statistically rigorous techniques for training classifiers and neural networks on incomplete databases.</li>
<li class="fragment" style="margin-top: 20px;">Models incorporating incompleteness information.
<div style="font-size: 70%; color: grey; margin-left: 20px;">"I didn't have enough training data" should be an allowed prediction.</div></li>
<li class="fragment" style="margin-top: 20px;">Incompleteness as an assist for model debugging.
<div style="font-size: 70%; color: grey; margin-left: 20px;">Which errors have the biggest impact on a prediction? <br/>
Which errors best explain an incorrect prediction?</div>
