oops
parent
e1e470c642
commit
7bebdc85b3
|
@ -246,28 +246,28 @@
|
||||||
<section>
|
<section>
|
||||||
<h2>CSV Import</h2>
|
<h2>CSV Import</h2>
|
||||||
<h4>Run a <code>SELECT</code> on a raw CSV File</h4>
|
<h4>Run a <code>SELECT</code> on a raw CSV File</h4>
|
||||||
<ul class="fragment">
|
<ul>
|
||||||
<li>File may not have column headers</li>
|
<li>File may not have column headers</li>
|
||||||
<li>CSV does not provide "types"</li>
|
<li>CSV does not provide "types"</li>
|
||||||
<li>Lines may be missing fields</li>
|
<li>Lines may be missing fields</li>
|
||||||
<li>Fields may be mistyped (typo, missing comma)</li>
|
<li>Fields may be mistyped (typo, missing comma)</li>
|
||||||
<li>Comment text can be inlined into the file</li>
|
<li>Comment text can be inlined into the file</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p class="fragment">
|
<p>
|
||||||
<b>State of the art</b>: External Table Defn <span class="fragment">+ "Manually" edit CSV</span>
|
<b>State of the art</b>: External Table Defn <span>+ "Manually" edit CSV</span>
|
||||||
</p>
|
</p>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section>
|
<section>
|
||||||
<h2>Merge Two Datasets</h2>
|
<h2>Merge Two Datasets</h2>
|
||||||
<h4><code>UNION</code> two data sources</h4>
|
<h4><code>UNION</code> two data sources</h4>
|
||||||
<ul class="fragment">
|
<ul>
|
||||||
<li>Schema matching</li>
|
<li>Schema matching</li>
|
||||||
<li>Deduplication</li>
|
<li>Deduplication</li>
|
||||||
<li>Format alignment (GIS coordinates, $ vs €)
|
<li>Format alignment (GIS coordinates, $ vs €)
|
||||||
<li>Precision alignment (State vs County)</li>
|
<li>Precision alignment (State vs County)</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p class="fragment">
|
<p>
|
||||||
<b>State of the art</b>: Manually map schema
|
<b>State of the art</b>: Manually map schema
|
||||||
</p>
|
</p>
|
||||||
</section>
|
</section>
|
||||||
|
@ -275,19 +275,17 @@
|
||||||
<section>
|
<section>
|
||||||
<h2>JSON Shredding</h2>
|
<h2>JSON Shredding</h2>
|
||||||
<h4>Run a <code>SELECT</code> on JSON or a Doc Store</h4>
|
<h4>Run a <code>SELECT</code> on JSON or a Doc Store</h4>
|
||||||
<ul class="fragment">
|
<ul>
|
||||||
<li>Separating fields and record sets:<br/>(e.g., <code>{ A: "Bob", B: "Alice" }</code>)</li>
|
<li>Separating fields and record sets:<br/>(e.g., <code>{ A: "Bob", B: "Alice" }</code>)</li>
|
||||||
<li>Missing fields (Records with no 'address')</li>
|
<li>Missing fields (Records with no 'address')</li>
|
||||||
<li>Type alignment (Records with 'address' as an array)</li>
|
<li>Type alignment (Records with 'address' as an array)</li>
|
||||||
<li>Schema matching$^2$</li>
|
<li>Schema matching$^2$</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p class="fragment">
|
<p>
|
||||||
<b>State of the art</b>: DataGuide, Wrangler, etc...
|
<b>State of the art</b>: DataGuide, Wrangler, etc...
|
||||||
</p>
|
</p>
|
||||||
</section>
|
</section>
|
||||||
</section>
|
|
||||||
|
|
||||||
<section>
|
|
||||||
<section>
|
<section>
|
||||||
<h2>Data Cleaning is Hard!</h2>
|
<h2>Data Cleaning is Hard!</h2>
|
||||||
</section>
|
</section>
|
||||||
|
@ -300,21 +298,14 @@
|
||||||
|
|
||||||
<p>Alice spends weeks cleaning her data before using it.</p>
|
<p>Alice spends weeks cleaning her data before using it.</p>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section>
|
|
||||||
<h3>Newer State of the Art</h3>
|
|
||||||
<img src="graphics/iu.jpeg" height=500 />
|
|
||||||
<attribution>(azure.microsoft.com)</attribution>
|
|
||||||
</section>
|
|
||||||
|
|
||||||
<section>
|
|
||||||
<img src="graphics/data-lake-to-data-swamp.jpg" height=500 />
|
|
||||||
<attribution>(timoelliott.com)</attribution>
|
|
||||||
</section>
|
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section>
|
<section>
|
||||||
|
|
||||||
|
<section>
|
||||||
|
<h2>The database is in the way</h2>
|
||||||
|
</section>
|
||||||
|
|
||||||
<section>
|
<section>
|
||||||
<h3>
|
<h3>
|
||||||
In the name of Codd,<br/><span class="fragment grow highlight-current-blue">thou shalt not give the user a wrong answer.</span>
|
In the name of Codd,<br/><span class="fragment grow highlight-current-blue">thou shalt not give the user a wrong answer.</span>
|
||||||
|
@ -327,6 +318,7 @@
|
||||||
What would it take for that to be ok?
|
What would it take for that to be ok?
|
||||||
</h4>
|
</h4>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section>
|
<section>
|
||||||
<h2>Industry says...</h2>
|
<h2>Industry says...</h2>
|
||||||
</section>
|
</section>
|
||||||
|
@ -396,7 +388,7 @@
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section>
|
<section>
|
||||||
<h2>What if a database did the same?</h2>
|
<h3>What if a database did the same?</h3>
|
||||||
<h4 class="fragment">(they can)</h4>
|
<h4 class="fragment">(they can)</h4>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
|
@ -404,12 +396,145 @@
|
||||||
|
|
||||||
<section>
|
<section>
|
||||||
<h3>On representing incomplete information in a relational data base</h3>
|
<h3>On representing incomplete information in a relational data base</h3>
|
||||||
<h4>T. Imielinski & W. Lipski Jr.<span style="margin-left: 40px">(<i>VLDB <span class="fragment grow highlight-current-red">1981</span></i>)</span></h4>
|
<h4>T. Imielinski & W. Lipski Jr.<span style="margin-left: 40px">(<i>VLDB <span class="fragment highlight-current-red" data-fragment-index="1">1981</span></i>)</span></h4>
|
||||||
<p class="fragment">
|
<p class="fragment" data-fragment-index="1" style="margin-top: 60px">
|
||||||
Incomplete and Probabilistic
|
Incomplete and Probabilistic Databases<br/>have existed since the 1980s
|
||||||
</p>
|
</p>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
|
<section>
|
||||||
|
<svg width="800" height="500">
|
||||||
|
<g transform="translate(150,0)">
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/db.svg"
|
||||||
|
width="93" height="103"
|
||||||
|
x="0" y="10"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/db.svg"
|
||||||
|
width="93" height="103"
|
||||||
|
x="0" y="130"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/db.svg"
|
||||||
|
width="93" height="103"
|
||||||
|
x="0" y="250"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/db.svg"
|
||||||
|
width="93" height="103"
|
||||||
|
x="0" y="370"
|
||||||
|
/>
|
||||||
|
</g>
|
||||||
|
<g
|
||||||
|
transform="translate(250, 0)"
|
||||||
|
class="fragment"
|
||||||
|
style="
|
||||||
|
fill: rgba(200, 50, 50, 0);
|
||||||
|
stroke-width: 4;
|
||||||
|
stroke: rgba(200, 200, 200, 1);
|
||||||
|
">
|
||||||
|
<polyline
|
||||||
|
points="0,60 220,60 200,50 220,60 200,70 220,60 0,60"
|
||||||
|
transform="translate(0,0)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="0,60 220,60 200,50 220,60 200,70 220,60 0,60"
|
||||||
|
transform="translate(0,120)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="0,60 220,60 200,50 220,60 200,70 220,60 0,60"
|
||||||
|
transform="translate(0,240)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="0,60 220,60 200,50 220,60 200,70 220,60 0,60"
|
||||||
|
transform="translate(0,360)"
|
||||||
|
/>
|
||||||
|
<text x="60" y="50">Q(D)</text>
|
||||||
|
<text x="60" y="170">Q(D)</text>
|
||||||
|
<text x="60" y="290">Q(D)</text>
|
||||||
|
<text x="60" y="410">Q(D)</text>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/jean-victor-balin-icon-table.svg"
|
||||||
|
width="96" height="96"
|
||||||
|
x="230" y="15"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/jean-victor-balin-icon-table.svg"
|
||||||
|
width="96" height="96"
|
||||||
|
x="230" y="135"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/jean-victor-balin-icon-table.svg"
|
||||||
|
width="96" height="96"
|
||||||
|
x="230" y="255"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/jean-victor-balin-icon-table.svg"
|
||||||
|
width="96" height="96"
|
||||||
|
x="230" y="375"
|
||||||
|
/>
|
||||||
|
</g>
|
||||||
|
<g
|
||||||
|
transform="translate(0, 0)"
|
||||||
|
class="fragment"
|
||||||
|
style="
|
||||||
|
fill: rgba(200, 50, 50, 0);
|
||||||
|
stroke-width: 4;
|
||||||
|
stroke: rgba(200, 200, 200, 1);
|
||||||
|
">
|
||||||
|
<polyline
|
||||||
|
points="20,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(0,200) rotate(-60)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="70,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(-15,200) rotate(-20)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="70,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(25,170) rotate(20)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="20,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(102,220) rotate(60)"
|
||||||
|
/>
|
||||||
|
<text x="40" y="250">?</text>
|
||||||
|
</g>
|
||||||
|
<g
|
||||||
|
transform="translate(540, 0)"
|
||||||
|
class="fragment"
|
||||||
|
style="
|
||||||
|
fill: rgba(200, 50, 50, 0);
|
||||||
|
stroke-width: 4;
|
||||||
|
stroke: rgba(200, 200, 200, 1);
|
||||||
|
">
|
||||||
|
<polyline
|
||||||
|
points="20,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(102,30) rotate(60)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="70,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(0,120) rotate(20)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="70,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(-40,240) rotate(-20)"
|
||||||
|
/>
|
||||||
|
<polyline
|
||||||
|
points="20,60 140,60 120,50 140,60 120,70 140,60"
|
||||||
|
transform="translate(0,390) rotate(-60)"
|
||||||
|
/>
|
||||||
|
<image
|
||||||
|
xlink:href="graphics/dagobert83-female-user-icon-800px.png"
|
||||||
|
width="100" height="100"
|
||||||
|
x="110" y="180"
|
||||||
|
/>
|
||||||
|
<text x="180" y="190">?</text>
|
||||||
|
</g>
|
||||||
|
|
||||||
|
</svg>
|
||||||
|
</section>
|
||||||
|
|
||||||
</div></div>
|
</div></div>
|
||||||
|
|
||||||
|
|
|
@ -12,7 +12,7 @@ UB-CSE is celebrating [50 years of Computer Science and Engineering](http://cse.
|
||||||
The ODIn Lab will be showing up in force at the [CSE50 Undergraduate and Graduate conferences](https://engineering.buffalo.edu/computer-science-engineering/news-events/cse50.program.html).
|
The ODIn Lab will be showing up in force at the [CSE50 Undergraduate and Graduate conferences](https://engineering.buffalo.edu/computer-science-engineering/news-events/cse50.program.html).
|
||||||
|
|
||||||
* Lisa and Olivia will demo Mimir at the Undergraduate Event during the Welcome Reception on Thursday.
|
* Lisa and Olivia will demo Mimir at the Undergraduate Event during the Welcome Reception on Thursday.
|
||||||
* Poonam, Will, Aaron, and Lisa will present on <a href="/papers/2017/CSE50/mimir.pdf">Mimir</a> at the Graduate Poster Session on Saturday.
|
* Poonam, Will, Aaron, Shivang, and Lisa will present on <a href="/papers/2017/CSE50/mimir.pdf">Mimir</a> at the Graduate Poster Session on Saturday.
|
||||||
* Saurav and Darshana will present on <a href="/papers/2017/CSE50/jitds.pdf">JITDs</a> at the Graduate Poster Session on Saturday.
|
* Saurav and Darshana will present on <a href="/papers/2017/CSE50/jitds.pdf">JITDs</a> at the Graduate Poster Session on Saturday.
|
||||||
* Duc, Ting, and Gokhan will present on <a href="/papers/2017/CSE50/insiderthreats.pdf">The Insider Threats project</a> at the Graduate Poster Session on Saturday.
|
* Duc, Ting, and Gokhan will present on <a href="/papers/2017/CSE50/insiderthreats.pdf">The Insider Threats project</a> at the Graduate Poster Session on Saturday.
|
||||||
* Gourab, Gokhan, and Carl will present on <a href="papers/2017/CSE50/pocketdata.pgf">The PocketData project</a> at the Graduate Poster Session on Saturday.
|
* Gourab, Gokhan, and Carl will present on <a href="papers/2017/CSE50/pocketdata.pgf">The PocketData project</a> at the Graduate Poster Session on Saturday.
|
||||||
|
|