<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>CSE 4/562 - Spring 2018</title>
<meta name="description" content="CSE 4/562 - Spring 2018">
<meta name="author" content="Oliver Kennedy">
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
<link rel="stylesheet" href="../reveal.js-3.6.0/css/reveal.css">
<link rel="stylesheet" href="ubodin.css" id="theme">
<!-- Code syntax highlighting -->
<link rel="stylesheet" href="../reveal.js-3.6.0/lib/css/zenburn.css">
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? '../reveal.js-3.6.0/css/print/pdf.css' : '../reveal.js-3.6.0/css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<script src="../reveal.js-3.6.0/lib/js/head.min.js"></script>
<!--[if lt IE 9]>
<script src="../reveal.js-3.6.0/lib/js/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<div class="reveal">
<!-- Any section element inside of this container is displayed as a slide -->
<div class="header">
<!-- Any Talk-Specific Header Content Goes Here -->
CSE 4/562 - Database Systems
</div>
<div class="slides">
<section>
<h1>Extended RA</h1>
<h3>CSE 4/562 Database Systems</h3>
<h5>February 12, 2018</h5>
</section>
<section>
<section>
<h3>Extended Relational Algebra</h3>
<dl>
<dt class="fragment highlight-grey" data-fragment-index="1">Set/Bag Operations</dt>
<dd class="fragment highlight-grey" data-fragment-index="1">Select ($\sigma$), Project ($\pi$), Join ($\bowtie$), Union ($\cup$)</dd>
<dt>Bag Operations</dt>
<dd>Distinct ($\delta$), Outer Joins (⟗)</dd>
<dt>List Operations</dt>
<dd>Sort ($\tau$), Limit</dd>
<dt>Arithmetic Operations</dt>
<dd>Extended Projection ($\pi$), Aggregation ($\Sigma$), Grouping ($\gamma$)</dd>
</dl>
<dl>
</dl>
</section>
<section>
<h3>Extended Projection</h3>
<p style="margin: 50px;">Like normal projection, but can create new columns</p>
$\pi_{M \leftarrow A+B*C,\; N \leftarrow 2}(R)$
<p>produces 1 row for every row of R, with 2 columns: M and N</p>
</section>
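<section data-markdown>
<script type="text/template">
A minimal sketch of extended projection, assuming a relation is a list of dicts. The helper `extended_project` and the sample rows of `R` are hypothetical, chosen only to illustrate the expression on the previous slide.

```python
def extended_project(rows, columns):
    """Extended projection: `columns` maps each output column name to a
    function that computes its value from one input row (a dict)."""
    return [
        {name: expr(row) for name, expr in columns.items()}
        for row in rows
    ]

# Hypothetical sample relation R(A, B, C)
R = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 4, "B": 5, "C": 6},
]

result = extended_project(R, {
    "M": lambda r: r["A"] + r["B"] * r["C"],  # M <- A + B*C
    "N": lambda r: 2,                         # N <- 2 (a constant column)
})
# result has one row per row of R, each with exactly the columns M and N
```
</script>
</section>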
</section>
<section>
<section>
<h3>Outer Join</h3>
<p class="fragment">... but first</p>
</section>
<section>
<h3><code>NULL</code> Values</h3>
<ul>
<li>Field values can be unknown or inapplicable.<ul>
<li>A tree with an unknown species.</li>
<li>The street of a tree in a park.</li>
<li>The '.' on many of your ID cards</li>
</ul></li>
<li>SQL provides a special <code>NULL</code> value for these cases</li>
</ul>
<p class="fragment">NULL makes things more complicated.</p>
</section>
<section>
$$\textbf{Trees.SPC_COMMON} = \texttt{'Brooklyn'}$$
<p style="margin: 50px;" class="fragment">
What happens if <b>Trees.SPC_COMMON</b> is NULL?
</p>
<div class="fragment">
$$\texttt{NULL} = \texttt{'Brooklyn'} \equiv \textbf{Unknown}$$
</div>
</section>
<section>
<table>
<tr><td>Unknown</td><td>AND</td><td>Unknown</td><td>$\equiv$</td><td>Unknown</td></tr>
<tr><td>Unknown</td><td>AND</td><td>True</td><td>$\equiv$</td><td>Unknown</td></tr>
<tr><td>Unknown</td><td>AND</td><td>False</td><td>$\equiv$</td><td>False</td></tr>
<tr><td colspan="5"></td></tr>
<tr><td>Unknown</td><td>OR</td><td>Unknown</td><td>$\equiv$</td><td>Unknown</td></tr>
<tr><td>Unknown</td><td>OR</td><td>True</td><td>$\equiv$</td><td>True</td></tr>
<tr><td>Unknown</td><td>OR</td><td>False</td><td>$\equiv$</td><td>Unknown</td></tr>
<tr><td colspan="5"></td></tr>
<tr><td></td><td>NOT</td><td>Unknown</td><td>$\equiv$</td><td>Unknown</td></tr>
</table>
<p class="fragment"><code>WHERE</code> clauses eliminate all non-True rows</p>
</section>
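<section data-markdown>
<script type="text/template">
The truth table can be sketched in code, assuming Python's `None` stands in for Unknown. The helper names (`and3`, `or3`, `not3`, `eq3`, `where`) and the sample rows are hypothetical.

```python
def and3(a, b):
    if a is False or b is False:
        return False          # False dominates AND, even next to Unknown
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True           # True dominates OR, even next to Unknown
    if a is None or b is None:
        return None
    return False

def not3(a):
    return None if a is None else not a

def eq3(a, b):
    """SQL-style equality: any comparison with NULL is Unknown."""
    return None if a is None or b is None else a == b

def where(rows, predicate):
    """Keep only rows whose predicate is True (Unknown and False are dropped)."""
    return [row for row in rows if predicate(row) is True]

trees = [
    {"ID": 1, "SPC_COMMON": "Brooklyn"},
    {"ID": 2, "SPC_COMMON": None},          # unknown species
    {"ID": 3, "SPC_COMMON": "red maple"},
]
matches = where(trees, lambda t: eq3(t["SPC_COMMON"], "Brooklyn"))
non_matches = where(trees, lambda t: not3(eq3(t["SPC_COMMON"], "Brooklyn")))
```

Tree 2 appears in neither `matches` nor `non_matches`: both the condition and its negation are Unknown for it.
</script>
</section>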
<section>
$$Streets \bowtie_{StreetName} Trees$$
<p style="margin-top: 100px;" class="fragment">What happens if some streets have no trees?</p>
</section>
<section>
<h3>Outer Join (⟗, ⟕, ⟖)</h3>
<ol>
<li>Include all results from the normal (inner) join.</li>
<li>Also include rows that don't get joined.</li>
</ol>
</section>
<section>
<h3>Outer Join</h3>
<dl>
<dt>Inner Join</dt>
<dd>Normal, plain, simple join</dd>
<dt>Left Outer Join (⟕)</dt>
<dd>Include un-joined rows from the left hand side</dd>
<dt>Right Outer Join (⟖)</dt>
<dd>Include un-joined rows from the right hand side</dd>
<dt>[Full] Outer Join (⟗)</dt>
<dd>Include un-joined rows from either side</dd>
</dl>
</section>
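<section data-markdown>
<script type="text/template">
One way to sketch the four join variants is a nested-loop equi-join that pads un-joined rows with `None` (standing in for `NULL`). The function `outer_join`, its flags, and the sample `streets` / `trees` rows are hypothetical.

```python
def outer_join(left, right, key, keep_left=False, keep_right=False):
    """Equi-join two lists of dicts on `key`. The flags choose whether
    un-joined rows from either side are kept, padded with None."""
    left_cols = {c for row in left for c in row}
    right_cols = {c for row in right for c in row}
    out, matched_right = [], set()
    for l in left:
        found = False
        for i, r in enumerate(right):
            # A NULL key never matches anything, as in SQL
            if l[key] is not None and l[key] == r[key]:
                out.append({**l, **r})
                matched_right.add(i)
                found = True
        if keep_left and not found:
            out.append({**dict.fromkeys(right_cols), **l})
    if keep_right:
        for i, r in enumerate(right):
            if i not in matched_right:
                out.append({**dict.fromkeys(left_cols), **r})
    return out

streets = [{"StreetName": "Main St", "Borough": "Brooklyn"},
           {"StreetName": "Elm St", "Borough": "Queens"}]
trees = [{"StreetName": "Main St", "Species": "oak"},
         {"StreetName": "Park Rd", "Species": "maple"}]

inner = outer_join(streets, trees, "StreetName")
left = outer_join(streets, trees, "StreetName", keep_left=True)
right = outer_join(streets, trees, "StreetName", keep_right=True)
full = outer_join(streets, trees, "StreetName", keep_left=True, keep_right=True)
```

Here `left` keeps Elm St with `Species` set to `None`, which is how a street with no trees survives the join.
</script>
</section>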
</section>
<section>
<section>
<h3>Sort / Limit</h3>
<p>
$$\tau_{A}(R)$$
The tuples of $R$ in ascending order of attribute $A$
</p>
<p>
$$\textbf{L}_{n}(R)$$
The first $n$ tuples of $R$
<span style="display: block; font-size: 60%">(Typically combined with sort; without one, which $n$ tuples come back is arbitrary.)</span>
</p>
</section>
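<section data-markdown>
<script type="text/template">
A small in-memory sketch of sort and limit, assuming rows are dicts. The helpers `sort_by`, `limit`, and `top_n` are hypothetical names; `heapq.nsmallest` is used as one way to combine the two operators without sorting everything.

```python
import heapq

def sort_by(rows, attr):
    """tau_attr(rows): ascending order on one attribute."""
    return sorted(rows, key=lambda row: row[attr])

def limit(rows, n):
    """L_n(rows): the first n rows, in whatever order rows arrive."""
    return rows[:n]

def top_n(rows, attr, n):
    """Limit over sort in one step, keeping only n candidates at a time."""
    return heapq.nsmallest(n, rows, key=lambda row: row[attr])

R = [{"A": 3}, {"A": 1}, {"A": 2}]
first_two = limit(sort_by(R, "A"), 2)
```

When a limit sits directly above a sort, `top_n` returns the same rows as `limit(sort_by(...))`.
</script>
</section>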
<section>
<h3>Sort</h3>
<p style="margin-top: 100px;">
Pick your favorite sort algorithm.
</p>
<p class="fragment">What happens if you don't have enough memory?</p>
</section>
<section>
<p><b>Key Idea:</b> Merging 2 sorted lists requires $O(1)$ memory.</p>
</section>
<section>
<h3>2-Way Sort</h3>
<dl>
<dt>Pass 1</dt>
<dd>Create lots of (small) sorted lists.</dd>
<dt>Pass 2+</dt>
<dd>Merge sorted lists of size $N$ into sorted lists of size $2N$</dd>
</dl>
</section>
<section>
<h3>Pass 1: Create Sorted Runs</h3>
<svg data-src="graphics/2018-02-14-Sort-Run.svg" style="margin: 100px;" />
</section>
<section>
<h3>Pass 2: Merge Sorted Runs</h3>
<svg data-src="graphics/2018-02-14-Merge-Run.svg" style="margin: 100px;" />
</section>
<section>
<p>Repeat Pass 2 As Needed.</p>
</section>
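<section data-markdown>
<script type="text/template">
The two passes can be simulated in memory. This is only a sketch: Python lists stand in for on-disk runs, and `merge_two`, `two_way_sort`, and `run_size` are hypothetical names.

```python
def merge_two(a, b):
    """Merge two sorted lists, looking only at the current head of each."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])   # at most one of these two tails is non-empty
    out.extend(b[j:])
    return out

def two_way_sort(records, run_size):
    # Pass 1: cut the input into small chunks and sort each one
    runs = [sorted(records[i:i + run_size])
            for i in range(0, len(records), run_size)]
    passes = 1
    # Pass 2+: merge runs pairwise until a single run remains
    while len(runs) > 1:
        runs = [merge_two(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
        passes += 1
    return (runs[0] if runs else []), passes

data = [9, 3, 7, 1, 8, 2, 6, 4, 5]
result, passes = two_way_sort(data, run_size=2)
```

With 9 records and runs of 2, pass 1 creates 5 runs, and three merge passes (5, then 3, then 2 runs) reduce them to one.
</script>
</section>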
<section>
<p>What's the bottleneck?</p>
<p class="fragment">IO Cost: $O(N \cdot \lceil\log_2(N)\rceil)$<br/>(with $N$ blocks)</p>
</section>
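<section data-markdown>
<script type="text/template">
A rough way to count the IO behind that bound. Two assumptions here are not from the slides: pass 1 produces 1-block runs, and every pass reads and writes each block once (the constant 2 that big-O hides). The function names are hypothetical.

```python
def merge_passes(num_runs, fan_in=2):
    """Number of merge passes needed to reduce num_runs runs to one."""
    passes = 0
    while num_runs > 1:
        num_runs = -(-num_runs // fan_in)   # ceiling division
        passes += 1
    return passes

def sort_io_cost(num_blocks, fan_in=2):
    """Blocks read plus blocks written over the whole sort."""
    total_passes = 1 + merge_passes(num_blocks, fan_in)   # pass 1 + merges
    return 2 * num_blocks * total_passes

cost = sort_io_cost(8)   # 8 blocks: 1 run-creation pass + 3 merge passes
```

The pass count grows with the logarithm of the number of blocks, while each pass touches all of them.
</script>
</section>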
<section>
<svg data-src="graphics/2018-02-14-Merge-Tree.svg" />
</section>
<section>
<h3>Using More Memory</h3>
<dl>
<dt>Pass 1</dt>
<dd>Sort Bigger Buffers</dd>
<dd class="fragment">Re-Use Memory When Done</dd>
<dt>Pass 2</dt>
<dd>Merge $K$ Runs Simultaneously: $O(N \cdot \lceil\log_K(N)\rceil)$ IO</dd>
</dl>
</section>
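<section data-markdown>
<script type="text/template">
A sketch of merging K runs at a time, using the standard library's `heapq.merge` as the K-way merge. In-memory lists again stand in for on-disk runs, and `k_way_sort`, `run_size`, and `k` are hypothetical names.

```python
import heapq

def k_way_sort(records, run_size, k):
    # Pass 1: sorted runs of run_size records each
    runs = [sorted(records[i:i + run_size])
            for i in range(0, len(records), run_size)]
    merge_passes = 0
    # Pass 2+: merge up to k runs at once; heapq.merge keeps one
    # candidate per input run in a small heap
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + k]))
                for i in range(0, len(runs), k)]
        merge_passes += 1
    return (runs[0] if runs else []), merge_passes

data = list(range(20, 0, -1))              # 20 records, reverse order
_, passes_k2 = k_way_sort(data, run_size=2, k=2)
result, passes_k4 = k_way_sort(data, run_size=2, k=4)
```

Starting from 10 runs, merging 2 at a time takes 4 merge passes (10, 5, 3, 2, 1 runs), while merging 4 at a time takes 2 (10, 3, 1 runs).
</script>
</section>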
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-1.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-2.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-3.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-4.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-5.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-6.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-7.svg" />
</section>
<section>
<h3>Replacement Sort</h3>
<img src="graphics/2018-02-14-ReplSort-8.svg" />
</section>
<section>
<p>On average, we'll get runs of size $2 \cdot |WS|$</p>
</section>
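<section data-markdown>
<script type="text/template">
A sketch of replacement sort (often called replacement selection), assuming the working set is a min-heap of `ws_size` records. The name `replacement_selection` and the sample inputs are hypothetical, and the figures on the preceding slides may step through it slightly differently.

```python
import heapq
import random
from itertools import islice

_DONE = object()   # sentinel marking the end of the input

def replacement_selection(records, ws_size):
    """Split records into sorted runs using a working set of ws_size records."""
    if ws_size < 1:
        raise ValueError("ws_size must be at least 1")
    it = iter(records)
    current = list(islice(it, ws_size))   # records eligible for this run
    heapq.heapify(current)
    frozen = []                           # records held back for the next run
    runs, run = [], []
    while current:
        smallest = heapq.heappop(current)
        run.append(smallest)
        incoming = next(it, _DONE)
        if incoming is not _DONE:
            if incoming >= smallest:
                heapq.heappush(current, incoming)   # still fits in this run
            else:
                frozen.append(incoming)             # too small, wait for next run
        if not current:                   # this run is finished
            runs.append(run)
            run = []
            current, frozen = frozen, []
            heapq.heapify(current)
    return runs

small = replacement_selection([5, 1, 9, 2, 8, 3, 7], ws_size=2)

random.seed(0)
sample = [random.random() for _ in range(10_000)]
avg_run = len(sample) / len(replacement_selection(sample, 100))
```

On the small input, a working set of 2 produces the runs `[1, 5, 9]` and `[2, 3, 7, 8]`, both longer than the working set. For the random sample, `avg_run` should land near twice the working-set size.
</script>
</section>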
</section>
</div></div>
<script src="../reveal.js-3.6.0/js/reveal.js"></script>
<script>
// Full list of configuration options available at:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
controls: false,
progress: true,
history: true,
center: true,
slideNumber: true,
transition: 'fade', // none/fade/slide/convex/concave/zoom
chart: {
defaults: {
global: {
title: { fontColor: "#333", fontSize: 24 },
legend: {
labels: { fontColor: "#333", fontSize: 20 },
},
responsiveness: true
},
scale: {
scaleLabel: { fontColor: "#333", fontSize: 20 },
gridLines: { color: "#333", zeroLineColor: "#333" },
ticks: { fontColor: "#333", fontSize: 16 },
}
},
line: { borderColor: [ "rgba(20,220,220,.8)" , "rgba(220,120,120,.8)", "rgba(20,120,220,.8)" ], "borderDash": [ [5,10], [0,0] ]},
bar: { backgroundColor: [
"rgba(220,220,220,0.8)",
"rgba(151,187,205,0.8)",
"rgba(205,151,187,0.8)",
"rgba(187,205,151,0.8)"
]
},
pie: { backgroundColor: [ ["rgba(0,0,0,.8)" , "rgba(220,20,20,.8)", "rgba(20,220,20,.8)", "rgba(220,220,20,.8)", "rgba(20,20,220,.8)"] ]},
radar: { borderColor: [ "rgba(20,220,220,.8)" , "rgba(220,120,120,.8)", "rgba(20,120,220,.8)" ]},
},
// Optional reveal.js plugins
dependencies: [
{ src: '../reveal.js-3.6.0/lib/js/classList.js', condition: function() { return !document.body.classList; } },
{ src: '../reveal.js-3.6.0/plugin/math/math.js',
condition: function() { return true; },
mathjax: '../reveal.js-3.6.0/js/MathJax.js'
},
{ src: '../reveal.js-3.6.0/plugin/markdown/marked.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: '../reveal.js-3.6.0/plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: '../reveal.js-3.6.0/plugin/highlight/highlight.js', async: true, condition: function() { return !!document.querySelector( 'pre code' ); }, callback: function() { hljs.initHighlightingOnLoad(); } },
{ src: '../reveal.js-3.6.0/plugin/zoom-js/zoom.js', async: true },
{ src: '../reveal.js-3.6.0/plugin/notes/notes.js', async: true },
// Chart.min.js
{ src: '../reveal.js-3.6.0/plugin/chart/Chart.min.js'},
// the plugin
{ src: '../reveal.js-3.6.0/plugin/chart/csv2chart.js'},
{ src: '../reveal.js-3.6.0/plugin/svginline/es6-promise.auto.js', async: false },
{ src: '../reveal.js-3.6.0/plugin/svginline/data-src-svg.js', async: false }
]
});
</script>
</body>
</html>