Finishing slides

Oliver Kennedy 2018-02-05 10:39:46 -05:00
parent 4f5dc3dabe
commit cee46ab57b
2 changed files with 273 additions and 8 deletions


@ -126,13 +126,13 @@
<p style="font-size: smaller;" class="fragment">What is the ID, Common Name and Borough of Trees in Brooklyn?</p>
<table style="font-size: small;" class="fragment">
<tr><th>TREE_ID</th><th>SPC_COMMON</th><th>BORONAME</th></tr>
<tr><td>204026</td><td>'honeylocust'</td><td>'Brooklyn'</td></tr>
<tr><td>204337</td><td>'honeylocust'</td><td>'Brooklyn'</td></tr>
<tr><td>189565</td><td>'American linden'</td><td>'Brooklyn'</td></tr>
<tr><td>192755</td><td>'London planetree'</td><td>'Brooklyn'</td></tr>
<tr><td>189465</td><td>'London planetree'</td><td>'Brooklyn'</td></tr>
<tr><td style="font-weight: bold;" colspan="3">... and 177287 more</td></tr>
</table>
</section>


@ -29,6 +29,8 @@
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<script src="../reveal.js-3.6.0/lib/js/head.min.js"></script>
<!--[if lt IE 9]>
<script src="../reveal.js-3.6.0/lib/js/html5shiv.js"></script>
<![endif]-->
@ -298,11 +300,274 @@ For each week:
</table>
<p class="fragment" data-fragment-index="5">First we focus on sets and bags.</p>
</section>
<section>
<h3>Selection ($\sigma_{c}$)</h3>
<p>Delete rows that fail the condition $c$.</p>
<div class="fragment">
$$\sigma_{(BORONAME = \texttt{'Brooklyn'})} \textbf{Trees}$$
<table style="font-size: small; margin-top: 30px;">
<tr><th>TREE_ID</th><th>SPC_COMMON</th><th>BORONAME</th><th>...</th></tr>
<tr><td>204026</td><td>'honeylocust'</td><td>'Brooklyn'</td><td>...</td></tr>
<tr><td>204337</td><td>'honeylocust'</td><td>'Brooklyn'</td><td>...</td></tr>
<tr><td>189565</td><td>'American linden'</td><td>'Brooklyn'</td><td>...</td></tr>
<tr><td>192755</td><td>'London planetree'</td><td>'Brooklyn'</td><td>...</td></tr>
<tr><td>189465</td><td>'London planetree'</td><td>'Brooklyn'</td><td>...</td></tr>
<tr><td style="font-weight: bold;" colspan="4">... and 177287 more</td></tr>
</table>
</div>
</section>
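<section data-markdown>
<script type="text/template">
A minimal Python sketch of selection, assuming rows are dictionaries keyed by attribute name. The `select` helper and the Queens row are made up for illustration; the two Brooklyn rows come from the table on the previous slide.

```python
# Sample of Trees; the last row is hypothetical, added so the filter drops something.
trees = [
    {"TREE_ID": 204026, "SPC_COMMON": "honeylocust", "BORONAME": "Brooklyn"},
    {"TREE_ID": 189565, "SPC_COMMON": "American linden", "BORONAME": "Brooklyn"},
    {"TREE_ID": 1, "SPC_COMMON": "pin oak", "BORONAME": "Queens"},
]

def select(condition, relation):
    """sigma_c(R): keep only the rows for which condition(row) is true."""
    return [row for row in relation if condition(row)]

brooklyn = select(lambda row: row["BORONAME"] == "Brooklyn", trees)
print([row["TREE_ID"] for row in brooklyn])  # → [204026, 189565]
```

Selection never changes the schema: every surviving row keeps all of its attributes.
</script>
</section>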
<section>
<h3>Projection ($\pi_{A}$)</h3>
<p>Delete attributes not in the projection list $A$.</p>
<div class="fragment">
$$\pi_{BORONAME}(Trees)$$
<table style="font-size: small; margin-top: 30px;">
<tr><th>BORONAME</th></tr>
<tr><td>Queens</td></tr>
<tr><td>Brooklyn</td></tr>
<tr><td>Manhattan</td></tr>
<tr><td>Bronx</td></tr>
<tr><td>Staten Island</td></tr>
</table>
</div>
<p class="fragment">Only 5 results... not 683788?</p>
<p class="fragment">Set and Bag Projection are different</p>
</section>
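<section data-markdown>
<script type="text/template">
One way to see why set and bag projection differ, sketched in Python. Rows are plain tuples, the sample data is made up, and `project_bag` / `project_set` are hypothetical helper names.

```python
# Made-up sample rows: (TREE_ID, SPC_COMMON, BORONAME).
trees = [
    (1, "honeylocust", "Brooklyn"),
    (2, "honeylocust", "Brooklyn"),
    (3, "London planetree", "Manhattan"),
]
BORONAME = 2  # column position of BORONAME in this sketch

def project_bag(columns, relation):
    """Bag projection: one output row per input row, duplicates kept."""
    return [tuple(row[c] for c in columns) for row in relation]

def project_set(columns, relation):
    """Set projection: the same rows with duplicates eliminated."""
    return set(project_bag(columns, relation))

print(project_bag([BORONAME], trees))
# → [('Brooklyn',), ('Brooklyn',), ('Manhattan',)]
print(sorted(project_set([BORONAME], trees)))
# → [('Brooklyn',), ('Manhattan',)]
```

Dropping TREE_ID is what makes previously distinct rows collide.
</script>
</section>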
<section>
<h3>Reminder: Queries are Relations</h3>
<p>What are these queries' schemas?</p>
<div class="fragment" style="margin-top: 50px;">$$\pi_{TREE\_ID, SPC\_COMMON, BORONAME} \textbf{Trees}$$</div>
<div class="fragment" style="margin-top: 50px;">$$\sigma_{(BORONAME = \texttt{'Brooklyn'})} \textbf{Trees}$$</div>
<div class="fragment" style="margin-top: 50px;">$$\sigma_{(BORONAME = \texttt{'Brooklyn'})}(\pi_{TREE\_ID, SPC\_COMMON, BORONAME} \textbf{Trees})$$</div>
</section>
<section>
<h3>Union ($\cup$)</h3>
<p style="margin-bottom: 0px;">Takes two relations that are <u>union-compatible</u>...</p>
<p style="font-size: 60%; margin-top: 0px;">(Both relations have the same number of fields with the same types)</p>
<p>... and returns all tuples appearing in <u>either</u> relation</p>
<div class="fragment" style="font-size: 70%;">
$$(\sigma_{(BORONAME=\texttt{'Brooklyn'})} \textbf{Trees}) \cup (\sigma_{(BORONAME=\texttt{'Manhattan'})} \textbf{Trees})$$
</div>
<p class="fragment">We use $\uplus$ if we explicitly mean <i>bag</i> union</p>
</section>
<section>
<h3>Intersection ($\cap$)</h3>
<p>Return all tuples appearing in <u>both</u> <br/>of two union-compatible relations</p>
<div class="fragment" style="font-size: 70%;">
$$\pi_{SPC\_COMMON} (\sigma_{(BORONAME=\texttt{'Brooklyn'})} \textbf{Trees}) \\ ~~~~~~~~~\cap \pi_{SPC\_COMMON} (\sigma_{(BORONAME=\texttt{'Manhattan'})} \textbf{Trees})$$
<p>What is this query asking?</p>
</div>
</section>
<section>
<h3>Set Difference ($-$)</h3>
<p>Return all tuples appearing in the first, but not the second<br/>of two union-compatible relations</p>
<div class="fragment" style="font-size: 70%;">
$$\pi_{SPC\_COMMON} (\sigma_{(BORONAME=\texttt{'Brooklyn'})} \textbf{Trees}) \\ ~~~~~~~~~- \pi_{SPC\_COMMON} (\sigma_{(BORONAME=\texttt{'Manhattan'})} \textbf{Trees})$$
<p>What is this query asking?</p>
</div>
</section>
<section>
<h3>Union, Intersection, Set Difference</h3>
<p style="margin-top: 100px;">What is the schema of the result of any of these operators?</p>
</section>
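<section data-markdown>
<script type="text/template">
A rough Python analogue of the three operators, using built-in sets of one-column tuples. The two species sets are small illustrative stand-ins for the projected Brooklyn and Manhattan relations, and `Counter` is used here only to mimic bag union.

```python
from collections import Counter

# Illustrative one-column relations over SPC_COMMON.
brooklyn = {("honeylocust",), ("American linden",), ("London planetree",)}
manhattan = {("honeylocust",), ("American linden",), ("pin oak",)}

print(sorted(brooklyn | manhattan))  # union: species in either borough
print(sorted(brooklyn & manhattan))  # intersection: species in both
print(sorted(brooklyn - manhattan))  # difference: in Brooklyn but not Manhattan

# Bag union adds multiplicities instead of merging duplicates.
bag_union = Counter(brooklyn) + Counter(manhattan)
print(bag_union[("honeylocust",)])  # → 2
```

In every case the result has the same single-column schema as the inputs, which is why union-compatibility is required.
</script>
</section>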
<section>
<h3>Cross (Cartesian) Product ($\times$)</h3>
<p>Create all pairs of tuples.</p>
<div class="fragment" data-fragment-index="1">
<div style="font-size: 70%">
$$\pi_{SPC\_COMMON, BORONAME} (\textbf{Trees}) \times \pi_{SPC\_COMMON, AVG\_HEIGHT} (\textbf{TreeInfo})$$
</div>
<table style="font-size: small; margin-top: 30px; display: inline-block; vertical-align: middle; margin-right: 50px;">
<tr><th>SPC_COMMON</th><th>AVG_HEIGHT</th></tr>
<tr class="fragment highlight-current-blue" data-fragment-index="2"><td>cedar elm</td><td>60</td></tr>
<tr class="fragment highlight-current-blue" data-fragment-index="3"><td>lacebark elm</td><td>45</td></tr>
<tr><td colspan="2" style="font-weight: bold;">... and more</td></tr>
</table>
<table style="font-size: small; margin-top: 30px; display: inline-block; vertical-align: middle;">
<tr><th>SPC_COMMON</th><th>BORONAME</th><th>SPC_COMMON</th><th>AVG_HEIGHT</th></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="3"><td>'honeylocust'</td><td>'Brooklyn'</td><td class="fragment highlight-current-blue" data-fragment-index="2">cedar elm</td><td>60</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="3"><td>'honeylocust'</td><td>'Brooklyn'</td><td class="fragment highlight-current-blue" data-fragment-index="2">cedar elm</td><td>60</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="3"><td>'American linden'</td><td>'Brooklyn'</td><td class="fragment highlight-current-blue" data-fragment-index="2">cedar elm</td><td>60</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="3"><td>'London planetree'</td><td>'Manhattan'</td><td class="fragment highlight-current-blue" data-fragment-index="2">cedar elm</td><td>60</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="3"><td>'London planetree'</td><td>'Manhattan'</td><td class="fragment highlight-current-blue" data-fragment-index="2">cedar elm</td><td>60</td></tr>
<tr><td style="font-weight: bold;" colspan="4">...</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="2"><td>'honeylocust'</td><td>'Brooklyn'</td><td class="fragment highlight-current-blue" data-fragment-index="3">lacebark elm</td><td>45</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="2"><td>'honeylocust'</td><td>'Brooklyn'</td><td class="fragment highlight-current-blue" data-fragment-index="3">lacebark elm</td><td>45</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="2"><td>'American linden'</td><td>'Brooklyn'</td><td class="fragment highlight-current-blue" data-fragment-index="3">lacebark elm</td><td>45</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="2"><td>'London planetree'</td><td>'Manhattan'</td><td class="fragment highlight-current-blue" data-fragment-index="3">lacebark elm</td><td>45</td></tr>
<tr class="fragment highlight-current-grey" data-fragment-index="2"><td>'London planetree'</td><td>'Manhattan'</td><td class="fragment highlight-current-blue" data-fragment-index="3">lacebark elm</td><td>45</td></tr>
<tr><td style="font-weight: bold;" colspan="4">... and more</td></tr>
</table>
</div>
</section>
<section>
<h3>Cross (Cartesian) Product ($\times$)</h3>
<div style="font-size: 70%; margin-top: 50px;">
$$\pi_{SPC\_COMMON,\ BORONAME} (\textbf{Trees}) \times \pi_{SPC\_COMMON,\ AVG\_HEIGHT} (\textbf{TreeInfo})$$
</div>
<p style="margin-top: 50px;">What is the schema of the resulting relation?</p>
<p class="fragment">The relation has a naming conflict<br/>(two attributes with the same name)</p>
</section>
<section>
<h3>Renaming ($\rho$)</h3>
<div style="font-size: 50%; margin-top: 50px;">
$$\rho_{TNAME,\ BORO,\ INAME,\ HEIGHT}\left( \pi_{SPC\_COMMON,\ BORONAME} (\textbf{Trees}) \times \pi_{SPC\_COMMON,\ AVG\_HEIGHT} (\textbf{TreeInfo})\right)$$
</div>
<p style="margin-top: 50px;">What is the schema of the resulting relation?</p>
<p style="margin-top: 50px;" class="fragment">When writing cross-products on the board,<br/>I will use implicit renaming</p>
</section>
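<section data-markdown>
<script type="text/template">
A small Python sketch of cross product followed by renaming, assuming rows are tuples and a schema is just a tuple of attribute names. The sample rows are made up; the new names mirror the ones used in the renaming expression.

```python
from itertools import product

# Made-up rows: (SPC_COMMON, BORONAME) and (SPC_COMMON, AVG_HEIGHT).
trees = [("honeylocust", "Brooklyn"), ("London planetree", "Manhattan")]
tree_info = [("cedar elm", 60), ("lacebark elm", 45)]

# R x S: concatenate every row of R with every row of S.
pairs = [t + i for t, i in product(trees, tree_info)]
print(len(pairs))  # → 4

# rho: attach fresh, unambiguous names to the four columns.
schema = ("TNAME", "BORO", "INAME", "HEIGHT")
renamed = [dict(zip(schema, row)) for row in pairs]
print(renamed[0])
# → {'TNAME': 'honeylocust', 'BORO': 'Brooklyn', 'INAME': 'cedar elm', 'HEIGHT': 60}
```

Without the rename, the two SPC_COMMON columns could only be told apart by position.
</script>
</section>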
<section>
<h3>Join ($\bowtie_c$)</h3>
<p>Pair tuples according to a condition $c$.</p>
<div style="font-size: 50%; margin-top: 50px;">
$$\pi_{SPC\_COMMON,\ BORONAME} (\textbf{Trees}) \bowtie_{T.SPC\_COMMON = TI.SPC\_COMMON} \pi_{SPC\_COMMON,\ AVG\_HEIGHT} (\textbf{TreeInfo})$$
</div>
<div class="fragment">
<div style="font-size: 50%; margin-top: 50px;">
Identical to...
$$\sigma_{T.SPC\_COMMON = TI.SPC\_COMMON}\left(\pi_{SPC\_COMMON,\ BORONAME} (\textbf{Trees}) \times \pi_{SPC\_COMMON,\ AVG\_HEIGHT} (\textbf{TreeInfo})\right)$$
</div>
</div>
<div class="fragment" style="margin-top: 100px">
$$R \bowtie_c S \equiv \sigma_c(R \times S)$$
</div>
</section>
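<section data-markdown>
<script type="text/template">
The equivalence can be written out literally in Python: a join is a filter applied to a cross product. This is only a sketch with made-up rows and hypothetical helper names (`cross`, `join`); real systems avoid materializing the full product.

```python
from itertools import product

# Made-up rows: (SPC_COMMON, BORONAME) and (SPC_COMMON, AVG_HEIGHT).
trees = [("cedar elm", "Brooklyn"), ("honeylocust", "Brooklyn"), ("lacebark elm", "Manhattan")]
tree_info = [("cedar elm", 60), ("lacebark elm", 45)]

def cross(r, s):
    """R x S: every row of R concatenated with every row of S."""
    return [t + u for t, u in product(r, s)]

def join(r, s, condition):
    """R join_c S, written literally as sigma_c(R x S)."""
    return [row for row in cross(r, s) if condition(row)]

# After concatenation, columns 0 and 2 hold the two SPC_COMMON attributes.
result = join(trees, tree_info, lambda row: row[0] == row[2])
print(result)
# → [('cedar elm', 'Brooklyn', 'cedar elm', 60), ('lacebark elm', 'Manhattan', 'lacebark elm', 45)]
```
</script>
</section>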
<section>
<h3>Join Shorthands</h3>
<p><b>Equi-joins</b> are joins with only equality tests in the condition.</p>
<dl>
<dt class="fragment" data-fragment-index="1">Join on attribute(s)</dt>
<dd class="fragment" data-fragment-index="1">$R \bowtie_{A} S \equiv R \bowtie_{R.A = S.A} S$</dd>
<dd class="fragment" data-fragment-index="1">Same values on the listed attributes</dd>
<dt class="fragment" data-fragment-index="2">Natural Join</dt>
<dd class="fragment" data-fragment-index="2">$R \bowtie S \equiv R \bowtie_{attrs(R) \cap attrs(S)} S$</dd>
<dd class="fragment" data-fragment-index="2">Same values on all shared attributes</dd>
</dl>
</section>
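<section data-markdown>
<script type="text/template">
A possible Python reading of the natural-join shorthand, assuming rows are dictionaries so that attribute names are available at run time. `natural_join` is a hypothetical helper and the rows are made up.

```python
# Made-up rows; dictionaries make the attribute names explicit.
trees = [
    {"SPC_COMMON": "cedar elm", "BORONAME": "Brooklyn"},
    {"SPC_COMMON": "honeylocust", "BORONAME": "Brooklyn"},
]
tree_info = [
    {"SPC_COMMON": "cedar elm", "AVG_HEIGHT": 60},
    {"SPC_COMMON": "lacebark elm", "AVG_HEIGHT": 45},
]

def natural_join(r, s):
    """Pair rows that agree on every attribute name the two relations share."""
    if not r or not s:
        return []
    shared = set(r[0]) & set(s[0])  # attrs(R) intersected with attrs(S)
    return [
        {**t, **u}  # shared attributes appear once in the output
        for t in r
        for u in s
        if all(t[a] == u[a] for a in shared)
    ]

print(natural_join(trees, tree_info))
# → [{'SPC_COMMON': 'cedar elm', 'BORONAME': 'Brooklyn', 'AVG_HEIGHT': 60}]
```

Because shared attributes are merged, the naming conflict from the cross product does not arise.
</script>
</section>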
<section>
<h4>Which operators can create duplicates?</h4>
<p style="font-size: 60%">(Which operators behave differently in Set- and Bag-RA?)</p>
<table>
<tr ><th>Operator </th><th>Symbol </th><th >Duplicates?</th></tr>
<tr ><td>Selection </td><td>$\sigma$ </td><td class="fragment" style="color: darkred;">No</td></tr>
<tr class="fragment"><td>Projection </td><td>$\pi$ </td><td class="fragment" style="color: darkgreen;">Yes</td></tr>
<tr class="fragment"><td>Cross-product </td><td>$\times$ </td><td class="fragment" style="color: darkred;">No</td></tr>
<tr class="fragment"><td>Set-difference</td><td>$-$ </td><td class="fragment" style="color: darkred;">No</td></tr>
<tr class="fragment"><td>Union </td><td>$\cup$ </td><td class="fragment" style="color: darkgreen;">Yes</td></tr>
<tr class="fragment"><td>Join </td><td>$\bowtie$</td><td class="fragment" style="color: darkred;">No</td></tr>
</table>
</section>
</section>
<section>
<h3>Group Work</h3>
<p>Find the <b>BORONAME</b>s of all boroughs that <b>do</b> have trees with an average height below 45 inches</p>
<table style="font-size: small; margin-top: 30px; display: inline-block; vertical-align: middle; margin-right: 50px;">
<tr><th>SPC_COMMON</th><th>AVG_HEIGHT</th></tr>
<tr><td>cedar elm</td><td>60</td></tr>
<tr><td>lacebark elm</td><td>45</td></tr>
<tr><td colspan="2" style="font-weight: bold;">... and more</td></tr>
</table>
<table style="font-size: small; margin-top: 30px; display: inline-block; vertical-align: middle;">
<tr><th>SPC_COMMON</th><th>BORONAME</th></tr>
<tr><td>'honeylocust'</td><td>'Brooklyn'</td></tr>
<tr><td>'honeylocust'</td><td>'Brooklyn'</td></tr>
<tr><td>'American linden'</td><td>'Brooklyn'</td></tr>
<tr><td>'London planetree'</td><td>'Manhattan'</td></tr>
<tr><td>'London planetree'</td><td>'Manhattan'</td></tr>
<tr><td style="font-weight: bold;" colspan="2">... and more</td></tr>
</table>
<div class="fragment">
$$\pi_{BORONAME}(\sigma_{AVG\_HEIGHT < 45}(\textbf{Trees}\bowtie\textbf{TreeInfo}))$$
</div>
<div class="fragment">
$$\pi_{BORONAME}(\textbf{Trees}\bowtie\sigma_{AVG\_HEIGHT < 45}(\textbf{TreeInfo}))$$
</div>
</section>
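<section data-markdown>
<script type="text/template">
Both answers return the same boroughs; they differ only in when the height filter runs. A quick Python check under made-up data (the "crab apple" row and its height are invented for illustration, and `join_on_species` is a hypothetical helper):

```python
# Made-up rows: (SPC_COMMON, BORONAME) and (SPC_COMMON, AVG_HEIGHT).
trees = [("crab apple", "Queens"), ("cedar elm", "Brooklyn"), ("lacebark elm", "Manhattan")]
tree_info = [("crab apple", 20), ("cedar elm", 60), ("lacebark elm", 45)]

def join_on_species(r, s):
    """Equi-join on SPC_COMMON, keeping a single copy of the species column."""
    return [(spc, boro, height) for spc, boro in r for spc2, height in s if spc == spc2]

# Plan 1: join everything, then filter on height, then project.
plan1 = {boro for _, boro, height in join_on_species(trees, tree_info) if height < 45}

# Plan 2: filter TreeInfo first, so the join sees fewer rows.
short_species = [(spc, height) for spc, height in tree_info if height < 45]
plan2 = {boro for _, boro, _ in join_on_species(trees, short_species)}

print(plan1, plan2)  # → {'Queens'} {'Queens'}
```

Note that a height of exactly 45 does not satisfy the condition, so Manhattan is excluded in both plans.
</script>
</section>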
<section>
<h3>Division ($/$)</h3>
<p>Not typically supported as a primitive operator,<br/>but useful for expressing queries like:</p>
<p style="font-size: 70%; font-weight: bold">Find boroughs that contain every species</p>
<div style="font-size: 70%">
$$\pi_{BORONAME,\ SPC\_COMMON}(\textbf{Trees}) \;\;/\;\;\pi_{SPC\_COMMON}(\textbf{Trees})$$
(using set relational algebra)
</div>
<p>
$$R / S \equiv \{\; \left<\vec t\right> \;|\; \forall \left<\vec s\right> \in S, \left< \vec t \vec s \right> \in R \;\}$$
</p>
</section>
<section>
<table style="font-size: small; margin-top: 30px; display: inline-block; vertical-align: middle;">
<tr><th>BORO</th> <th>SPC_COMMON</th></tr>
<tr><td>Brooklyn</td> <td>honeylocust</td></tr>
<tr><td>Brooklyn</td> <td>American linden</td></tr>
<tr><td>Brooklyn</td> <td>London planetree</td></tr>
<tr><td>Manhattan</td> <td>honeylocust</td></tr>
<tr><td>Manhattan</td> <td>American linden</td></tr>
<tr><td>Manhattan</td> <td>pin oak</td></tr>
<tr><td>Queens</td> <td>honeylocust</td></tr>
<tr><td>Queens</td> <td>American linden</td></tr>
<tr><td>Bronx</td> <td>honeylocust</td></tr>
</table>
<table style="font-size: 40%; margin-left: 30px; display: inline-block; vertical-align: middle;">
<tr class="fragment"><td style="text-align: left; padding-bottom: 30px;">/ { honeylocust }</td> <td style="text-align: left;">= Brooklyn, Manhattan, Queens, Bronx</td></tr>
<tr class="fragment"><td style="text-align: left; padding-bottom: 30px;">/ { honeylocust, American linden }</td> <td style="text-align: left;">= Brooklyn, Manhattan, Queens</td></tr>
<tr class="fragment"><td style="text-align: left; padding-bottom: 30px;">/ { honeylocust, American linden, pin oak }</td><td style="text-align: left;">= Manhattan</td></tr>
</table>
</section>
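<section data-markdown>
<script type="text/template">
The three results on the previous slide can be checked with a direct Python transcription of the definition of division. This sketch evaluates the "for all" condition as written rather than rewriting it with other operators; `divide` is a hypothetical helper name.

```python
# R(BORO, SPC_COMMON), copied from the table on the previous slide.
r = {
    ("Brooklyn", "honeylocust"), ("Brooklyn", "American linden"),
    ("Brooklyn", "London planetree"),
    ("Manhattan", "honeylocust"), ("Manhattan", "American linden"),
    ("Manhattan", "pin oak"),
    ("Queens", "honeylocust"), ("Queens", "American linden"),
    ("Bronx", "honeylocust"),
}

def divide(r, s):
    """R / S: boroughs paired in R with every species in the set S."""
    candidates = {boro for boro, _ in r}
    return {boro for boro in candidates if all((boro, spc) in r for spc in s)}

print(sorted(divide(r, {"honeylocust"})))
# → ['Bronx', 'Brooklyn', 'Manhattan', 'Queens']
print(sorted(divide(r, {"honeylocust", "American linden"})))
# → ['Brooklyn', 'Manhattan', 'Queens']
print(sorted(divide(r, {"honeylocust", "American linden", "pin oak"})))
# → ['Manhattan']
```
</script>
</section>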
<section>
<h3>Group Work</h3>
<p>If time permits: Implement division using other operators.</p>
</section>
</section>
<section>
<section>
<h3>Relational Algebra</h3>
<p>
A simple way to think about and work with<br/>
computations over collections.
</p>
<p>… simple → easy to evaluate</p>
<p>… simple → easy to optimize</p>
<p style="margin-top: 100px">
Next time: Optimizing RA
</p>
</section>
</div></div>
<script src="../reveal.js-3.6.0/lib/js/head.min.js"></script>
<script src="../reveal.js-3.6.0/js/reveal.js"></script>
<script>