foo

2021-02-18 13:28:10 -05:00 · 2021-02-18 13:28:10 -05:00 · be815d9adb
parent 41f986b1b4
commit be815d9adb
2 changed files with 41 additions and 13 deletions
--- a/src/teaching/cse-562/2021sp/slide/2021-02-18-QueryAlgorithms.erb
+++ b/src/teaching/cse-562/2021sp/slide/2021-02-18-QueryAlgorithms.erb
@@ -19,6 +19,15 @@ textbook: "Ch. 15.1-15.5, 16.7"
  Might help to tighten up the time spent a little too.  I had to cut out before introducing Sort-Merge Joins
 -->

+<section>
+  <h3>News</h3>
+
+  <ul>
+    <li>Homework 1 assigned last night, due Weds night.</li>
+    <li>Checkpoint 1 posted Sunday. Submissions open tonight.</li>
+  </ul>
+</section>
+

 <section>
  <section>
@@ -40,9 +49,9 @@ textbook: "Ch. 15.1-15.5, 16.7"
    <h3>Analyzing Volcano Operators</h3>

    <ul>
+      <li class="fragment highlight-grey" data-fragment-index="1">CPU Used</li>
      <li>Memory Bounds</li>
      <li>Disk IO Used</li>
-      <li class="fragment highlight-grey" data-fragment-index="1">CPU Used</li>
    </ul>

    <p class="fragment" data-fragment-index="1" style="margin-top: 30px;"><u>Data</u>bases are usually IO- or Memory-bound</p>
@@ -81,7 +90,7 @@ textbook: "Ch. 15.1-15.5, 16.7"
 <section>
  <h3>Note</h3>

-  <p>We'll be discussing the "default" algorithm for each operator.</p> 
+  <p>So far, we've been pretending that each operator has one algorithm.</p> 

  <p class="fragment">Often, there are many algorithms, some of which cover multiple operators.</p>

@@ -218,8 +227,8 @@ textbook: "Ch. 15.1-15.5, 16.7"

    <p>How many IOs do we need to compute $Q := R \times S$?</p>
    <ol>
-      <li class="fragment">Getting an Iterator on $R$: 100 tuples</li>
-      <li class="fragment">Getting an Iterator on $S$: 20 tuples</li>
+      <li class="fragment">Getting an Iterator on $R$: 20 tuples</li>
+      <li class="fragment">Getting an Iterator on $S$: 100 tuples</li>
      <li class="fragment">Getting an Iterator on $R \times S$ using the above iterators: </li>
    </ol>
  </section>
@@ -231,7 +240,7 @@ textbook: "Ch. 15.1-15.5, 16.7"
      <li class="fragment"><b>Cache</b>: $|R| \times |S| = 20 \times 100 = 2000$ extra tuples</li>
    </ul>

-    <p class="fragment"><b>Best Total Cost</b> $100 + 20 + 1900 = 2020$</p>
+    <p class="fragment"><b>Best Total Cost</b> $100 + 20 + 1900 = 2010$</p>
  </section>

  <section>
@@ -241,7 +250,7 @@ textbook: "Ch. 15.1-15.5, 16.7"

    <p>How many IOs do we need to compute $Q := R \times \sigma_c(R \times S)$</p>
    <ol>
-      <li class="fragment">Getting an Iterator on $\sigma_c(R \times S)$: 2020 tuples</li>
+      <li class="fragment">Getting an Iterator on $\sigma_c(R \times S)$: 2010 tuples</li>
      <li class="fragment">Getting an Iterator on $R$: 20 tuples</li>
      <li class="fragment">Getting an Iterator on $R \times \sigma_c(R \times S)$ using the above iterators: </li>
    </ol>
@@ -250,15 +259,15 @@ textbook: "Ch. 15.1-15.5, 16.7"
  <section>
    <ul>
      <li><b>Memory</b>: 0 extra tuples</li>
-      <li class="fragment"><b>Replay</b>: $(|R|-1) \times \texttt{cost}(\sigma_c(R \times S)) = 19 \times 2020 = 38380$ extra tuples</li>
-      <li class="fragment"><b>Cache</b>: $|R| \times |S| = 20 \times 200 = 4000$ extra tuples</li>
+      <li class="fragment"><b>Replay</b>: $(|R|-1) \times \texttt{cost}(\sigma_c(R \times S)) = 19 \times 2010 = 38190$ extra tuples</li>
+      <li class="fragment"><b>Cache</b>: $|R| \times (0.1 \times (|R| \times |S|)) = 20 \times 200 = 4000$ extra tuples</li>
    </ul>

-    <p class="fragment"><b>Best Total Cost</b> $2020 + 20 + 4000 = 6040$</p>
+    <p class="fragment"><b>Best Total Cost</b> $2010 + 20 + 4000 = 6030$</p>
  </section>

  <section>
-    <p>Is there a middle ground?</p>
+    <p>Can we do better with cartesian product<br/>(and joins)?</p>
  </section>
 </section>

@@ -312,14 +321,15 @@ textbook: "Ch. 15.1-15.5, 16.7"
      <dd class="fragment">$|S|$ tuples written.</dd>
      <dd class="fragment">$(\frac{|R|}{\mathcal B} - 1) \cdot |S|$ tuples read.</dd>
    </dl>
-    <p style="font-size: 70%;" class="fragment">In-memory caching is a special case of block-nested loop with $\mathcal B = |S|$</p>
-    <p style="font-size: 70%;" class="fragment">Does the block size for $R$ matter?</p>
+    <p style="font-size: 70%;" class="fragment">In-memory caching is a special case of block-nested loop with $\mathcal B = |R|$</p>
+    <p style="font-size: 70%;" class="fragment">Does the block size for $S$ matter?</p>
  </section>

  <section>
    <p>How big should the blocks be?</p>

-    <aside class="notes">As big as possible!  Leads to the question of distributing available memory between multiple joins: A simple linear optimization problem.</aside>
+    <p class="fragment">As big as possible!</p>
+    <p class="fragment">... but more on that later.</p>
  </section>
 </section>

@@ -479,6 +489,21 @@ textbook: "Ch. 15.1-15.5, 16.7"
      <dd>No added IO! (not counting sort).</dd>
    </dl>          
  </section>
+
+  <section>
+    <h3>Recap: Joins</h3>
+
+    <dl>
+      <dt>Block-Nested Join</dt>
+      <dd>Moderate Memory, Moderate IO, High CPU</dd>
+      <dt>In-Memory Index Join (e.g., 1-Pass Hash)</dt>
+      <dd>High Memory, Low IO</dd>
+      <dt>Partition Join (e.g., 2-Pass Hash)</dt>
+      <dd>High IO, Low Memory</dd>
+      <dt>Sort/Merge Join</dt>
+      <dd>Low IO, Low Memory (But need sorted data)</dd>
+    </dl>
+  </section>
 </section>

 <section>
--- a/templates/cse4562_2021_slides.erb
+++ b/templates/cse4562_2021_slides.erb
@@ -116,6 +116,8 @@ class_name = "CSE-4/562 Spring 2021"
          { src: '../../../../slides/reveal.js-3.7.0/plugin/markdown/marked.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
          { src: '../../../../slides/reveal.js-3.7.0/plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
          { src: '../../../../slides/reveal.js-3.7.0/plugin/highlight/highlight.js', async: true, condition: function() { return !!document.querySelector( 'pre code' ); }, callback: function() { hljs.initHighlightingOnLoad(); } },
+          { src: '../../../../slides/reveal.js-3.7.0/plugin/zoom-js/zoom.js', async: true },
+          { src: '../../../../slides/reveal.js-3.7.0/plugin/notes/notes.js', async: true },
          // Chart.min.js
          { src: '../../../../slides/reveal.js-3.7.0/plugin/chart/Chart.min.js'},
          // the plugin
@@ -127,5 +129,6 @@ class_name = "CSE-4/562 Spring 2021"

    </script>

+
  </body>
 </html>
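
The cost formulas on the updated slides can be checked with a short script. This is only a sketch under stated assumptions: the function names `replay_extra` and `bnl_extra_reads` are hypothetical and not part of the course code, the number of blocks is rounded up (the slide writes $\frac{|R|}{\mathcal B}$ without saying how a partial block is counted), and the 2010-tuple cost of $\sigma_c(R \times S)$ is taken from the slide as given.

```python
import math


def replay_extra(outer_tuples: int, inner_cost: int) -> int:
    """Extra tuples read when the inner input is replayed once
    for every outer tuple after the first: (|R| - 1) * cost(inner)."""
    return (outer_tuples - 1) * inner_cost


def bnl_extra_reads(outer_tuples: int, inner_tuples: int, block_size: int) -> int:
    """Tuples re-read by block-nested loop: (|R| / B - 1) * |S|.
    Assumption: a partial final block still costs one full pass over S."""
    blocks = math.ceil(outer_tuples / block_size)
    return (blocks - 1) * inner_tuples


R, S = 20, 100  # |R| and |S| as stated on the updated slide

print(replay_extra(R, S))         # replay for R x S: 19 * 100
print(replay_extra(R, 2010))      # replay for R x sigma_c(R x S): 19 * 2010
print(bnl_extra_reads(R, S, R))   # B = |R|: one block, no re-reads
print(bnl_extra_reads(R, S, 5))   # B = 5: four blocks, three extra passes
```

With $\mathcal B = |R|$ the outer input fits in a single block and the re-read term drops to zero, which is the sense in which in-memory caching is a special case of block-nested loop.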