From 09f414d6fd39456d33975474cc8f835f24c8316a Mon Sep 17 00:00:00 2001
From: Oliver
Date: Tue, 2 Mar 2021 12:35:43 -0500
Subject: [PATCH] slides

---
 src/teaching/cse-562/2021sp/index.erb         |  2 +
 .../2021sp/slide/2021-03-02-Indexing1.erb     | 55 +++++++++++++++++--
 2 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/src/teaching/cse-562/2021sp/index.erb b/src/teaching/cse-562/2021sp/index.erb
index 9f318747..bd385a48 100644
--- a/src/teaching/cse-562/2021sp/index.erb
+++ b/src/teaching/cse-562/2021sp/index.erb
@@ -44,6 +44,8 @@ schedule:
       slides: slide/2021-02-25-PhysicalLayout.html
   - date: "Mar. 2"
     topic: "Indexes: Tree-Based, Hash"
+    materials:
+      slides: slide/2021-03-02-Indexing1.html
   - date: "Mar. 4"
     topic: "Indexes: View-Based, Modern"
   - date: "Mar. 9"
diff --git a/src/teaching/cse-562/2021sp/slide/2021-03-02-Indexing1.erb b/src/teaching/cse-562/2021sp/slide/2021-03-02-Indexing1.erb
index f2d3c673..ba3f73f4 100644
--- a/src/teaching/cse-562/2021sp/slide/2021-03-02-Indexing1.erb
+++ b/src/teaching/cse-562/2021sp/slide/2021-03-02-Indexing1.erb
@@ -119,11 +119,11 @@ textbook: "Ch. 8.3-8.4, 14.1-14.2, 14.4"
- +
- +
@@ -205,11 +205,11 @@ textbook: "Ch. 8.3-8.4, 14.1-14.2, 14.4"
Always works... but slow
-
$\pi_A\left(\sigma_{\wedge B = 1}( IndexScan(R,\;C < 3) ) \right)$
+
$\pi_A\left(\sigma_{B = 1}( IndexScan(R,\;C < 3) ) \right)$
Requires a non-hash index on $C$
-
$\pi_A\left(\sigma_{\wedge C < 3}( IndexScan(R,\;B=1) ) \right)$
+
$\pi_A\left(\sigma_{C < 3}( IndexScan(R,\;B=1) ) \right)$
Requires any index on $B$
@@ -249,7 +249,6 @@ textbook: "Ch. 8.3-8.4, 14.1-14.2, 14.4"
  • For every $c_i \equiv (A = a)$: Do you have any index on $A$?
  • For every $c_i \in \{\; (A \geq a), (A > a), (A \leq a), (A < a)\;\}$: Do you have a tree index on $A$?
  • For every $c_i, c_j$, do you have an appropriate index?
  • -
  • etc...
  • A simple table scan is also an option
  • Which one do we pick?
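
The checklist above can be sketched as a small enumeration of candidate access paths. This is only an illustration, not code from the course: the `Predicate` class, the `candidate_access_paths` function, and the `"tree"`/`"hash"` labels are hypothetical names chosen for the example. It assumes a conjunction of single-column predicates and at most one index per column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    column: str
    op: str      # one of "=", "<", "<=", ">", ">="
    value: int


RANGE_OPS = {"<", "<=", ">", ">="}


def candidate_access_paths(predicates, indexes):
    """List the scans that could start evaluating a conjunction of predicates.

    `indexes` maps a column name to the kind of index on it ("tree" or "hash").
    """
    # A simple table scan always works, so it is always a candidate.
    paths = [("TableScan", None)]
    for pred in predicates:
        kind = indexes.get(pred.column)
        if kind is None:
            continue  # no index on this column
        if pred.op == "=":
            # Any index (tree or hash) supports an equality lookup.
            paths.append(("IndexScan", pred))
        elif pred.op in RANGE_OPS and kind == "tree":
            # Range predicates need an ordered (non-hash) index.
            paths.append(("IndexScan", pred))
    return paths


# Hypothetical example: sigma_{B = 1 AND C < 3}(R)
preds = [Predicate("B", "=", 1), Predicate("C", "<", 3)]
paths = candidate_access_paths(preds, {"B": "hash", "C": "tree"})
```

Whichever path is chosen, the remaining predicates still have to be applied as a selection on top of the scan. Choosing among the candidates needs a cost estimate, which this sketch does not attempt.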

    @@ -307,6 +306,50 @@ textbook: "Ch. 8.3-8.4, 14.1-14.2, 14.4"
    +
    + +
    +

    What if we need multiple sort orders?

    +
    + +
    +

    Data Organization

    + +
    + +
    +

    Data Organization

    + +
    + +
    +

    Data Organization

    + +
    +
    +
    Unordered Heap
    +
    $O(N)$ reads.
    +
    + +
    +
    Sorted List
    +
    $O(\log_2 N)$ random reads for some queries.
    +
    + +
    +
    Clustered (Primary) Index
    +
    $O(\ll N)$ sequential reads for some queries.
    +
    + +
    +
    (Secondary) Index
    +
    $O(\ll N)$ random reads for some queries.
    +
    + +
    +
    +
    +
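
To give the first two read counts above some scale, here is a rough comparison. It is a sketch under stated assumptions: $N$ is taken to be the number of pages (the slides do not say whether it counts pages or records), and the function names are hypothetical.

```python
import math


def heap_reads(n_pages):
    # Unordered heap: every page must be read to find all matching records.
    return n_pages


def sorted_list_reads(n_pages):
    # Sorted list: binary search touches about log2(N) pages,
    # and each of those reads is a random read.
    return max(1, math.ceil(math.log2(n_pages)))


for n in (1_000, 1_000_000):
    print(n, heap_reads(n), sorted_list_reads(n))
```

The binary-search count only applies to queries on the sort key. Any other predicate falls back to the full scan, which is what "for some queries" refers to.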
    @@ -322,7 +365,7 @@ textbook: "Ch. 8.3-8.4, 14.1-14.2, 14.4"
    Different $k$s are unlikely to have the same hash value.

    -

    Modulus $h(k)\mod N$ gives you a random number in $[0, N)$

    +

    $h(k)\mod N$ gives you a random number in $[0, N)$
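
A minimal sketch of bucket assignment with $h(k) \mod N$. SHA-256 is used here only as a stand-in hash function that gives the same result on every run. The slides do not name a specific $h$, and the `h` and `bucket` functions and the `key-…` strings are hypothetical.

```python
import hashlib


def h(key: str) -> int:
    # Stand-in hash function: the first 8 bytes of SHA-256, read as an integer.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bucket(key: str, n_buckets: int) -> int:
    # h(k) mod N always lands in the range [0, N).
    return h(key) % n_buckets


N = 8
counts = [0] * N
for i in range(10_000):
    counts[bucket(f"key-{i}", N)] += 1
```

With a well-mixed hash function, `counts` should come out roughly even across the `N` buckets.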