From e0c0cb412b85b385a8d8beabe5af99ccf8c340ec Mon Sep 17 00:00:00 2001
From: Oliver
Date: Thu, 11 Mar 2021 18:24:19 -0500
Subject: [PATCH] Slides

---
 .rake_tasks~                             |   0
 db/cv/okennedy/grants.json               |  20 ++-
 .../2021sp/slide/2021-03-09-CostOpt1.erb |   4 +-
 .../2021sp/slide/2021-03-11-CostOpt2.erb | 138 ++++++++----------
 4 files changed, 82 insertions(+), 80 deletions(-)
 create mode 100644 .rake_tasks~

diff --git a/.rake_tasks~ b/.rake_tasks~
new file mode 100644
index 00000000..e69de29b
diff --git a/db/cv/okennedy/grants.json b/db/cv/okennedy/grants.json
index 025f9fcc..647beeb2 100644
--- a/db/cv/okennedy/grants.json
+++ b/db/cv/okennedy/grants.json
@@ -1,4 +1,22 @@
 [
+  { "title" : "SCC-PG: A Sustainable and Connected Community-Scale Food System to Empower Consumers, Farmers, and Retailers",
+    "agency" : "NSF: SCC",
+    "role" : "Co-PI",
+    "amount" : 150000,
+    "effort" : "20%",
+    "status" : "submitted",
+    "start" : "08/15/2021", "end" : "08/14/2022",
+    "type" : "grant",
+    "commitment" : { },
+    "projects" : ["vizier"],
+    "copis" : [
+      "Samina Raja",
+      "Sara Behdad",
+      "Debabrata Talukdar",
+      "Srirangaraj Setlur",
+      "Emmanuel Frimpong Boamah"
+    ]
+  },
   { "title" : "HDR Institute: Institute for data enabled functional soft material innovation",
     "agency" : "NSF: HDR",
     "role" : "Co-PI",
@@ -30,7 +48,7 @@
     "role" : "Co-PI",
     "amount" : 50000,
     "effort" : "20%",
-    "status" : "submitted",
+    "status" : "rejected",
     "start" : "11/01/2020", "end" : "02/28/2021",
     "type" : "grant",
     "commitment" : { },
diff --git a/src/teaching/cse-562/2021sp/slide/2021-03-09-CostOpt1.erb b/src/teaching/cse-562/2021sp/slide/2021-03-09-CostOpt1.erb
index a47dbf22..60c84532 100644
--- a/src/teaching/cse-562/2021sp/slide/2021-03-09-CostOpt1.erb
+++ b/src/teaching/cse-562/2021sp/slide/2021-03-09-CostOpt1.erb
@@ -125,7 +125,7 @@ textbook: Ch. 16
 Sort (In-Mem) $\tau(R)$
-$0$
+$\textbf{io}(R)$
 $O(|R|)$
@@ -267,8 +267,6 @@ textbook: Ch. 16

Cardinality Estimation

Unlike estimating IOs, cardinality estimation doesn't care about the algorithm, so we'll just be working with raw RA.

- -

Also unlike estimating IOs, we care about the cardinality of $|Q(R)|$ as a whole, rather than the contribution of each individual operator.

diff --git a/src/teaching/cse-562/2021sp/slide/2021-03-11-CostOpt2.erb b/src/teaching/cse-562/2021sp/slide/2021-03-11-CostOpt2.erb
index fd59dc4d..73d2233c 100644
--- a/src/teaching/cse-562/2021sp/slide/2021-03-11-CostOpt2.erb
+++ b/src/teaching/cse-562/2021sp/slide/2021-03-11-CostOpt2.erb
@@ -15,144 +15,130 @@ textbook: Ch. 16
-
-

Accounting

-

Figure out the cost of each individual operator.

-

Only count the number of IOs added by each operator.

-
-
Operation RA Total IOs (#pages) Memory (#tuples)
Table Scan $R$ $\frac{|R|}{\mathcal P}$ $O(1)$ $\frac{|R|}{\mathcal P}$ $O(1)$
Projection $\pi(R)$ $\textbf{io}(R)$ $O(1)$ $\textbf{io}(R)$ $O(1)$
Selection $\sigma(R)$ $\textbf{io}(R)$ $O(1)$
Union $R \uplus S$ $\textbf{io}(R) + \textbf{io}(S)$ $O(1)$
Sort (In-Mem)
Sort (In-Mem) $\tau(R)$ $\textbf{io}(R)$ $O(|R|)$ $\textbf{io}(R)$ $O(|R|)$
Sort (On-Disk) $\tau(R)$ $\frac{2 \cdot \lfloor \log_{\mathcal B}(|R|) \rfloor}{\mathcal P} + \textbf{io}(R)$ $O(\mathcal B)$ Sort (On-Disk) $\tau(R)$ $\frac{2 \cdot \lfloor \log_{\mathcal B}(|R|) \rfloor}{\mathcal P} + \textbf{io}(R)$ $O(\mathcal B)$
(B+Tree) Index Scan
(B+Tree) Index Scan $Index(R, c)$ $\log_{\mathcal I}(|R|) + \frac{|\sigma_c(R)|}{\mathcal P}$ $O(1)$ $\log_{\mathcal I}(|R|) + \frac{|\sigma_c(R)|}{\mathcal P}$ $O(1)$
(Hash) Index Scan $Index(R, c)$ $1$ $O(1)$ (Hash) Index Scan $Index(R, c)$ $1$ $O(1)$
-  1. Tuples per Page ($\mathcal P$) – Normally defined per-schema
-  2. Size of $R$ ($|R|$)
-  3. Pages of Buffer ($\mathcal B$)
-  4. Keys per Index Page ($\mathcal I$)
+  1. Tuples per Page ($\mathcal P$) – Normally defined per-schema
+  2. Size of $R$ ($|R|$)
+  3. Pages of Buffer ($\mathcal B$)
+  4. Keys per Index Page ($\mathcal I$)
- - + + - - + + - - - + + + - - - + + + - + - - + + - + - - + + - + - - + + - - + + - - + + - - - - + + + + - - + + - - + + - - - + + +
Operation RA Total IOs (#pages) Mem (#tuples)
Nested Loop Join (Buffer $S$ in mem)
Nested Loop Join (Buffer $S$ in mem) $R \times_{mem} S$ $\textbf{io}(R)+\textbf{io}(S)$ $O(|S|)$ $\textbf{io}(R)+\textbf{io}(S)$ $O(|S|)$
Block NLJ (Buffer $S$ on disk) $R \times_{disk} S$ $\frac{|R|}{\mathcal B} \cdot \frac{|S|}{\mathcal P} + \textbf{io}(R) + \textbf{io}(S)$ $O(1)$ $R \times_{disk} S$ $\frac{|R|}{\mathcal B} \cdot \frac{|S|}{\mathcal P} + \textbf{io}(R) + \textbf{io}(S)$ $O(1)$
Block NLJ (Recompute $S$) $R \times_{redo} S$ $\textbf{io}(R) + \frac{|R|}{\mathcal B} \cdot \textbf{io}(S)$ $O(1)$ $R \times_{redo} S$ $\textbf{io}(R) + \frac{|R|}{\mathcal B} \cdot \textbf{io}(S)$ $O(1)$
1-Pass Hash Join $R \bowtie_{1PH, c} S$ $\textbf{io}(R) + \textbf{io}(S)$ $O(|S|)$ $\textbf{io}(R) + \textbf{io}(S)$ $O(|S|)$
2-Pass Hash Join $R \bowtie_{2PH, c} S$ $\frac{2|R| + 2|S|}{\mathcal P} + \textbf{io}(R) + \textbf{io}(S)$ $O(1)$ $\frac{2|R| + 2|S|}{\mathcal P} + \textbf{io}(R) + \textbf{io}(S)$ $O(1)$
Sort-Merge Join $R \bowtie_{SM, c} S$ [Sort] [Sort] [Sort] [Sort]
(Tree) Index NLJ
(Tree) Index NLJ $R \bowtie_{INL, c} S$ $|R| \cdot (\log_{\mathcal I}(|S|) + \frac{|\sigma_c(S)|}{\mathcal P})$ $O(1)$ $|R| \cdot (\log_{\mathcal I}(|S|) + \frac{|\sigma_c(S)|}{\mathcal P})$ $O(1)$
(Hash) Index NLJ $R \bowtie_{INL, c} S$ $|R| \cdot 1$ $O(1)$ (Hash) Index NLJ $R \bowtie_{INL, c} S$ $|R| \cdot 1$ $O(1)$
(In-Mem) Aggregate
(In-Mem) Aggregate $\gamma_A(R)$ $0$ $adom(A)$ $\textbf{io}(R)$ $adom(A)$
(Sort/Merge) Aggregate $\gamma_A(R)$ [Sort] [Sort] $\gamma_A(R)$ [Sort] [Sort]
- -
-  1. Tuples per Page ($\mathcal P$) – Normally defined per-schema
-  2. Size of $R$ ($|R|$)
-  3. Pages of Buffer ($\mathcal B$)
-  4. Keys per Index Page ($\mathcal I$)
-  5. Number of distinct values of $A$ ($adom(A)$)
-
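
The per-operator formulas in the two tables compose bottom-up: each operator adds its own IOs on top of the $\textbf{io}(\cdot)$ of its inputs. A minimal sketch of that accounting in Python follows. The function and parameter names are hypothetical (none of them appear in the slides), and page counts are left fractional exactly as the formulas are written; a real system would likely round up to whole pages.

```python
import math

# Hypothetical helper names; each function mirrors one row of the slide tables.
# size_* is a tuple count (|R|), io_* is the IO cost already paid to produce
# that input, tuples_per_page is P, buffer_pages is B, keys_per_index_page is I.

def io_table_scan(size_r, tuples_per_page):
    """Table Scan: |R| / P."""
    return size_r / tuples_per_page

def io_btree_index_scan(size_r, matching, tuples_per_page, keys_per_index_page):
    """(B+Tree) Index Scan: log_I(|R|) + |sigma_c(R)| / P."""
    return math.log(size_r, keys_per_index_page) + matching / tuples_per_page

def io_nlj_in_memory(io_r, io_s):
    """Nested Loop Join, S buffered in memory: io(R) + io(S)."""
    return io_r + io_s

def io_block_nlj_on_disk(size_r, size_s, io_r, io_s, tuples_per_page, buffer_pages):
    """Block NLJ, S buffered on disk: |R|/B * |S|/P + io(R) + io(S)."""
    return (size_r / buffer_pages) * (size_s / tuples_per_page) + io_r + io_s

def io_block_nlj_recompute(size_r, io_r, io_s, buffer_pages):
    """Block NLJ, S recomputed per block of R: io(R) + |R|/B * io(S)."""
    return io_r + (size_r / buffer_pages) * io_s

def io_two_pass_hash_join(size_r, size_s, io_r, io_s, tuples_per_page):
    """2-Pass Hash Join: (2|R| + 2|S|) / P + io(R) + io(S)."""
    return (2 * size_r + 2 * size_s) / tuples_per_page + io_r + io_s

# Illustrative numbers only (assumed, not from the slides).
SIZE_R, SIZE_S = 1000, 500
P, B = 100, 10

io_r = io_table_scan(SIZE_R, P)  # cost of producing R by a full scan
io_s = io_table_scan(SIZE_S, P)  # cost of producing S by a full scan

costs = {
    "NLJ (S in mem)": io_nlj_in_memory(io_r, io_s),
    "Block NLJ (S on disk)": io_block_nlj_on_disk(SIZE_R, SIZE_S, io_r, io_s, P, B),
    "Block NLJ (recompute S)": io_block_nlj_recompute(SIZE_R, io_r, io_s, B),
    "2-Pass Hash Join": io_two_pass_hash_join(SIZE_R, SIZE_S, io_r, io_s, P),
}

for name, cost in costs.items():
    print(f"{name}: {cost} IOs")
```

With these assumed sizes, the in-memory nested loop join costs 15 IOs and the 2-pass hash join 45, while the two block variants cost 515 and 510. The cheaper plans are the ones that need $O(|S|)$ memory or extra partitioning passes, which is the trade-off the Memory column captures.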