diff --git a/src/teaching/cse-350/2024sp/index.erb b/src/teaching/cse-350/2024sp/index.erb
index 89eff4ff..0b21a26e 100644
--- a/src/teaching/cse-350/2024sp/index.erb
+++ b/src/teaching/cse-350/2024sp/index.erb
@@ -47,10 +47,20 @@ schedule:
       notes: slide/06-bplustrees.pdf
   - topic: Write-Optimized structures
     detail: We build up LSM trees and Beta-Epsilon Trees from first principles
+    docs:
+      - lsm trees: https://www.cs.umb.edu/~poneil/lsmtree.pdf
+      - bslm trees: https://dl.acm.org/doi/10.1145/2213836.2213862
+      - crimsondb: http://daslab.seas.harvard.edu/projects/crimsondb-demo/
+      - beta-epsilon: https://www.usenix.org/publications/login/oct15/bender
+      - notes: slide/07-write-optimized.pdf
   - topic: The Shuffle Operation
     detail: We explore on-disk hashing and related strategies for partitioning data into manageable chunks
+    docs:
+      - notes: slide/08-shuffling.pdf
   - topic: Bloom Filters
     detail: Using compact summary structures to avoid expensive on-disk operations
+    docs:
+      - notes: slide/07-write-optimized.pdf
   - topic: Dataframe Storage
     detail: We design, from first principles, a storage format for persistent, mutable dataframes (aka relational tables)
   - topic: TBD
diff --git a/src/teaching/cse-350/2024sp/slide/07-write-optimized.pdf b/src/teaching/cse-350/2024sp/slide/07-write-optimized.pdf
new file mode 100644
index 00000000..8dbfed43
Binary files /dev/null and b/src/teaching/cse-350/2024sp/slide/07-write-optimized.pdf differ
diff --git a/src/teaching/cse-350/2024sp/slide/08-shuffling.pdf b/src/teaching/cse-350/2024sp/slide/08-shuffling.pdf
new file mode 100644
index 00000000..5ae8b255
Binary files /dev/null and b/src/teaching/cse-350/2024sp/slide/08-shuffling.pdf differ