From 846ae67e7424175bdd9d4af27a0a29020f6ac3b9 Mon Sep 17 00:00:00 2001 From: Oliver Date: Mon, 14 Oct 2019 17:33:18 -0400 Subject: [PATCH] Paper links --- src/teaching/cse-662/2019fa/index.md | 8 +- .../2019fa/slide/2019-10-10-Legorithmics.erb | 463 ++++++++++++++++++ 2 files changed, 467 insertions(+), 4 deletions(-) create mode 100644 src/teaching/cse-662/2019fa/slide/2019-10-10-Legorithmics.erb diff --git a/src/teaching/cse-662/2019fa/index.md b/src/teaching/cse-662/2019fa/index.md index caff3ba6..a7acb613 100644 --- a/src/teaching/cse-662/2019fa/index.md +++ b/src/teaching/cse-662/2019fa/index.md @@ -97,11 +97,11 @@ After the taking the course, students should be able to: * **Oct 3** - SkyServer on MonetDB presented by SQLatin ([reading](https://ieeexplore-ieee-org.gate.lib.buffalo.edu/abstract/document/4274958/) | slides) * **Oct 8** - Software Transactional Memory ([reading](https://dl.acm.org/citation.cfm?id=1378582)) * **Oct 10** - Skyserver (continued) -* **Oct 15** - NoDB / RAW ([paper 1](http://dl-acm-org.gate.lib.buffalo.edu/citation.cfm?id=2213864) | [paper 2](http://www.vldb.org/pvldb/vol7/p1119-karpathiotakis.pdf)) +* **Oct 15** - NoDB / RAW ([paper 1](https://dl-acm-org.gate.lib.buffalo.edu/citation.cfm?id=2213864) | [paper 2](http://www.vldb.org/pvldb/vol7/p1119-karpathiotakis.pdf)) * **Oct 17** - Group Presentations -* **Oct 22** - Legorithmics ([paper](http://infoscience.epfl.ch/record/186017/files/main-final.pdf)) -* **Oct 24** - Streaming ([paper](http://www.cs.cornell.edu/johannes/papers/2007/2007-CIDR-Cayuga.pdf)) -* **Oct 29** - Scan Sharing ([paper](http://dl-acm-org.gate.lib.buffalo.edu/citation.cfm?id=1687707)) +* **Oct 22** - Legorithmics ([paper](https://infoscience.epfl.ch/record/186017/files/main-final.pdf)) +* **Oct 24** - Streaming ([paper](https://www.cs.cornell.edu/johannes/papers/2007/2007-CIDR-Cayuga.pdf)) +* **Oct 29** - Scan Sharing ([paper](https://dl-acm-org.gate.lib.buffalo.edu/citation.cfm?id=1687707)) * **Oct 31** - Group Presentations --- diff --git a/src/teaching/cse-662/2019fa/slide/2019-10-10-Legorithmics.erb b/src/teaching/cse-662/2019fa/slide/2019-10-10-Legorithmics.erb new file mode 100644 index 00000000..b4f8596f --- /dev/null +++ b/src/teaching/cse-662/2019fa/slide/2019-10-10-Legorithmics.erb @@ -0,0 +1,463 @@ +--- +template: templates/cse662_2019_slides.erb +title: Legorithmics +date: October 10 +--- + +
+
+

Legorithmics

+

October 10, 2019

+
+ +
+ + + + + + +

Same great algorithms,
awesome new hardware flavor

+
+ + +
+

Adapting Software to Hardware

+ +

Hardware adaptation uses standard transformations: +

    +
  • Batching: Prefetch blocks of data to avoid random seeks.
  • +
  • Partitioning: Group related blocks of data to minimize cross-partition work.
  • +
  • Reordering: Cluster data accesses for better cache locality.
  • +
+

+

The hard part is picking where to apply the transformations and selecting values for each transformation's parameters.

+
+ +
+

Adapting Software to Hardware

+ +

Key Insights: +

    +
  • The basic algorithms haven't changed since the 70s.
  • +
  • Hardware changes very slowly.
  • +
  • You have as much time as you need to design a hardware-specific algorithm.
  • +
+

+

Automate the search!

+
+ +
+

Adapting Software to Hardware

+ +

What do we need?

+
    +
  • A way to describe the algorithm.
  • +
  • A way to describe the hardware.
  • +
  • A cost model.
  • +
  • Transformation rules.
  • +
  • A (possibly exponential-time) optimizer.
  • +
+
+ +
+ +
+
+

Typesystems

+ +

A collection of rules that assign a property called a type to the parts of a computer program: variables, expressions, etc...
[Wikipedia]

+ +
+ +
+

Typesystems

+ +

A typesystem allows you to: +

    +
  • Define interfaces between different parts of a program.
  • +
  • Check that these parts have been connected consistently.
  • +
  • Define global properties in terms of local properties.
  • +
+

+ +
+ +
+

Typesystems

+ +
+

A type
+ ($\tau := D\;|\;[\tau]\;|<\tau,\tau>\;|\;\tau\rightarrow\tau$) +

+
+

A set of inference rules
+ ($\frac{e\;:\;\tau}{[e]\;:\;[\tau]}$) +

+
+

These example types are part of the monad algebra

+
+ +
+

Types

+ + + + + + +
| Type | Meaning |
|------|---------|
| $D$ | Primitive Type (int, float, etc...) |
| $[\tau]$ | An array of elements with type $\tau$ |
| $<\tau_1,\tau_2>$ | A pair of elements of types $\tau_1$ and $\tau_2$. |
| $\tau_1\rightarrow\tau_2$ | A function with one argument of type $\tau_1$ and one return value of type $\tau_2$. |
+
+ +
+

Type Examples

+ +
$[ < int,float > ]$
+
$[int] \rightarrow float$
+
$ < [int], [int] >\; \rightarrow [ < int,int > ]$
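To make the grammar concrete, here is a minimal Scala sketch of the type language as an algebraic data type; the constructor names (`Prim`, `Arr`, `Pair`, `Fun`) are illustrative, not from the paper.

```scala
// A sketch of the monad-algebra type grammar: tau := D | [tau] | <tau,tau> | tau -> tau
sealed trait Ty
case class Prim(name: String) extends Ty        // D: primitive types (int, float, bool, ...)
case class Arr(elem: Ty) extends Ty             // [tau]: an array of tau
case class Pair(left: Ty, right: Ty) extends Ty // <tau1, tau2>: a pair
case class Fun(arg: Ty, ret: Ty) extends Ty     // tau1 -> tau2: a function

// The last example above: < [int], [int] > -> [ < int, int > ]
val joinSignature: Ty =
  Fun(Pair(Arr(Prim("int")), Arr(Prim("int"))), Arr(Pair(Prim("int"), Prim("int"))))
```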
+
+ +
+

Inference Rules

+
+

Defined over a language of expressions like $f(a, b)$.

+ $$\frac{a : \tau_a\;\;\;b : \tau_b}{f(a,b):\tau_f(\tau_a, \tau_b)}$$ +

If expression $a$ has type $\tau_a$ and expression $b$ has type $\tau_b$...

+

...then expression $f(a, b)$ has type $\tau_f(\tau_a, \tau_b)$.

+
+ +
+

Inference Examples

+
+ $\frac{e: \tau}{[e] : [\tau]}$ +
+
+ $\frac{c: Bool\;\;\;e_1:\tau\;\;\;e_2:\tau}{(\textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2)\;:\; \tau}$ +
+
+ $\frac{e: < \tau_1, \tau_2 >\;\;\;i \in \{1,2\}}{e.i\; :\; \tau_i}$ +
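As a rough illustration of how these inference rules become executable checks, here is a small Scala sketch that reuses the `Ty` encoding from the Types slide; the `Expr` constructors and the error handling are hypothetical simplifications.

```scala
// A sketch of a checker for the three rules above (assumes the Ty/Prim/Arr/Pair sketch earlier).
sealed trait Expr
case class Lit(ty: Ty) extends Expr                              // a leaf whose type is already known
case class Singleton(e: Expr) extends Expr                       // [e]
case class IfThenElse(c: Expr, t: Expr, f: Expr) extends Expr    // if c then e1 else e2
case class Proj(e: Expr, i: Int) extends Expr                    // e.i  (i is 1 or 2)

def typeOf(e: Expr): Ty = e match {
  case Lit(t)       => t
  case Singleton(x) => Arr(typeOf(x))                            // e : tau          =>  [e] : [tau]
  case IfThenElse(c, t, f) =>                                    // c : Bool, e1 : tau, e2 : tau
    require(typeOf(c) == Prim("bool"), "condition must be Bool")
    require(typeOf(t) == typeOf(f), "branches must have the same type")
    typeOf(t)                                                    //                  =>  result : tau
  case Proj(x, i) => typeOf(x) match {                           // e : <tau1,tau2>  =>  e.i : tau_i
    case Pair(l, r) => if (i == 1) l else r
    case other      => sys.error(s"cannot project .$i from $other")
  }
}
```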
+
+
+ +
+
+

Monad Algebra

+ +

A primitive language for describing data processing.

+ + + + + + +
| Operator | Meaning |
|----------|---------|
| $\lambda x.e$ | Define a function with body $e$ that uses variable $x$. |
| $e_1\;e_2$ | Apply the function defined by $e_1$ to the value obtained from $e_2$. |
| $\textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2$ | If $c$ is true then evaluate $e_1$, and otherwise evaluate $e_2$. |
+ +
+
+

Monad Algebra

+ + + + + + + + +
| Operator | Meaning |
|----------|---------|
| $< e_1, e_2 >$ | Construct a tuple from $e_1$ and $e_2$. |
| $e.i$ | Extract attribute $i$ from the tuple $e$. |
| $[e]$ | Construct a single-element array from $e$. |
| $[]$ | Construct an empty array. |
| $e_1 \sqcup e_2$ | Concatenate the arrays $e_1$ and $e_2$. |
+ +
+ +
+ $$\textbf{flatMap}(f : \tau_1 \rightarrow [\tau_2])(e : [\tau_1])$$ +

Apply function $f$ to every element of array $e$. Concatenate all of the arrays returned by $f$.

+ $$\textbf{foldL}(c : \tau_2,\; f : < \tau_2, \tau_1 > \rightarrow \tau_2)(e : [\tau_1])$$ +

Apply function $f$ to every element of array $e$, starting from the initial value $c$ and passing each invocation's return value on to the next call (e.g., aggregation).

+ + $$\textbf{for}(xB : [\tau_1] [k] \leftarrow e_{in} : [\tau_1])(e_{loop} : [\tau_2])$$ +

Extract blocks of size $k$ from $e_{in}$. For each block compute a flatMap using expression $e_{loop}$.
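To ground the three bulk operators, here is a loose Scala analogue, under the assumption that arrays are modeled as `List`s; `flatMapOp`, `foldLOp`, and `blockedFor` are hypothetical names, not the paper's API.

```scala
// flatMap(f)(e): apply f to every element of e and concatenate the resulting arrays.
def flatMapOp[A, B](f: A => List[B])(e: List[A]): List[B] =
  e.flatMap(f)

// foldL(c, f)(e): thread an accumulator through e, starting from the initial value c.
def foldLOp[A, B](c: B, f: ((B, A)) => B)(e: List[A]): B =
  e.foldLeft(c)((acc, x) => f((acc, x)))

// for(xB [k] <- eIn) eLoop: split eIn into blocks of size k and flatMap the body over the blocks.
def blockedFor[A, B](k: Int)(eIn: List[A])(eLoop: List[A] => List[B]): List[B] =
  eIn.grouped(k).toList.flatMap(eLoop)
```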

+ +

+ +
+

Example - Average

+ +
+ $(\lambda tot.(tot.1 / tot.2))$
+      $(\textbf{foldL}(< 0, 0 >,\; (\lambda < a, x >.< a.1 + x,\; a.2 + 1 >)))$
+

Fold implements aggregation

+

Fold takes a 'previous' $a$ and a 'current' $x$

+

We need a sum, and a count

+

Initial sum and count are both 0

+

Postprocess with division ($\lambda$ creates a variable $tot$)
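A direct Scala rendering of this example may help; `foldLeft` plays the role of foldL here, and the tuple `(sum, count)` stands in for the pair $< a.1, a.2 >$ (names are illustrative).

```scala
// average = (lambda tot. tot.1 / tot.2) applied to foldL(<0,0>, lambda <a,x>. <a.1 + x, a.2 + 1>)
def average(e: List[Double]): Double = {
  val tot = e.foldLeft((0.0, 0.0)) {            // initial <sum, count> = <0, 0>
    case (a, x) => (a._1 + x, a._2 + 1)         // <a.1 + x, a.2 + 1>
  }
  tot._1 / tot._2                               // postprocess with division
}

// average(List(1.0, 2.0, 3.0)) == 2.0
```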

+ +
+
+ +
+
+

Cost Estimation

+

We need... +

    +
  • Cardinality estimation
  • +
  • A model of IO
  • +
  • IO Cost relative to cardinality
  • +

+
+ +
+

Cardinality Estimation

+ +

Basic Approach: Define a second type for tracking data sizes

+ + $$\alpha\;:=\;[\alpha]^x\;|\;< \alpha_1, \alpha_2 >|\;c$$ + +
+
+

e.g., $[ < 1, [1]^y > ]^x$ corresponds to: +

    +
  • an array with $x$ elements, where each element consists of...
  • +
  • a value of fixed size 1, and...
  • +
  • an array of $y$ fixed-size values.
  • +

+
+
+ +
+

Cardinality Estimation

+ +
    +
  • $size(c) = c$
  • +
  • $size([\alpha]^x) = x \cdot size(\alpha)$
  • +
  • $size( < \alpha_1, \alpha_2 >) = size(\alpha_1) + size(\alpha_2)$
  • +
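A small Scala sketch of the cardinality type and the $size$ function above; the constructor names are hypothetical.

```scala
// alpha := [alpha]^x | <alpha1, alpha2> | c
sealed trait Card
case class Const(c: Double) extends Card                 // c: a value of fixed size c
case class CArr(elem: Card, n: Double) extends Card      // [alpha]^x: an array of x elements
case class CPair(a1: Card, a2: Card) extends Card        // <alpha1, alpha2>

def size(a: Card): Double = a match {
  case Const(c)      => c                                 // size(c) = c
  case CArr(elem, n) => n * size(elem)                    // size([alpha]^x) = x * size(alpha)
  case CPair(a1, a2) => size(a1) + size(a2)               // size(<a1, a2>) = size(a1) + size(a2)
}

// The earlier example [ <1, [1]^y> ]^x with x = 100, y = 5: size = 100 * (1 + 5) = 600
val example: Card = CArr(CPair(Const(1), CArr(Const(1), 5)), 100)
// size(example) == 600.0
```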
+
+ +
+

Cardinality Estimation

+ +

$R(\Gamma, e)$ computes the cardinality type for $e$

+

$\Gamma : x \rightarrow \alpha$ is a context (scope) mapping each variable to its cardinality type

+
+ +
+

For Loops

+
+ $$R\left(\Gamma, \textbf{for}(x [k] \leftarrow e_1)\; e_2\right) := $$ +

+ $\frac{cardinality(R(\Gamma, e_1))}{k}\cdot$ + $R(\Gamma', e_2)$ +

+
+
$\Gamma' := \Gamma \cup \{x \mapsto [sizeofElement(R(\Gamma, e_1))]^k\}$
+
+

The cardinality is based on that of $e_2$

+

Repeated once for every time through the loop

+

And $e_2$ is evaluated in the context of a $k$-element array.

+
+ +
+

If Then Else

+
+ $$R\left(\Gamma, \textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2\right) := $$ +

+ $max(R\left(\Gamma, e_1\right), R\left(\Gamma, e_2\right))$ +

+

Pessimistic assumption of biggest possible size.
Avoids needing to estimate $p(c = true)$.
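Reduced to plain element counts, the for-loop and if-then-else rules above can be sketched as follows; this is a deliberate simplification, since the real $R$ returns a cardinality type rather than a single number.

```scala
// for(x [k] <- e1) e2: the body's cardinality, repeated once per block of e1.
def cardFor(cardE1: Double, k: Double, cardBodyPerBlock: Double): Double =
  (cardE1 / k) * cardBodyPerBlock

// if c then e1 else e2: pessimistically assume the larger branch.
def cardIf(cardThen: Double, cardElse: Double): Double =
  math.max(cardThen, cardElse)
```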

+
+
+ +
+
+

Example: Block-Nested-Loop Join

+
+
# Loop over blocks in outer rel.
+
$\textbf{for}( xB [k_1] \leftarrow R )$
+
# Loop over blocks in inner rel.
+
$\textbf{for}( yB [k_2] \leftarrow S )$
+
# Loop over elems in outer block.
+
$\textbf{for}( x \leftarrow xB )$
+
# Loop over elems in inner block.
+
$\textbf{for}( y \leftarrow yB )$
+
# Join test.
+
$\textbf{if}\;joinCond(x, y)$
+
# Add pair if success.
+
$\textbf{then}\;[< x, y >]$
+
# Add nothing if not.
+
$\textbf{else}\;[]$
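For comparison, here is the same block-nested-loop join written directly in Scala, where `grouped(k)` plays the role of the blocked $\textbf{for}$; this is a sketch with illustrative names, not the paper's code.

```scala
def bnlJoin[A, B](r: List[A], s: List[B], k1: Int, k2: Int)
                 (joinCond: (A, B) => Boolean): List[(A, B)] =
  r.grouped(k1).toList.flatMap { xB =>        // for(xB [k1] <- R): blocks of the outer relation
    s.grouped(k2).toList.flatMap { yB =>      // for(yB [k2] <- S): blocks of the inner relation
      xB.flatMap { x =>                       // for(x <- xB): elements of the outer block
        yB.flatMap { y =>                     // for(y <- yB): elements of the inner block
          if (joinCond(x, y)) List((x, y))    // then [<x, y>]
          else Nil                            // else []
        }
      }
    }
  }

// bnlJoin(List(1, 2, 3), List(2, 3, 4), k1 = 2, k2 = 2)(_ == _) == List((2, 2), (3, 3))
```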
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + +
| Expression | Context | Result Size |
|------------|---------|-------------|
| $\textbf{for}( xB [k_1] \leftarrow R )$ | $\Gamma_1 = R \mapsto [1]^x, S \mapsto [1]^y$ | $[ < 1, 1 > ]^{\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2}$ |
| $\textbf{for}( yB [k_2] \leftarrow S )$ | $\Gamma_2 = \Gamma_1 \cup xB \mapsto [1]^{k_1}$ | $[ < 1, 1 > ]^{\frac{y}{k_2} \cdot k_1 \cdot k_2}$ |
| $\textbf{for}( x \leftarrow xB )$ | $\Gamma_3 = \Gamma_2 \cup yB \mapsto [1]^{k_2}$ | $[ < 1, 1 > ]^{k_1 \cdot k_2}$ |
| $\textbf{for}( y \leftarrow yB )$ | $\Gamma_4 = \Gamma_3 \cup x \mapsto 1$ | $[ < 1, 1 > ]^{k_2}$ |
| $\textbf{if}\;joinCond(x, y)$ | $\Gamma_5 = \Gamma_4 \cup y \mapsto 1$ | $[ < 1, 1 > ]^1$ |
| $\textbf{then}\;[< x, y >]$ | $\Gamma_5$ | $[ < 1, 1 > ]^1$ |
| $\textbf{else}\;[]$ | $\Gamma_5$ | $0$ |
+
+
+ +
+
+

IO Model

+

IO Costs have 2 components: +

    +
  • $InitCom$: The cost of initializing a connection (e.g., seek time).
  • +
  • $UnitTr$: The cost of transferring one unit of data.
  • +

+

Costs are defined for every pair of memory hierarchy levels: +

    +
  • $UnitTr(HDD \rightarrow RAM)$ is the cost of reading one unit of data from HDD into RAM.
  • +
  • $InitCom(RAM \rightarrow HDD)$ is the cost of initiating a write from RAM onto an HDD (e.g., seek time).
  • +
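A minimal sketch of how the two components combine into a transfer cost, assuming cost = (number of accesses) × InitCom + (units moved) × UnitTr; the constants below are illustrative, not measured values.

```scala
// One memory-hierarchy link, e.g., HDD -> RAM or RAM -> HDD.
case class Link(initCom: Double, unitTr: Double)

// Cost of moving `units` of data using `accesses` separate transfers over the link.
def ioCost(link: Link, accesses: Double, units: Double): Double =
  accesses * link.initCom + units * link.unitTr

// Illustrative numbers only: reading x = 1000 units in x / k blocks of size k = 100.
val hddToRam = Link(initCom = 10.0, unitTr = 0.1)
// ioCost(hddToRam, accesses = 1000.0 / 100, units = 1000.0) == 200.0
```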

+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Expression | Result Size | HDD to RAM | RAM to HDD |
|------------|-------------|------------|------------|
| $\textbf{for}( xB [k_1] \leftarrow R )$ | $[ < 1, 1 > ]^{\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2}$ | $x+\frac{x}{k_1}y$ | $2xy$ |
| $\textbf{for}( yB [k_2] \leftarrow S )$ | $[ < 1, 1 > ]^{\frac{y}{k_2} \cdot k_1 \cdot k_2}$ | $y$ | $2k_1y$ |
| $\textbf{for}( x \leftarrow xB )$ | $[ < 1, 1 > ]^{k_1 \cdot k_2}$ | $0$ | $2k_1k_2$ |
| $\textbf{for}( y \leftarrow yB )$ | $[ < 1, 1 > ]^{k_2}$ | $0$ | $(1+1)k_2$ |
| $\textbf{if}\;joinCond(x, y)$ | $[ < 1, 1 > ]^1$ | $0$ | $(1+1)k_2$ |
| $\textbf{then}\;[< x, y >]$ | $[ < 1, 1 > ]^1$ | $0$ | $(1+1)k_2$ |
| $\textbf{else}\;[]$ | $0$ | $0$ | $0$ |
+

HDD: $R, S, Result$ RAM: $x, xB, y, yB$

+
+
+ +
+
+

Rewrite Rules

+

Batching

+ $$\textbf{for}(x [1] \leftarrow R)\; e \Rightarrow \textbf{for}(xB [k] \leftarrow R)\; \textbf{for}(x [1] \leftarrow xB)\; e$$ +

Reordering Iterators

+ $$\textbf{for}(x_1 [k_1] \leftarrow R_1)\;\textbf{for}(x_2 [k_2] \leftarrow R_2)\; e \Rightarrow$$ + $$\textbf{for}(x_2 [k_2] \leftarrow R_2)\;\textbf{for}(x_1 [k_1] \leftarrow R_1)\; e$$ +

Size-Dependent, Commutative Functions

+ $$f \Rightarrow (\lambda< x_1, x_2>.f(\textbf{if}\;|x_1|\leq |x_2|\;\textbf{then}< x_1, x_2 >\;\textbf{else}\;< x_2, x_1 >))$$ + $$f \Rightarrow (\lambda< x_1, x_2>.f(\textbf{if}\;|x_1|\leq |x_2|\;\textbf{then}< x_2, x_1 >\;\textbf{else}\;< x_1, x_2 >))$$ +
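To show what applying such a rule looks like mechanically, here is a sketch of the batching rewrite over a tiny hypothetical expression AST; the real system matches on the full monad algebra.

```scala
// A tiny AST: just variables and blocked for-loops.
sealed trait MExpr
case class MVar(name: String) extends MExpr
case class MFor(x: String, k: Int, in: MExpr, body: MExpr) extends MExpr  // for(x [k] <- in) body

// Batching: for(x [1] <- R) e  ==>  for(xB [k] <- R) for(x [1] <- xB) e
def batch(e: MExpr, k: Int): MExpr = e match {
  case MFor(x, 1, in, body) => MFor(x + "B", k, in, MFor(x, 1, MVar(x + "B"), body))
  case other                => other
}

// batch(MFor("x", 1, MVar("R"), MVar("x")), 128)
//   == MFor("xB", 128, MVar("R"), MFor("x", 1, MVar("xB"), MVar("x")))
```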
+ +
+

The Optimizer

+ Starting with $e$... +
    +
  1. For every possible rewrite of $e$:
  2. Use a linear optimizer to find the best $k$s.
  3. If the rewrite improved the cost, then recur (see the sketch below).
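A high-level sketch of that search loop, with the linear optimizer abstracted away as a `tuneKs` callback; all names here are hypothetical.

```scala
// Greedy improvement: try every rewrite, tune the k's for each candidate, keep the best,
// and recur as long as the cost keeps going down.
def optimize[E](e: E,
                rewrites: List[E => Option[E]],   // each rule either rewrites e or does not apply
                tuneKs: E => E,                   // stand-in for the linear optimizer over block sizes
                cost: E => Double): E = {
  val candidates = rewrites.flatMap(rule => rule(e)).map(tuneKs)
  if (candidates.isEmpty) e
  else {
    val best = candidates.minBy(cost)
    if (cost(best) < cost(e)) optimize(best, rewrites, tuneKs, cost)  // improved: recur
    else e                                                            // no improvement: stop
  }
}
```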
+
+ +
+

Legorithmics

+
    +
  • Define your algorithms once.
  • +
  • Define your hardware spec once.
  • +
  • Let an automatic tool fit the algorithm to the hardware!
  • +
+
+