<!doctype html>
<html lang="en">

<head>
  <meta charset="utf-8">

  <title>CSE 662 - Languages and Runtimes for Big Data - Fall 2015</title>

  <meta name="description" content="CSE662 - Fall 2015">
  <meta name="author" content="Oliver Kennedy & Lukasz Ziarek">

  <meta name="apple-mobile-web-app-capable" content="yes" />
  <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />

  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">

  <link rel="stylesheet" href="../reveal.js-3.1.0/css/reveal.css">
  <link rel="stylesheet" href="ubodin.css" id="theme">

  <!-- Code syntax highlighting -->
  <link rel="stylesheet" href="../reveal.js-3.1.0/lib/css/zenburn.css">

  <!-- Printing and PDF exports -->
  <script>
    var link = document.createElement( 'link' );
    link.rel = 'stylesheet';
    link.type = 'text/css';
    link.href = window.location.search.match( /print-pdf/gi ) ? '../reveal.js-3.1.0/css/print/pdf.css' : '../reveal.js-3.1.0/css/print/paper.css';
    document.getElementsByTagName( 'head' )[0].appendChild( link );
  </script>

  <!--[if lt IE 9]>
  <script src="../reveal.js-3.1.0/lib/js/html5shiv.js"></script>
  <![endif]-->
</head>

<body>

  <div class="reveal">
    <div class="header">
      <!-- Any Talk-Specific Header Content Goes Here -->
      University at Buffalo
    </div>
    <div class="footer">
      <!-- Any Talk-Specific Footer Content Goes Here -->
      <div style="float: left;">
        CSE 662
      </div>
      <div style="float: right;">
        Languages and Runtimes for Big Data
      </div>
    </div>


    <!-- Any section element inside of this container is displayed as a slide -->
    <div class="slides">

      <section>
        <section>
          <h2>$Bloom^L$</h2>
          <h4>October 5, 2015</h4>
        </section>

        <section>
          <h3>How is distributed consistency enforced?</h3>
          <ol>
            <li class="fragment"><b>Don't allow the program to get into an inconsistent state.</b></li>
            <li class="fragment"><b>Detect inconsistencies and fix them after the fact.</b></li>
            <li class="fragment"><b class="fragment grow">Eventually converge to a consistent state.</b></li>
          </ol>
        </section>

        <section>
          <h3>The CALM Principle</h3>
          <p>A <i>monotonic</i> program eventually converges naturally.</p>
        </section>
      </section>

      <section>
        <section>
          <h3>Monotonicity</h3>
          <p>"Once you learn a fact, it never becomes false" (although you might never learn all available facts)</p>

          <div class="fragment">
            <h4>Computation under monotonicity</h4>
            <ol>
              <li>What facts do you know right now?</li>
              <li class="fragment">What facts can you compute given what you know?</li>
              <li class="fragment">Broadcast/react to newly discovered facts</li>
            </ol>
          </div>
        </section>

        <section>
          <h3>What causes concurrency violations?</h3>
          <p class="fragment">A computation step needs a <b>complete input</b> before it can produce a <b>complete output</b></p>
          <p class="fragment">The output is incorrect if...</p>
          <ul>
            <li class="fragment">The input is incomplete, and...</li>
            <li class="fragment">An incomplete output is not correct.</li>
          </ul>
        </section>

        <section>
          <ul>
            <li class="fragment">Avoid incomplete inputs<ul>
              <li>Block until you know all inputs are ready (Point of Order)</li>
            </ul></li>
            <li class="fragment">Avoid computations where incomplete outputs are incorrect<ul>
              <li>Monotonic programs never produce incorrect outputs, just incomplete ones.</li>
            </ul></li>
          </ul>
        </section>

      </section>

      <section>

        <section>
          <h3>What isn't monotonic?</h3>
        </section>

        <section>
          <h3>Negation</h3>
          $$R = \{A, B, C\}; S = \{C, D\}$$

          <p class="fragment">Let's say that $T = R - S$</p>
          <p class="fragment">You know two facts about $T$: $A \in T$ and $B \in T$</p>

          <p class="fragment">If you ever learn that $A \in S$,<br/>the "fact" that $A \in T$ becomes false</p>
        </section>
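
        <section>
          <h3>Sketch: Negation in plain Ruby</h3>
          <p><small>A minimal sketch of the example above using Ruby's standard <tt>Set</tt> class rather than Bud. The helper name <tt>difference</tt> is hypothetical and only used for illustration.</small></p>
          <pre><code class="hljs ruby">require 'set'

# T = R - S, recomputed from whatever is currently known about R and S.
def difference(r, s)
  r - s
end

r = Set['A', 'B', 'C']
s = Set['C', 'D']
t_before = difference(r, s)       # A and B are in T

s_later = s | Set['A']            # later we learn that A is in S
t_after = difference(r, s_later)  # A has been retracted from T

puts t_before.include?('A')     # true
puts t_after.include?('A')      # false
puts t_before.subset?(t_after)  # false: T shrank as S grew</code></pre>
          <p><small>The input only grew, yet the output lost a fact, which is what makes set difference non-monotonic.</small></p>
        </section>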

        <section>
          <h3>Aggregation</h3>
          $$R = \{1, 2, 3\}$$

          <p class="fragment">Let's say that $T = \sum_{i \in R} i$</p>
          <p class="fragment">You know several facts about $T$ including: $T = 6$</p>

          <p class="fragment">If you ever learn that $4 \in R$,<br/>the "fact" that $T = 6$ becomes false</p>
        </section>
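
        <section>
          <h3>Sketch: Aggregation in plain Ruby</h3>
          <p><small>A small sketch of the sum example above in plain Ruby (no Bud). The variable names are illustrative assumptions.</small></p>
          <pre><code class="hljs ruby"># Facts learned so far about R, and one that arrives late.
known = [1, 2, 3]
late  = 4

sum_before = known.sum             # 6
sum_after  = (known + [late]).sum  # 10: "T = 6" is no longer true

# A weaker claim survives: with non-negative inputs the sum only grows,
# so "T >= 6" is still true after the late arrival.
puts sum_before == sum_after  # false
puts sum_after >= sum_before  # true</code></pre>
        </section>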
      </section>

      <section>

        <section>
          <h3>What <u>is</u> monotonic?</h3>

          <h4 class="fragment">Sets!</h4>
        </section>

        <section>
          <h3>Datalog</h3>

          <p class="fragment"><b>Atoms</b>: $Parent(A, B)$<br/><small>($A$ is a $Parent$ of $B$)</small></p>

          <p class="fragment"><b>Rules</b>: $Ancestor(A, B)$ :- $Parent(A, B)$<br/><small>(If $A$ is a $Parent$ of $B$, then $A$ is also an $Ancestor$ of $B$)</small></p>
        </section>

        <section>
          <p>$Ancestor(A, B)$ :- $Parent(A, B)$</p>
          <p>$Ancestor(A, C)$ :- $Parent(A, B), Ancestor(B, C)$</p>
          <p><small>($A$ is an $Ancestor$ of $C$ if $A$ is a $Parent$ of $B$ and $B$ is an $Ancestor$ of $C$)</small></p>
          <br/>
          <p class="fragment">$Ancestor$ computes the <i>transitive closure</i> of $Parent$</p>
          <br/>
          <p class="fragment">No fact (atom) that you can ever learn will invalidate a fact that you've already learned.</p>
        </section>
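
        <section>
          <h3>Sketch: Evaluating the Ancestor rules</h3>
          <p><small>One way to read the two rules above is as a loop that keeps deriving facts until nothing new appears. This is a naive plain-Ruby sketch, not how a Datalog engine or Bud is implemented; the function name <tt>ancestors</tt> and the sample names are assumptions.</small></p>
          <pre><code class="hljs ruby">require 'set'

# Naive bottom-up evaluation: apply the rules until a pass derives
# nothing new (a fixpoint).
def ancestors(parent)
  ancestor = Set.new(parent)            # Ancestor(A, B) :- Parent(A, B)
  loop do
    derived = Set.new
    parent.each do |a, b|
      ancestor.each do |b2, c|
        derived << [a, c] if b == b2    # Parent(A, B), Ancestor(B, C)
      end
    end
    break if derived.subset?(ancestor)  # nothing new: done
    ancestor |= derived                 # facts are only ever added
  end
  ancestor
end

parent  = Set[%w[alice bob], %w[bob carol]]
closure = ancestors(parent)
puts closure.include?(%w[alice carol])  # true

# Learning a new Parent fact can only add Ancestor facts.
bigger = ancestors(parent | Set[%w[carol dave]])
puts closure.subset?(bigger)            # true</code></pre>
        </section>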

        <section>
          <h3>Datalog is Monotonic</h3>
          <h4 class="fragment">(unless you need negation or aggregation)</h4>
        </section>

      </section>

      <section>
        <section>
          <h3>Bloom</h3>
          <br/>
          <p class="fragment">Datalog with timesteps and asynchronous events</p>
        </section>

        <section>
          <table>
            <tr><th>Symbol</th><th>Meaning</th></tr>
            <tr><td><tt><=</tt></td><td>Add a new fact right now</td></tr>
            <tr><td><tt><+</tt></td><td>Add a new fact in the next timestep</td></tr>
            <tr><td><tt><-</tt></td><td>Remove a fact from the next timestep</td></tr>
            <tr><td><tt><~</tt></td><td>Send a fact to another node</td></tr>
          </table>
        </section>

        <section>
          <h3>Example: Shortest Paths (Dijkstra's)</h3>
          <pre><code class="hljs ruby">class ShortestPaths
  include Bud

  state {
    table :link, [:from, :to] => [:cost]
    scratch :path, [:from, :to, :next_hop, :cost]
    scratch :min_cost, [:from, :to] => [:cost]
  }

  bloom {
    path <= link {|l| [l.from, l.to, l.to, l.cost]}
    path <= (link*path).pairs(:to => :from) { |l,p|
      [l.from, p.to, l.to, l.cost + p.cost]
    }
    min_cost <= path.group([:from, :to], min(:cost))
  }
end</code></pre>
        </section>

        <section>
          <h3>Optimization</h3>
          <pre><code class="hljs ruby">
path <= (link*path).pairs(:to => :from) { |l,p|
  [l.from, p.to, l.to, l.cost + p.cost]
}
          </code></pre>
          <p class="fragment"><b>Compute only new facts</b>: This step can be performed incrementally as path entries are added</p>
        </section>

        <section>
          <h3>Example: Quorum Voting</h3>
          <pre><code class="hljs ruby">
class QuorumVote
  include Bud
  state {
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    table :votes, [:voter_id]
    scratch :cnt, [] => [:cnt]
  }

  bloom {
    votes <= vote_chn {|v| [v.voter_id]}
    cnt <= votes.group(nil, count(:voter_id))
    # c is a tuple of the cnt collection; compare its cnt column
    result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
  }
end</code></pre>
        </section>

        <section>
          <h3>Problem!</h3>
          <pre><code class="hljs ruby">
cnt <= votes.group(nil, count(:voter_id))
result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
          </code></pre>
          <p class="fragment"><b>cnt isn't monotonic</b>: It can't be computed until all votes are present and available!</p>

        </section>
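
        <section>
          <h3>Sketch: Why the exact count is the problem</h3>
          <p><small>A plain-Ruby sketch (not Bud) of votes arriving one at a time. <tt>QUORUM_SIZE</tt> and the voter ids are made-up values. Any exact count reported early can be contradicted by a later vote, while the threshold test never goes from true back to false.</small></p>
          <pre><code class="hljs ruby">require 'set'

QUORUM_SIZE = 3  # illustrative value, not from the slides

votes  = Set.new
counts = []
quorum = []
%w[v1 v2 v2 v3 v4].each do |voter_id|    # v2 is delivered twice
  votes << voter_id
  counts << votes.size                   # the exact count keeps changing
  quorum << (votes.size >= QUORUM_SIZE)  # this only flips false -> true
end

p counts  # [1, 2, 2, 3, 4]
p quorum  # [false, false, false, true, true]</code></pre>
        </section>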
      </section>

      <section>
        <section>
          <h3>(Positive) Set operations are monotonic</h3>
          <h3 class="fragment">... but not the only thing that's monotonic</h3>
          <ul class="fragment">
            <li>Growing Integers</li>
            <li>Facts that become true</li>
            <li>Many aggregates (MAX, MIN, COUNT)</li>
            <li>Vector Clocks</li>
          </ul>
          <p class="fragment">There's a term for these... <b>bounded join semilattices</b></p>
        </section>

        <section>
          <h3>Bounded Join Semilattice</h3>

          $$\langle S, \sqcup, \bot \rangle$$

          <p class="fragment">$S$: A set (Integers, Boolean values, Sets of facts)</p>
          <p class="fragment">$\sqcup : S \times S \rightarrow S$: A 'merge' operation for elements of $S$</p>
          <p class="fragment">$\bot \in S$: A 'starting' element of $S$</p>
        </section>

        <section>
          <h3>"Merge"</h3>
          <p><small>(Least upper bound)</small></p>
          <ul>
            <li><b>Associative</b>: $(a \sqcup b) \sqcup c = a \sqcup (b \sqcup c)$</li>
            <li><b>Commutative</b>: $a \sqcup b = b \sqcup a$</li>
            <li><b>Idempotent</b>: $a \sqcup a = a$</li>
          </ul>
          <p class="fragment">Defines a partial order: $a \leq b$ if $a \sqcup b = b$</p>
        </section>
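
        <section>
          <h3>Sketch: Checking the merge laws</h3>
          <p><small>The three laws and the induced order can be spot-checked on a few sample values. This plain-Ruby sketch uses integer max as the merge; the helper names <tt>merge</tt> and <tt>leq</tt> are assumptions, and a check over samples is not a proof.</small></p>
          <pre><code class="hljs ruby"># Merge for the max lattice over integers (a plain-Ruby stand-in).
def merge(a, b)
  [a, b].max
end

# The order induced by merge: a is below b exactly when merge(a, b) == b.
def leq(a, b)
  merge(a, b) == b
end

samples = [1, 6, 7, 10]
triples = samples.product(samples, samples)

associative = triples.all? { |a, b, c| merge(merge(a, b), c) == merge(a, merge(b, c)) }
commutative = triples.all? { |a, b, _| merge(a, b) == merge(b, a) }
idempotent  = samples.all? { |a| merge(a, a) == a }

puts associative, commutative, idempotent  # true, true, true
puts leq(6, 10)  # true
puts leq(10, 6)  # false</code></pre>
        </section>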

        <section>
          <h3>Examples</h3>

          <div class="fragment">$$\langle \mathbb R, MAX, -\infty \rangle$$</div>
          <div class="fragment">$$\langle \mathbb R, MIN, +\infty \rangle$$</div>
          <div class="fragment">$$\langle \mathbb B, \wedge, T \rangle$$</div>
          <div class="fragment">$$\langle \mathbb B, \vee, F \rangle$$</div>
          <div class="fragment">$$\langle sets\ of\ \mathbb R, \cup, \emptyset \rangle$$</div>
        </section>

        <section>
          <p><b>New notion of 'Fact'</b>: How 'far' in the lattice's set you are.</p>

          <table class="fragment">
            <tr><th>Event</th><th>Alice</th><th>Bob</th><th>Carol</th><th>Dave</th></tr>
            <tr><td>Initial State</td><td>6</td><td>1</td><td>7</td><td>10</td></tr>
            <tr class="fragment"><td>$Alice \sqcup Bob$</td><td>6</td><td>6</td><td>7</td><td>10</td></tr>
            <tr class="fragment"><td>$Carol \sqcup Dave$</td><td>6</td><td>6</td><td>10</td><td>10</td></tr>
            <tr class="fragment"><td>$Alice \sqcup Dave$</td><td>10</td><td>6</td><td>10</td><td>10</td></tr>
            <tr class="fragment"><td>$Bob \sqcup Carol$</td><td>10</td><td>10</td><td>10</td><td>10</td></tr>
          </table>

          <p class="fragment">The <tt>lmax</tt> lattice always goes up</p>
        </section>
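
        <section>
          <h3>Sketch: Replaying the table</h3>
          <p><small>The table above can be replayed with a few lines of plain Ruby, treating each exchange as both nodes adopting the max of their two values. The helper name <tt>gossip</tt> is hypothetical.</small></p>
          <pre><code class="hljs ruby">state = { alice: 6, bob: 1, carol: 7, dave: 10 }

# Two nodes exchange values and both keep the merge (max) of the pair.
def gossip(state, x, y)
  merged = [state[x], state[y]].max
  state.merge(x => merged, y => merged)
end

history = [state]
[[:alice, :bob], [:carol, :dave], [:alice, :dave], [:bob, :carol]].each do |x, y|
  history << gossip(history.last, x, y)
end

# One row per event, in the column order Alice, Bob, Carol, Dave.
history.each { |row| p row.values }</code></pre>
          <p><small>No node's value ever decreases between rows, and all four end at 10.</small></p>
        </section>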
      </section>

      <section>
        <section>
          <h3>Programs don't rely exclusively on one type!</h3>
          <p class="fragment">We need mappings between different lattice types</p>
        </section>

        <section>
          <h3>Monotone Functions</h3>
          $$f : S \rightarrow T$$
          <p>For any monotone $f$,<br/>whenever $a \leq_S b$ then $f(a) \leq_T f(b)$</p>

          <br/>
          <p class="fragment"><b>Monotone functions preserve partial orders</b></p>
        </section>

        <section>
          <h3>Example Monotone Functions</h3>

          <ul>
            <li>sizeof : $set \rightarrow \mathbb N$</li>
            <li>$\sum$ : $set\ of\ \mathbb R^+ \rightarrow \mathbb R^+$</li>
            <li>$\cap$ : $set \times set \rightarrow set$</li>
            <li>$>(\mathbb R)$ : $lmax \rightarrow \mathbb B$</li>
          </ul>
        </section>

        <section>
          <h3>If all computations in a program are monotone functions, the program is naturally eventually consistent</h3>
          <p class="fragment">... but we can do better</p>
        </section>

        <section>
          <h3>Morphism</h3>
          $$f : S \rightarrow T$$
          <p>
            $f$ is a morphism if<br/>
            $f$ is monotone and $f(a \sqcup b) = f(a) \sqcup f(b)$<br/>
            ($f$ commutes with $\sqcup$)
          </p>
          <br/>
          <p class="fragment"><b>Morphisms are decomposable</b></p>
        </section>

        <section>
          <h3>Example Morphisms</h3>

          <ul>
            <li>$\cap$ : $set \times set \rightarrow set$</li>
            <li>$>(\mathbb R)$ : $lmax \rightarrow \mathbb B$</li>
          </ul>
        </section>

        <section>
          <pre><code class="hljs ruby">
path <= link {|l| [l.from, l.to, l.to, l.cost]}
path <= (link*path).pairs(:to => :from) { |l,p|
  [l.from, p.to, l.to, l.cost + p.cost]
}
min_cost <= path.group([:from, :to]) { |group|
  group.project(:cost).min
}
          </code></pre>

          <table class="fragment"><tr><th>Morphisms <small style="vertical-align: middle">(using bags & lmin)</small></th></tr><tr><td>
            <code class="fragment">(link * path).pairs</code><br/>
            <code class="fragment">+</code><br/>
            <code class="fragment">project</code><br/>
            <code class="fragment">group</code><br/>
            <code class="fragment">min</code><br/>
          </td></tr></table>
        </section>

        <section>
          <h3>Incremental Computation</h3>
          <p>We need to update an input with new data and compute:
            $$f(old \sqcup new)$$
          </p>
          <p class="fragment">We (probably) already have $f(old)$.</p>
          <p class="fragment"><b>Insight:</b> Computing $f(old) \sqcup f(new)$ is probably cheaper.</p>
          <p class="fragment">... but is only correct if $f$ is a morphism</p>
        </section>
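
        <section>
          <h3>Sketch: Morphism vs. merely monotone</h3>
          <p><small>A plain-Ruby sketch of the shortcut above on the set lattice. The lambdas <tt>f</tt> (a membership test) and <tt>g</tt> (set size) are illustrative stand-ins, not Bud's API: the shortcut holds for <tt>f</tt> but gives the wrong answer for <tt>g</tt>, which is monotone without being a morphism.</small></p>
          <pre><code class="hljs ruby">require 'set'

old_facts = Set[1, 2, 3]
new_facts = Set[3, 4]
merged    = old_facts | new_facts  # least upper bound of old and new

# Membership test: set -> boolean, with "or" as the boolean merge.
f = ->(s) { s.include?(4) }
puts f.(merged) == (f.(old_facts) || f.(new_facts))  # true: the shortcut is valid

# Size: set -> max lattice, with max as the merge.
g = ->(s) { s.size }
puts g.(merged)                          # 4
puts [g.(old_facts), g.(new_facts)].max  # 3: merging partial sizes is wrong</code></pre>
        </section>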

        <section>
          <img src="graphics/BloomL-TransitiveClosure.png"/>
        </section>
      </section>

      <section>
        <section>
          <h3>Example: Set Lattice</h3>
          <pre><code class="hljs ruby">class Bud::SetLattice < Bud::Lattice
  wrapper_name :lset
  def initialize(x=[])
    @v = x.uniq # Remove duplicates from input
  end
  def merge(i)
    self.class.new(@v | i.reveal)
  end
  morph :intersect do |i|
    self.class.new(@v & i.reveal)
  end
  morph :contains? do |i|
    Bud::BoolLattice.new(@v.member? i)
  end
  monotone :size do
    Bud::MaxLattice.new(@v.size)
  end
end</code></pre>
        </section>


        <section>
          <h3>Example: A Key Value Store</h3>
          <pre><code class="hljs ruby">class KvsReplica
  include Bud
  include KvsProtocol
  state { lmap :kv_store }
  bloom do
    # Fulfil any put requests
    kv_store <= kvput {|c| {c.key => c.val}}
    # Acknowledge any put requests
    kvput_resp <~ kvput {|c|
      [ c.reqid, c.client_addr, ip_port ]}
    # Respond to any get requests
    kvget_resp <~ kvget {|c|
      [ c.reqid, c.client_addr,
        kv_store.at(c.key), ip_port ]}
  end
end</code></pre>
        </section>
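
        <section>
          <h3>Sketch: Merging two map replicas</h3>
          <p><small>A rough plain-Ruby picture of what merging into a map lattice such as <tt>kv_store</tt> might look like: keys are unioned, and values stored under the same key are merged with the value lattice's own merge. Here max stands in for that value merge; <tt>map_merge</tt> and the sample data are assumptions, not Bud's implementation.</small></p>
          <pre><code class="hljs ruby"># Merge two maps key by key; values under the same key are combined
# with the value lattice's merge (max here, as a stand-in).
def map_merge(a, b)
  a.merge(b) { |_key, x, y| [x, y].max }
end

replica1 = { 'k1' => 3, 'k2' => 5 }
replica2 = { 'k2' => 8, 'k3' => 1 }

merged = map_merge(replica1, replica2)
puts merged['k1']  # 3
puts merged['k2']  # 8
puts merged['k3']  # 1

# The order in which replicas are merged does not matter.
puts merged == map_merge(replica2, replica1)  # true</code></pre>
        </section>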

      </section>


    </div></div>

  <script src="../reveal.js-3.1.0/lib/js/head.min.js"></script>
  <script src="../reveal.js-3.1.0/js/reveal.js"></script>

  <script>

    // Full list of configuration options available at:
    // https://github.com/hakimel/reveal.js#configuration
    Reveal.initialize({
      controls: false,
      progress: true,
      history: true,
      center: true,

      transition: 'fade', // none/fade/slide/convex/concave/zoom

      // Optional reveal.js plugins
      dependencies: [
        { src: '../reveal.js-3.1.0/lib/js/classList.js', condition: function() { return !document.body.classList; } },
        { src: '../reveal.js-3.1.0/plugin/math/math.js', condition: function() { return true; } },
        { src: '../reveal.js-3.1.0/plugin/markdown/marked.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
        { src: '../reveal.js-3.1.0/plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
        { src: '../reveal.js-3.1.0/plugin/highlight/highlight.js', async: true, condition: function() { return !!document.querySelector( 'pre code' ); }, callback: function() { hljs.initHighlightingOnLoad(); } },
        { src: '../reveal.js-3.1.0/plugin/zoom-js/zoom.js', async: true },
        { src: '../reveal.js-3.1.0/plugin/notes/notes.js', async: true }
      ]
    });

  </script>

</body>
</html>