<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>CSE 662 - Languages and Runtimes for Big Data - Fall 2017</title>
<meta name="description" content="CSE662 - Fall 2017">
<meta name="author" content="Oliver Kennedy">
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
<link rel="stylesheet" href="reveal.js-3.1.0/css/reveal.css">
<link rel="stylesheet" href="ubodin.css" id="theme">
<!-- Code syntax highlighting -->
<link rel="stylesheet" href="reveal.js-3.1.0/lib/css/zenburn.css">
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? 'reveal.js-3.1.0/css/print/pdf.css' : 'reveal.js-3.1.0/css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<!--[if lt IE 9]>
<script src="reveal.js-3.1.0/lib/js/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<div class="reveal">
<div class="header">
<!-- Any Talk-Specific Header Content Goes Here -->
University at Buffalo
</div>
<div class="footer">
<!-- Any Talk-Specific Footer Content Goes Here -->
<div style="float: left;">
CSE 662
</div>
<div style="float: right;">
Languages and Runtimes for Big Data
</div>
</div>
<!-- Any section element inside of this container is displayed as a slide -->
<div class="slides">
<section>
<section>
<h2>$Bloom^L$</h2>
<h4>Oct 2, 2017</h4>
</section>
<section>
<h3>How is distributed consistency enforced?</h3>
<ol>
<li class="fragment"><b>Don't allow the program to get into an inconsistent state.</b></li>
<li class="fragment"><b>Detect inconsistencies and fix them after the fact.</b></li>
<li class="fragment"><b class="fragment grow">Eventually converge to a consistent state.</b></li>
</ol>
</section>
<section>
<h3>The CALM Principle</h3>
<p>A <i>monotonic</i> program eventually converges naturally.</p>
</section>
</section>
<section>
<section>
<h3>Monotonicity</h3>
<p>"Once you learn a fact, it never becomes false" (although you might never learn all available facts)</p>
<div class="fragment">
<h4>Computation under monotonicity</h4>
<ol>
<li>What facts do you know right now?</li>
<li class="fragment">What facts can you compute given what you know?</li>
<li class="fragment">Broadcast/react to newly discovered facts</li>
</ol>
</div>
</section>
<section>
<h3>What causes concurrency violations?</h3>
<p class="fragment">A computation step needs a <b>complete input</b> before it can produce a <b>complete output</b></p>
<p class="fragment">The output is incorrect if...</p>
<ul>
<li class="fragment">The input is incomplete, and...</li>
<li class="fragment">An incomplete output is not correct.</li>
</ul>
</section>
<section>
<ul>
<li class="fragment">Avoid incomplete inputs<ul>
<li>Block until you know all inputs are ready (Point of Order)</li>
</ul></li>
<li class="fragment">Avoid computations where incomplete outputs are incorrect<ul>
<li>Monotonic programs never produce incorrect outputs, just incomplete ones.</li>
</ul></li>
</ul>
</section>
</section>
<section>
<section>
<h3>What isn't monotonic?</h3>
</section>
<section>
<h3>Negation</h3>
$$R = \{A, B, C\}; S = \{C, D\}$$
<p class="fragment">Let's say that $T = R - S$</p>
<p class="fragment">You know two facts about $T$: $A \in T$ and $B \in T$</p>
<p class="fragment">If you ever learn that $A \in S$,<br/>the "fact" that $A \in T$ becomes false</p>
</section>
<section>
<h3>Aggregation</h3>
$$R = \{1, 2, 3\}$$
<p class="fragment">Let's say that $T = \sum_{i \in R} i$</p>
<p class="fragment">You know several facts about $T$ including: $T = 6$</p>
<p class="fragment">If you ever learn that $4 \in R$,<br/>the "fact" that $T = 6$ becomes false</p>
</section>
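<section data-markdown><script type="text/template">
A minimal plain-Ruby sketch of the negation and aggregation examples, using the standard `Set` class rather than Bloom. The variable names are illustrative assumptions, not part of Bud.

```ruby
require 'set'

r = Set['A', 'B', 'C']
s = Set['C', 'D']

# Negation: T = R - S
t_before = r - s                  # what we believe about T so far
s_grown  = s | Set['A']           # later we learn that A is in S
t_after  = r - s_grown            # "A in T" has been retracted

# Aggregation: T = sum of R
nums       = Set[1, 2, 3]
sum_before = nums.sum
sum_after  = (nums | Set[4]).sum  # "T = 6" has been retracted

# For contrast, union (a positive set operation) only ever grows.
u_before = r | s
u_after  = r | s_grown

puts "difference kept its old facts? #{t_before.subset?(t_after)}"
puts "union kept its old facts?      #{u_before.subset?(u_after)}"
```
</script></section>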
</section>
<section>
<section>
<h3>What <u>is</u> monotonic?</h3>
<h4 class="fragment">Sets!</h4>
</section>
<section>
<h3>Datalog</h3>
<p class="fragment"><b>Atoms</b>: $Parent(A, B)$<br/><small>($A$ is a $Parent$ of $B$)</small></p>
<p class="fragment"><b>Rules</b>: $Ancestor(A, B)$ :- $Parent(A, B)$<br/><small>(If $A$ is a $Parent$ of $B$, then $A$ is also an $Ancestor$ of $B$)</small></p>
</section>
<section>
<p>$Ancestor(A, B)$ :- $Parent(A, B)$</p>
<p>$Ancestor(A, C)$ :- $Parent(A, B), Ancestor(B, C)$</p>
<p><small>($A$ is an $Ancestor$ of $C$ if $A$ is a $Parent$ of $B$ and $B$ is an $Ancestor$ of $C$)</small></p>
<br/>
<p class="fragment">$Ancestor$ computes the <i>transitive closure</i> of $Parent$</p>
<br/>
<p class="fragment">No fact (atom) that you can ever learn will invalidate a fact that you've already learned.</p>
</section>
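<section data-markdown><script type="text/template">
The two Ancestor rules can be evaluated by repeating them until nothing new appears. This is a sketch in plain Ruby, not a Datalog engine; the `ancestors` helper and the sample names are hypothetical.

```ruby
require 'set'

# Hypothetical Parent facts, as [parent, child] pairs.
parent = Set[%w[alice bob], %w[bob carol], %w[carol dave]]

def ancestors(parent)
  ancestor = parent.dup              # Ancestor(A, B) :- Parent(A, B)
  loop do
    derived = Set.new
    parent.each do |a, b|
      ancestor.each do |b2, c|
        # Ancestor(A, C) :- Parent(A, B), Ancestor(B, C)
        derived << [a, c] if b == b2
      end
    end
    merged = ancestor | derived
    break if merged == ancestor      # fixpoint: no new facts
    ancestor = merged
  end
  ancestor
end

closure = ancestors(parent)
# Learning a new Parent fact only adds Ancestor facts; none are retracted.
grown = ancestors(parent | Set[%w[dave erin]])
puts closure.subset?(grown)
```
</script></section>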
<section>
<h3>Datalog is Monotonic</h3>
<h4 class="fragment">(unless you need negation or aggregation)</h4>
</section>
</section>
<section>
<section>
<h3>Bloom</h3>
<br/>
<p class="fragment">Datalog with timesteps and asynchronous events</p>
</section>
<section>
<table>
<tr><th>Symbol</th><th>Meaning</th></tr>
<tr><td><tt>&lt;=</tt></td><td>Add a new fact right now</td></tr>
<tr><td><tt>&lt;+</tt></td><td>Add a new fact in the next timestep</td></tr>
<tr><td><tt>&lt;-</tt></td><td>Remove a fact from the next timestep</td></tr>
<tr><td><tt>&lt;~</tt></td><td>Send a fact to another node</td></tr>
</table>
</section>
<section>
<h3>Example: Shortest Paths (Dijkstra's)</h3>
<pre><code class="hljs ruby">class ShortestPaths
include Bud
state {
table :link, [:from, :to] => [:cost]
scratch :path, [:from, :to, :next_hop, :cost]
scratch :min_cost, [:from, :to] => [:cost]
}
bloom {
path <= link {|l| [l.from, l.to, l.to, l.cost]}
path <= (link*path).pairs(:to => :from) { |l,p|
[l.from, p.to, l.to, l.cost + p.cost]
}
min_cost <= path.group([:from, :to], min(:cost))
}
end</code></pre>
</section>
<section>
<h3>Optimization</h3>
<pre><code class="hljs ruby">
path <= (link*path).pairs(:to => :from) { |l,p|
[l.from, p.to, l.to, l.cost + p.cost]
}
</code></pre>
<p class="fragment"><b>Compute only new facts</b>: This step can be performed incrementally as path entries are added</p>
</section>
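<section data-markdown><script type="text/template">
One way to compute only new facts is semi-naive evaluation: each round joins `link` against just the path facts found in the previous round. A plain-Ruby sketch, assuming an acyclic graph; `all_paths` and the sample links are hypothetical.

```ruby
require 'set'

# link facts are [from, to, cost]; path facts are [from, to, next_hop, cost].
# Assumes an acyclic graph: with cycles, ever-costlier paths keep appearing.
def all_paths(links)
  path  = Set.new(links.map { |from, to, cost| [from, to, to, cost] })
  delta = path.dup                     # facts discovered in the last round
  until delta.empty?
    fresh = Set.new
    links.each do |l_from, l_to, l_cost|
      delta.each do |p_from, p_to, _hop, p_cost|
        # Join only against the new path facts, not all of path.
        fresh << [l_from, p_to, l_to, l_cost + p_cost] if l_to == p_from
      end
    end
    delta = fresh - path               # drop anything already known
    path |= delta
  end
  path
end

links = [[:a, :b, 1], [:b, :c, 2], [:c, :d, 3]]
paths = all_paths(links)
puts paths.include?([:a, :d, :b, 6])
```
</script></section>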
<section>
<h3>Example: Quorum Voting</h3>
<pre><code class="hljs ruby">
class QuorumVote
include Bud
state {
channel :vote_chn, [:@addr, :voter_id]
channel :result_chn, [:@addr]
table :votes, [:voter_id]
scratch :cnt, [] => [:cnt]
}
bloom {
votes <= vote_chn {|v| [v.voter_id]}
cnt <= votes.group(nil, count(:voter_id))
result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
}
end</code></pre>
</section>
<section>
<h3>Problem!</h3>
<pre><code class="hljs ruby">
cnt <= votes.group(nil, count(:voter_id))
result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
</code></pre>
<p class="fragment"><b>cnt isn't monotonic</b>: It can't be computed until all votes are present and available!</p>
</section>
</section>
<section>
<section>
<h3>(Positive) Set operations are monotonic</h3>
<h3 class="fragment">... but not the only thing that's monotonic</h3>
<ul class="fragment">
<li>Growing Integers</li>
<li>Facts that become true</li>
<li>Many aggregates (MAX, MIN, COUNT)</li>
<li>Vector Clocks</li>
</ul>
<p class="fragment">There's a term for these... <b>bounded join semilattices</b></p>
</section>
<section>
<h3>Bounded Join Semilattice</h3>
$$\langle S, \sqcup, \bot \rangle$$
<p class="fragment">$S$: A set (Integers, Boolean values, Sets of facts)</p>
<p class="fragment">$\sqcup : S \times S \rightarrow S$: A 'merge' operation for elements of $S$</p>
<p class="fragment">$\bot \in S$: A 'starting' element of $S$</p>
</section>
<section>
<h3>"Merge"</h3>
<p><small>(Least upper bound)</small></p>
<ul>
<li><b>Associative</b>: $(a \sqcup b) \sqcup c = a \sqcup (b \sqcup c)$</li>
<li><b>Commutative</b>: $a \sqcup b = b \sqcup a$</li>
<li><b>Idempotent</b>: $a \sqcup a = a$</li>
</ul>
<p class="fragment">Defines a partial order: $a \leq b$ if $a \sqcup b = b$</p>
</section>
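<section data-markdown><script type="text/template">
The merge laws can be spot-checked on a handful of values. This sketch uses max over numbers as the merge; the sample values are arbitrary, and passing the check is evidence, not a proof.

```ruby
# Merge for the max lattice over numbers; bottom is -Infinity.
merge  = ->(a, b) { [a, b].max }
bottom = -Float::INFINITY
leq    = ->(a, b) { merge.(a, b) == b }   # the induced partial order

samples = [bottom, -3, 0, 7, 42]
pairs   = samples.product(samples)
triples = samples.product(samples, samples)

associative = triples.all? { |a, b, c| merge.(merge.(a, b), c) == merge.(a, merge.(b, c)) }
commutative = pairs.all?   { |a, b| merge.(a, b) == merge.(b, a) }
idempotent  = samples.all? { |a| merge.(a, a) == a }
bottom_ok   = samples.all? { |a| merge.(bottom, a) == a }

puts [associative, commutative, idempotent, bottom_ok].all?
puts leq.(0, 7)
```
</script></section>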
<section>
<h3>Examples</h3>
<div class="fragment">$$\langle \mathbb R, MAX, -\infty \rangle$$</div>
<div class="fragment">$$\langle \mathbb R, MIN, +\infty \rangle$$</div>
<div class="fragment">$$\langle \mathbb B, \wedge, T \rangle$$</div>
<div class="fragment">$$\langle \mathbb B, \vee, F \rangle$$</div>
<div class="fragment">$$\langle sets\ of\ \mathbb R, \cup, \emptyset \rangle$$</div>
</section>
<section>
<p><b>New notion of 'Fact'</b>: How 'far' up the lattice's partial order you are.</p>
<table class="fragment">
<tr><th>Event</th><th>Alice</th><th>Bob</th><th>Carol</th><th>Dave</th></tr>
<tr><td>Initial State</td><td>6</td><td>1</td><td>7</td><td>10</td></tr>
<tr class="fragment"><td>$Alice \sqcup Bob$</td><td>6</td><td>6</td><td>7</td><td>10</td></tr>
<tr class="fragment"><td>$Carol \sqcup Dave$</td><td>6</td><td>6</td><td>10</td><td>10</td></tr>
<tr class="fragment"><td>$Alice \sqcup Dave$</td><td>10</td><td>6</td><td>10</td><td>10</td></tr>
<tr class="fragment"><td>$Bob \sqcup Carol$</td><td>10</td><td>10</td><td>10</td><td>10</td></tr>
</table>
<p class="fragment">The <tt>lmax</tt> lattice always goes up</p>
</section>
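<section data-markdown><script type="text/template">
The table can be replayed in plain Ruby by treating each exchange as a max merge on both sides. The `gossip` helper is a hypothetical name used only for this sketch.

```ruby
# Each replica holds one lmax value; merging is max.
state = { alice: 6, bob: 1, carol: 7, dave: 10 }

# After x and y exchange state, both hold the least upper bound.
def gossip(state, x, y)
  merged = [state[x], state[y]].max
  state.merge({ x => merged, y => merged })
end

steps = [[:alice, :bob], [:carol, :dave], [:alice, :dave], [:bob, :carol]]
history = steps.each_with_object([state]) do |(x, y), rows|
  rows << gossip(rows.last, x, y)
end

history.each { |row| puts row.values.join("\t") }

# No replica's value ever goes down between consecutive rows.
never_decreases = history.each_cons(2).all? do |before, after|
  before.keys.all? { |k| before[k] <= after[k] }
end
puts never_decreases
```
</script></section>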
</section>
<section>
<section>
<h3>Programs don't rely exclusively on one type!</h3>
<p class="fragment">We need mappings between different lattice types</p>
</section>
<section>
<h3>Monotone Functions</h3>
$$f : S \rightarrow T$$
<p>For any monotone $f$,<br/>whenever $a \leq_S b$ then $f(a) \leq_T f(b)$</p>
<br/>
<p class="fragment"><b>Monotone functions preserve partial orders</b></p>
</section>
<section>
<h3>Example Monotone Functions</h3>
<ul>
<li>sizeof : $set \rightarrow \mathbb N$</li>
<li>$\sum$ : $set\ of\ \mathbb R^+ \rightarrow \mathbb R^+$</li>
<li>$\cap$ : $set \times set \rightarrow set$</li>
<li>$>(\mathbb R)$ : $lmax \rightarrow \mathbb B$</li>
</ul>
</section>
<section>
<h3>If all computations in a program are monotone functions, the program is naturally eventually consistent</h3>
<p class="fragment">... but we can do better</p>
</section>
<section>
<h3>Morphism</h3>
$$f : S \rightarrow T$$
<p>
$f$ is a morphism if<br/>
$f$ is monotone and $f(a \sqcup b) = f(a) \sqcup f(b)$<br/>
($f$ commutes with $\sqcup$)
</p>
<br/>
<p class="fragment"><b>Morphisms are decomposable</b></p>
</section>
<section>
<h3>Example Morphisms</h3>
<ul>
<li>$\cap$ : $set \times set \rightarrow set$</li>
<li>$>(\mathbb R)$ : $lmax \rightarrow \mathbb B$</li>
</ul>
</section>
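<section data-markdown><script type="text/template">
A small plain-Ruby check of the difference between monotone functions and morphisms. The helper names and sample values are assumptions made for illustration; a threshold of 4 stands in for the comparison against a constant.

```ruby
require 'set'

set_merge  = ->(a, b) { a | b }         # lset: union
max_merge  = ->(a, b) { [a, b].max }    # lmax: max
bool_merge = ->(a, b) { a || b }        # lbool: or

size   = ->(s) { s.size }               # lset -> lmax
above4 = ->(n) { n > 4 }                # lmax -> lbool

# Monotone: a <= b implies f(a) <= f(b).
def monotone?(f, leq_in, leq_out, samples)
  samples.product(samples).all? { |a, b| !leq_in.(a, b) || leq_out.(f.(a), f.(b)) }
end

# Morphism: f(a merge b) == f(a) merge f(b).
def morphism?(f, merge_in, merge_out, samples)
  samples.product(samples).all? { |a, b| f.(merge_in.(a, b)) == merge_out.(f.(a), f.(b)) }
end

sets = [Set[], Set[1], Set[2], Set[1, 2], Set[2, 3]]
nums = [0, 3, 5, 8]

size_monotone   = monotone?(size, ->(a, b) { a.subset?(b) }, ->(a, b) { a <= b }, sets)
size_morphism   = morphism?(size, set_merge, max_merge, sets)
above4_morphism = morphism?(above4, max_merge, bool_merge, nums)

puts "size: monotone=#{size_monotone} morphism=#{size_morphism}"
puts "above4: morphism=#{above4_morphism}"
```

`size` fails the morphism test because the size of a union of `{1}` and `{2}` is 2, while the max of the two sizes is 1.
</script></section>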
<section>
<pre><code class="hljs ruby">
path <= link {|l| [l.from, l.to, l.to, l.cost]}
path <= (link*path).pairs(:to => :from) { |l,p|
[l.from, p.to, l.to, l.cost + p.cost]
}
min_cost <= path.group([:from, :to]) { |group|
group.project(:cost).min
}
</code></pre>
<table class="fragment"><tr><th>Morphisms <small style="vertical-align: middle">(using bags &amp; lmin)</small></th></tr><tr><td>
<code class="fragment">(link * path).pairs</code><br/>
<code class="fragment">+</code><br/>
<code class="fragment">project</code><br/>
<code class="fragment">group</code><br/>
<code class="fragment">min</code><br/>
</td></tr></table>
</section>
<section>
<h3>Incremental Computation</h3>
<p>We need to update an input with new data and compute:
$$f(old \sqcup new)$$
</p>
<p class="fragment">We (probably) already have $f(old)$.</p>
<p class="fragment"><b>Insight:</b> Computing $f(old) \sqcup f(new)$ is probably cheaper.</p>
<p class="fragment">... but is only correct if $f$ is a morphism</p>
</section>
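<section data-markdown><script type="text/template">
A sketch of the shortcut in plain Ruby, assuming `f` maps a set of path costs (merged by union) to the cheapest cost (merged by min). The cost values are made up for illustration.

```ruby
require 'set'

# f: set of costs -> cheapest cost; the empty set maps to +Infinity.
f = ->(costs) { costs.min || Float::INFINITY }

old_costs = Set[9, 4, 7]
new_costs = Set[6, 3]

full        = f.(old_costs | new_costs)            # recompute from scratch
incremental = [f.(old_costs), f.(new_costs)].min   # reuse the cached f(old)
puts full == incremental

# size is monotone but not a morphism, so the same shortcut goes wrong.
size      = ->(s) { s.size }
size_full = size.(Set[1, 2] | Set[2, 3])
size_incr = [size.(Set[1, 2]), size.(Set[2, 3])].max
puts size_full == size_incr
```
</script></section>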
<section>
<img src="graphics/BloomL-TransitiveClosure.png"/>
</section>
</section>
<section>
<section>
<h3>Example: Set Lattice</h3>
<pre><code class="hljs ruby">class Bud::SetLattice < Bud::Lattice
wrapper_name :lset
def initialize(x=[])
@v = x.uniq # Remove duplicates from input
end
def merge(i)
self.class.new(@v | i.reveal)
end
morph :intersect do |i|
self.class.new(@v & i.reveal)
end
morph :contains? do |i|
Bud::BoolLattice.new(@v.member? i)
end
monotone :size do
Bud::MaxLattice.new(@v.size)
end
end</code></pre>
</section>
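<section data-markdown><script type="text/template">
With a set lattice, the quorum vote from earlier can be phrased as set, then size, then threshold. This is a plain-Ruby analogue using the standard `Set` class, not Bud's lattice API; `quorum_size` and the voter ids are hypothetical.

```ruby
require 'set'

quorum_size = 3   # hypothetical threshold

# votes (set, merged by union) -> size (max) -> quorum reached? (boolean or)
quorum = ->(votes) { votes.size >= quorum_size }

arrivals = %w[v1 v2 v2 v3 v4]   # v2 is delivered twice
votes = Set.new
trace = arrivals.map do |voter|
  votes |= Set[voter]           # merge the new vote into the set
  quorum.(votes)
end
p trace                         # once true, the answer stays true

# Every arrival order ends with the same answer.
same_everywhere = arrivals.permutation.all? do |order|
  quorum.(order.reduce(Set.new) { |acc, v| acc | Set[v] })
end
puts same_everywhere
```
</script></section>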
<section>
<h3>Example: A Key Value Store</h3>
<pre><code class="hljs ruby">class KvsReplica
include Bud
include KvsProtocol
state { lmap :kv_store }
bloom do
# Fulfil any put requests
kv_store <= kvput {|c| {c.key => c.val}}
# Acknowledge any put requests
kvput_resp <~ kvput {|c|
[ c.reqid, c.client_addr, ip_port ]}
# Respond to any get requests
kvget_resp <~ kvget {|c|
[ c.reqid, c.client_addr,
kv_store.at(c.key), ip_port ]}
end
end</code></pre>
</section>
</section>
</div></div>
<script src="reveal.js-3.1.0/lib/js/head.min.js"></script>
<script src="reveal.js-3.1.0/js/reveal.js"></script>
<script>
// Full list of configuration options available at:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
controls: false,
progress: true,
history: true,
center: true,
transition: 'fade', // none/fade/slide/convex/concave/zoom
// Optional reveal.js plugins
dependencies: [
{ src: 'reveal.js-3.1.0/lib/js/classList.js', condition: function() { return !document.body.classList; } },
{ src: 'reveal.js-3.1.0/plugin/math/math.js', condition: function() { return true; } },
{ src: 'reveal.js-3.1.0/plugin/markdown/marked.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: 'reveal.js-3.1.0/plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: 'reveal.js-3.1.0/plugin/highlight/highlight.js', async: true, condition: function() { return !!document.querySelector( 'pre code' ); }, callback: function() { hljs.initHighlightingOnLoad(); } },
{ src: 'reveal.js-3.1.0/plugin/zoom-js/zoom.js', async: true },
{ src: 'reveal.js-3.1.0/plugin/notes/notes.js', async: true }
]
});
</script>
</body>
</html>