<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>CSE 662 - Languages and Runtimes for Big Data - Fall 2017</title>
<meta name="description" content="CSE 662 - Fall 2017">
<meta name="author" content="Oliver Kennedy">
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
<link rel="stylesheet" href="reveal.js-3.1.0/css/reveal.css">
<link rel="stylesheet" href="ubodin.css" id="theme">
<!-- Code syntax highlighting -->
<link rel="stylesheet" href="reveal.js-3.1.0/lib/css/zenburn.css">
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? 'reveal.js-3.1.0/css/print/pdf.css' : 'reveal.js-3.1.0/css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<!--[if lt IE 9]>
<script src="reveal.js-3.1.0/lib/js/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<div class="reveal">
<div class="header">
<!-- Any Talk-Specific Header Content Goes Here -->
University at Buffalo
</div>
<div class="footer">
<!-- Any Talk-Specific Footer Content Goes Here -->
<div style="float: left;">
CSE 662
</div>
<div style="float: right;">
Languages and Runtimes for Big Data
</div>
</div>
<!-- Any section element inside of this container is displayed as a slide -->
<div class="slides">
<section>
<section>
<h2>$Bloom^L$</h2>
<h4>Oct 2, 2017</h4>
</section>
<section>
<h3>How is distributed consistency enforced?</h3>
<ol>
<li class="fragment"><b>Don't allow the program to get into an inconsistent state.</b></li>
<li class="fragment"><b>Detect inconsistencies and fix them after the fact.</b></li>
<li class="fragment"><b class="fragment grow">Eventually converge to a consistent state.</b></li>
</ol>
</section>
<section>
<h3>The CALM Principle</h3>
<p>A <i>monotonic</i> program eventually converges naturally.</p>
</section>
</section>
<section>
<section>
<h3>Monotonicity</h3>
<p>"Once you learn a fact, it never becomes false" (although you might never learn all available facts)</p>
<div class="fragment">
<h4>Computation under monotonicity</h4>
<ol>
<li>What facts do you know right now?</li>
<li class="fragment">What facts can you compute given what you know?</li>
<li class="fragment">Broadcast/react to newly discovered facts</li>
</ol>
</div>
</section>
<section>
<h3>What causes concurrency violations?</h3>
<p class="fragment">A computation step needs a <b>complete input</b> before it can produce a <b>complete output</b></p>
<p class="fragment">The output is incorrect if...</p>
<ul>
<li class="fragment">The input is incomplete, and...</li>
<li class="fragment">An incomplete output is not correct.</li>
</ul>
</section>
<section>
<ul>
<li class="fragment">Avoid incomplete inputs<ul>
<li>Block until you know all inputs are ready (Point of Order)</li>
</ul></li>
<li class="fragment">Avoid computations where incomplete outputs are incorrect<ul>
<li>Monotonic programs never produce incorrect outputs, just incomplete ones.</li>
</ul></li>
</ul>
</section>
</section>
<section>
<section>
<h3>What isn't monotonic?</h3>
</section>
<section>
<h3>Negation</h3>
$$R = \{A, B, C\}; S = \{C, D\}$$
<p class="fragment">Let's say that $T = R - S$</p>
<p class="fragment">You know two facts about $T$: $A \in T$ and $B \in T$</p>
<p class="fragment">If you ever learn that $A \in S$,<br/>the "fact" that $A \in T$ becomes false</p>
</section>
<section>
<h3>Aggregation</h3>
$$R = \{1, 2, 3\}$$
<p class="fragment">Let's say that $T = \sum_{i \in R} i$</p>
<p class="fragment">You know several facts about $T$ including: $T = 6$</p>
<p class="fragment">If you ever learn that $4 \in R$,<br/>the "fact" that $T = 6$ becomes false</p>
</section>
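<section data-markdown>
<script type="text/template">
### Sketch: Losing Facts

A minimal plain-Ruby sketch (no Bud involved) of the two examples above. The helper names `difference` and `total` are illustrative assumptions, not part of Bloom.

```ruby
require 'set'

# Negation: T = R - S. Growing S can remove facts from T.
def difference(r, s)
  r - s
end

# Aggregation: T = sum of R. Growing R replaces the old answer.
def total(r)
  r.sum
end

r = Set['A', 'B', 'C']
s = Set['C', 'D']

t_before = difference(r, s)              # contains A and B
t_after  = difference(r, s | Set['A'])   # learning "A in S" retracts "A in T"
puts t_before.subset?(t_after)           # false: an old fact was lost

puts total(Set[1, 2, 3])                 # 6
puts total(Set[1, 2, 3, 4])              # 10: "T = 6" no longer holds

# Contrast: union only grows, so every old output fact stays valid.
puts((r | s).subset?(r | s | Set['A']))  # true
```

In both non-monotonic cases an answer that was reported earlier has to be withdrawn once more input arrives.
</script>
</section>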
</section>
<section>
<section>
<h3>What <u>is</u> monotonic?</h3>
<h4 class="fragment">Sets!</h4>
</section>
<section>
<h3>Datalog</h3>
<p class="fragment"><b>Atoms</b>: $Parent(A, B)$<br/><small>($A$ is a $Parent$ of $B$)</small></p>
<p class="fragment"><b>Rules</b>: $Ancestor(A, B)$ :- $Parent(A, B)$<br/><small>(If $A$ is a $Parent$ of $B$, then $A$ is also an $Ancestor$ of $B$)</small></p>
</section>
<section>
<p>$Ancestor(A, B)$ :- $Parent(A, B)$</p>
<p>$Ancestor(A, C)$ :- $Parent(A, B), Ancestor(B, C)$</p>
<p><small>($A$ is an $Ancestor$ of $C$ if $A$ is a $Parent$ of $B$ and $B$ is an $Ancestor$ of $C$)</small></p>
<br/>
<p class="fragment">$Ancestor$ computes the <i>transitive closure</i> of $Parent$</p>
<br/>
<p class="fragment">No fact (atom) that you can ever learn will invalidate a fact that you've already learned.</p>
</section>
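<section data-markdown>
<script type="text/template">
### Sketch: Evaluating Ancestor

One way to evaluate the two rules is a naive fixpoint: keep applying them until nothing new appears. This is a plain-Ruby sketch under that assumption; `ancestors` and the sample names are hypothetical.

```ruby
require 'set'

# parent is a Set of [a, b] pairs meaning "a is a Parent of b".
def ancestors(parent)
  ancestor = Set.new(parent)            # Ancestor(A, B) :- Parent(A, B)
  loop do
    derived = Set.new
    parent.each do |a, b|
      ancestor.each do |b2, c|
        # Ancestor(A, C) :- Parent(A, B), Ancestor(B, C)
        derived << [a, c] if b == b2
      end
    end
    break if derived.subset?(ancestor)  # fixpoint: nothing new was derived
    ancestor |= derived                 # facts are only ever added
  end
  ancestor
end

parent = Set[%w[alice bob], %w[bob carol], %w[carol dave]]
ancestors(parent).each { |a, c| puts "#{a} is an ancestor of #{c}" }
```

The loop never deletes from `ancestor`, so running it on a subset of the `Parent` facts yields a subset of the final answer.
</script>
</section>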
<section>
<h3>Datalog is Monotonic</h3>
<h4 class="fragment">(unless you need negation or aggregation)</h4>
</section>
</section>
<section>
<section>
<h3>Bloom</h3>
<br/>
<p class="fragment">Datalog with timesteps and asynchronous events</p>
</section>
<section>
<table>
<tr><th>Symbol</th><th>Meaning</th></tr>
<tr><td><tt>&lt;=</tt></td><td>Add a new fact right now</td></tr>
<tr><td><tt>&lt;+</tt></td><td>Add a new fact in the next timestep</td></tr>
<tr><td><tt>&lt;-</tt></td><td>Remove a fact from the next timestep</td></tr>
<tr><td><tt>&lt;~</tt></td><td>Send a fact to another node</td></tr>
</table>
</section>
<section>
<h3>Example: Shortest Paths (Dijkstra's)</h3>
<pre><code class="hljs ruby">class ShortestPaths
include Bud
state {
table :link, [:from, :to] => [:cost]
scratch :path, [:from, :to, :next_hop, :cost]
scratch :min_cost, [:from, :to] => [:cost]
}
bloom {
path <= link {|l| [l.from, l.to, l.to, l.cost]}
path <= (link*path).pairs(:to => :from) { |l,p|
[l.from, p.to, l.to, l.cost + p.cost]
}
min_cost <= path.group([:from, :to], min(:cost))
}
end</code></pre>
</section>
<section>
<h3>Optimization</h3>
<pre><code class="hljs ruby">
path <= (link*path).pairs(:to => :from) { |l,p|
[l.from, p.to, l.to, l.cost + p.cost]
}
</code></pre>
<p class="fragment"><b>Compute only new facts</b>: This step can be performed incrementally as path entries are added</p>
</section>
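<section data-markdown>
<script type="text/template">
### Sketch: Joining Only New Facts

A plain-Ruby sketch of the incremental idea, in the style of semi-naive evaluation: each round joins `link` against only the paths found in the previous round. It is not how Bud is implemented; `all_paths` is a hypothetical helper and the link graph is assumed to be acyclic (otherwise path costs grow forever and the loop never ends).

```ruby
require 'set'

# link is a Set of [from, to, cost]; paths are [from, to, next_hop, cost].
def all_paths(link)
  path  = Set.new
  delta = link.map { |f, t, c| [f, t, t, c] }.to_set   # path <= link
  until delta.empty?
    path |= delta
    fresh = Set.new
    link.each do |lf, lt, lc|
      delta.each do |pf, pt, _hop, pc|
        # (link * path).pairs(:to => :from), restricted to the new paths
        fresh << [lf, pt, lt, lc + pc] if lt == pf
      end
    end
    delta = fresh - path                               # keep only unseen facts
  end
  path
end

link = Set[['a', 'b', 1], ['b', 'c', 2], ['a', 'c', 5]]
all_paths(link).each { |p| puts p.inspect }
```

Old paths are never re-joined, yet the result is the same set a full re-evaluation would produce.
</script>
</section>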
<section>
<h3>Example: Quorum Voting</h3>
<pre><code class="hljs ruby">
class QuorumVote
include Bud
state {
channel :vote_chn, [:@addr, :voter_id]
channel :result_chn, [:@addr]
table :votes, [:voter_id]
scratch :cnt, [] => [:cnt]
}
bloom {
votes <= vote_chn {|v| [v.voter_id]}
cnt <= votes.group(nil, count(:voter_id))
result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
}
end</code></pre>
</section>
<section>
<h3>Problem!</h3>
<pre><code class="hljs ruby">
cnt <= votes.group(nil, count(:voter_id))
result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
</code></pre>
<p class="fragment"><b>cnt isn't monotonic</b>: It can't be computed until all votes are present and available!</p>
</section>
</section>
<section>
<section>
<h3>(Positive) Set operations are monotonic</h3>
<h3 class="fragment">... but not the only thing that's monotonic</h3>
<ul class="fragment">
<li>Growing Integers</li>
<li>Facts that become true</li>
<li>Many aggregates (MAX, MIN, COUNT)</li>
<li>Vector Clocks</li>
</ul>
<p class="fragment">There's a term for these... <b>bounded join semilattices</b></p>
</section>
<section>
<h3>Bounded Join Semilattice</h3>
$$\langle S, \sqcup, \bot \rangle$$
<p class="fragment">$S$: A set (Integers, Boolean values, Sets of facts)</p>
<p class="fragment">$\sqcup : S \times S \rightarrow S$: A 'merge' operation for elements of $S$</p>
<p class="fragment">$\bot \in S$: A 'starting' element of $S$</p>
</section>
<section>
<h3>"Merge"</h3>
<p><small>(Least upper bound)</small></p>
<ul>
<li><b>Associative</b>: $(a \sqcup b) \sqcup c = a \sqcup (b \sqcup c)$</li>
<li><b>Commutative</b>: $a \sqcup b = b \sqcup a$</li>
<li><b>Idempotent</b>: $a \sqcup a = a$</li>
</ul>
<p class="fragment">Defines a partial order: $a \leq b$ if $a \sqcup b = b$</p>
</section>
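<section data-markdown>
<script type="text/template">
### Sketch: Checking a Merge

The three laws can be spot-checked on sample values. This is a plain-Ruby sketch, not a proof; `semilattice_merge?` and `leq?` are hypothetical helpers that take the candidate merge as a block.

```ruby
require 'set'

# Check the three laws over every combination of the sample values.
def semilattice_merge?(samples)
  samples.product(samples, samples).all? do |a, b, c|
    yield(yield(a, b), c) == yield(a, yield(b, c)) &&  # associative
      yield(a, b) == yield(b, a) &&                      # commutative
      yield(a, a) == a                                   # idempotent
  end
end

# Induced partial order: a is below b when merging a into b changes nothing.
def leq?(a, b)
  yield(a, b) == b
end

puts semilattice_merge?([1, 5, 9]) { |a, b| [a, b].max }   # true
puts semilattice_merge?([1, 5, 9]) { |a, b| a + b }        # false: + is not idempotent
puts leq?(Set[1], Set[1, 2]) { |a, b| a | b }              # true
```

Addition fails only the idempotence check, which is why a replayed message would corrupt a sum but not a max.
</script>
</section>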
<section>
<h3>Examples</h3>
<div class="fragment">$$\langle \mathbb R, MAX, -\infty \rangle$$</div>
<div class="fragment">$$\langle \mathbb R, MIN, +\infty \rangle$$</div>
<div class="fragment">$$\langle \mathbb B, \wedge, T \rangle$$</div>
<div class="fragment">$$\langle \mathbb B, \vee, F \rangle$$</div>
<div class="fragment">$$\langle sets\ of\ \mathbb R, \cup, \emptyset \rangle$$</div>
</section>
<section>
<p><b>New notion of 'Fact'</b>: How 'far' up the lattice's partial order you are.</p>
<table class="fragment">
<tr><th>Event</th><th>Alice</th><th>Bob</th><th>Carol</th><th>Dave</th></tr>
<tr><td>Initial State</td><td>6</td><td>1</td><td>7</td><td>10</td></tr>
<tr class="fragment"><td>$Alice \sqcup Bob$</td><td>6</td><td>6</td><td>7</td><td>10</td></tr>
<tr class="fragment"><td>$Carol \sqcup Dave$</td><td>6</td><td>6</td><td>10</td><td>10</td></tr>
<tr class="fragment"><td>$Alice \sqcup Dave$</td><td>10</td><td>6</td><td>10</td><td>10</td></tr>
<tr class="fragment"><td>$Bob \sqcup Carol$</td><td>10</td><td>10</td><td>10</td><td>10</td></tr>
</table>
<p class="fragment">The <tt>lmax</tt> lattice always goes up</p>
</section>
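<section data-markdown>
<script type="text/template">
### Sketch: Gossiping a Max

The table can be replayed in plain Ruby. In this sketch each replica holds one integer of a max lattice, and a gossip step leaves both participants holding the merged value. `gossip` is a hypothetical helper, not a Bud API.

```ruby
# state maps replica name to value; pairs lists who merges with whom, in order.
def gossip(state, pairs)
  pairs.each_with_object(state.dup) do |(a, b), s|
    s[a] = s[b] = [s[a], s[b]].max   # merge: both sides keep the larger value
  end
end

start = { alice: 6, bob: 1, carol: 7, dave: 10 }
order = [%i[alice bob], %i[carol dave], %i[alice dave], %i[bob carol]]

p gossip(start, order)           # the schedule from the table
p gossip(start, order.reverse)   # the same exchanges in the opposite order
```

For this schedule, both orders leave every replica at 10, and no replica's value ever decreases along the way.
</script>
</section>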
</section>
<section>
<section>
<h3>Programs don't rely exclusively on one type!</h3>
<p class="fragment">We need mappings between different lattice types</p>
</section>
<section>
<h3>Monotone Functions</h3>
$$f : S \rightarrow T$$
<p>For any monotone $f$,<br/>whenever $a \leq_S b$ then $f(a) \leq_T f(b)$</p>
<br/>
<p class="fragment"><b>Monotone functions preserve partial orders</b></p>
</section>
<section>
<h3>Example Monotone Functions</h3>
<ul>
<li>sizeof : $set \rightarrow \mathbb N$</li>
<li>$\sum$ : $set\ of\ \mathbb R^+ \rightarrow \mathbb R^+$</li>
<li>$\cap$ : $set \times set \rightarrow set$</li>
<li>$>(\mathbb R)$ : $lmax \rightarrow \mathbb B$</li>
</ul>
</section>
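<section data-markdown>
<script type="text/template">
### Sketch: A Monotone Quorum Test

Composing two of the functions above gives a monotone quorum check: set size maps a growing set to a growing integer, and a threshold maps that integer to a boolean that can only flip from false to true. This is a plain-Ruby sketch; `quorum?`, `quorum_history` and the voter ids are hypothetical.

```ruby
require 'set'

# sizeof: set -> lmax, then a threshold: lmax -> bool
def quorum?(votes, quorum_size)
  votes.size >= quorum_size
end

# Feed votes in arrival order and record the answer after each one.
def quorum_history(arrivals, quorum_size)
  votes = Set.new
  arrivals.map do |voter|
    votes |= Set[voter]          # set union: duplicates are harmless
    quorum?(votes, quorum_size)
  end
end

p quorum_history(%w[v1 v2 v2 v3 v4], 3)   # [false, false, false, true, true]
```

Once the answer becomes true, no later vote can make it false, so it is safe to act on it without waiting for every vote.
</script>
</section>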
<section>
<h3>If all computations in a program are monotone functions, the program is naturally eventually consistent</h3>
<p class="fragment">... but we can do better</p>
</section>
<section>
<h3>Morphism</h3>
$$f : S \rightarrow T$$
<p>
$f$ is a morphism if<br/>
$f$ is monotone and $f(a \sqcup b) = f(a) \sqcup f(b)$<br/>
($f$ commutes with $\sqcup$)
</p>
<br/>
<p class="fragment"><b>Morphisms are decomposable</b></p>
</section>
<section>
<h3>Example Morphisms</h3>
<ul>
<li>$\cap$ : $set \times set \rightarrow set$</li>
<li>$>(\mathbb R)$ : $lmax \rightarrow \mathbb B$</li>
</ul>
</section>
<section>
<pre><code class="hljs ruby">
path <= link {|l| [l.from, l.to, l.to, l.cost]}
path <= (link*path).pairs(:to => :from) { |l,p|
[l.from, p.to, l.to, l.cost + p.cost]
}
min_cost <= path.group([:from, :to]) { |group|
group.project(:cost).min
}
</code></pre>
<table class="fragment"><tr><th>Morphisms <small style="vertical-align: middle">(using bags &amp; lmin)</small></th></tr><tr><td>
<code class="fragment">(link * path).pairs</code><br/>
<code class="fragment">+</code><br/>
<code class="fragment">project</code><br/>
<code class="fragment">group</code><br/>
<code class="fragment">min</code><br/>
</td></tr></table>
</section>
<section>
<h3>Incremental Computation</h3>
<p>We need to update an input with new data and compute:
$$f(old \sqcup new)$$
</p>
<p class="fragment">We (probably) already have $f(old)$.</p>
<p class="fragment"><b>Insight:</b> Computing $f(old) \sqcup f(new)$ is probably cheaper.</p>
<p class="fragment">... but is only correct if $f$ is a morphism</p>
</section>
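<section data-markdown>
<script type="text/template">
### Sketch: When the Shortcut Works

A plain-Ruby sketch that contrasts a morphism with a function that is only monotone. `restrict` (intersection with a fixed set) and `size_of` are hypothetical helpers; sizes are merged with max, as in a max lattice.

```ruby
require 'set'

# Morphism: intersecting with a fixed set distributes over union.
def restrict(s)
  s & Set[2, 3, 4]
end

# Monotone but not a morphism: set size, merged in the max lattice.
def size_of(s)
  s.size
end

old   = Set[1, 2]
fresh = Set[3, 5]

# Merging the two partial outputs equals recomputing from scratch.
puts restrict(old | fresh) == (restrict(old) | restrict(fresh))   # true

# max(2, 2) = 2 undercounts the true size of 4.
puts size_of(old | fresh) == [size_of(old), size_of(fresh)].max   # false
```

The merged sizes are still a valid lower bound, which is what monotonicity guarantees, but only the morphism gives the exact answer from the pieces.
</script>
</section>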
<section>
<img src="graphics/BloomL-TransitiveClosure.png"/>
</section>
</section>
<section>
<section>
<h3>Example: Set Lattice</h3>
<pre><code class="hljs ruby">class Bud::SetLattice < Bud::Lattice
wrapper_name :lset
def initialize(x=[])
@v = x.uniq # Remove duplicates from input
end
def merge(i)
self.class.new(@v | i.reveal)
end
morph :intersect do |i|
self.class.new(@v & i.reveal)
end
morph :contains? do |i|
Bud::BoolLattice.new(@v.member? i)
end
monotone :size do
Bud::MaxLattice.new(@v.size)
end
end</code></pre>
</section>
<section>
<h3>Example: A Key Value Store</h3>
<pre><code class="hljs ruby">class KvsReplica
include Bud
include KvsProtocol
state { lmap :kv_store }
bloom do
# Fulfil any put requests
kv_store <= kvput {|c| {c.key => c.val}}
# Acknowledge any put requests
kvput_resp <~ kvput {|c|
[ c.reqid, c.client_addr, ip_port ]}
# Respond to any get requests
kvget_resp <~ kvget {|c|
[ c.reqid, c.client_addr,
kv_store.at(c.key), ip_port ]}
end
end</code></pre>
</section>
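<section data-markdown>
<script type="text/template">
### Sketch: Merging Two Maps

A plain-Ruby sketch of how a map lattice could merge two replicas: keys are unioned and values under the same key are merged pointwise. It assumes the values are max-lattice integers; `merge_maps` is a hypothetical helper, not the Bud `lmap` implementation.

```ruby
# Union the keys; for keys present on both sides, merge the values with max.
def merge_maps(a, b)
  a.merge(b) { |_key, x, y| [x, y].max }
end

replica1 = { 'x' => 3, 'y' => 1 }
replica2 = { 'y' => 4, 'z' => 2 }

merged = merge_maps(replica1, replica2)
p merged
puts merged == merge_maps(replica2, replica1)   # true: merge order does not matter
puts merge_maps(merged, merged) == merged       # true: re-merging changes nothing
```

Because the value merge is itself a lattice merge, the map merge inherits commutativity and idempotence, so replicas can exchange state in any order.
</script>
</section>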
</section>
</div></div>
<script src="reveal.js-3.1.0/lib/js/head.min.js"></script>
<script src="reveal.js-3.1.0/js/reveal.js"></script>
<script>
// Full list of configuration options available at:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
controls: false,
progress: true,
history: true,
center: true,
transition: 'fade', // none/fade/slide/convex/concave/zoom
// Optional reveal.js plugins
dependencies: [
{ src: 'reveal.js-3.1.0/lib/js/classList.js', condition: function() { return !document.body.classList; } },
{ src: 'reveal.js-3.1.0/plugin/math/math.js', condition: function() { return true; } },
{ src: 'reveal.js-3.1.0/plugin/markdown/marked.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: 'reveal.js-3.1.0/plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: 'reveal.js-3.1.0/plugin/highlight/highlight.js', async: true, condition: function() { return !!document.querySelector( 'pre code' ); }, callback: function() { hljs.initHighlightingOnLoad(); } },
{ src: 'reveal.js-3.1.0/plugin/zoom-js/zoom.js', async: true },
{ src: 'reveal.js-3.1.0/plugin/notes/notes.js', async: true }
]
});
</script>
</body>
</html>