From c10ca4f522257a3230ee1fde311c1b316a4d91fc Mon Sep 17 00:00:00 2001
From: Oliver Kennedy
Date: Sun, 1 Oct 2017 09:57:58 -0400
Subject: [PATCH] In progress

---
 slides/talks/2017-5-Tour-Mimir/index.html | 547 +++++++++++++---------
 1 file changed, 316 insertions(+), 231 deletions(-)

diff --git a/slides/talks/2017-5-Tour-Mimir/index.html b/slides/talks/2017-5-Tour-Mimir/index.html
index 0e168526..1a983492 100644
--- a/slides/talks/2017-5-Tour-Mimir/index.html
+++ b/slides/talks/2017-5-Tour-Mimir/index.html
@@ -304,6 +304,7 @@

The database is in the way

+

Why?

@@ -395,266 +396,350 @@
-

On representing incomplete information in a relational data base

-

T. Imielinski & W. Lipski Jr. (VLDB 1981)

-

- Incomplete and Probabilistic Databases
have existed since the 1980s -

+
+

On representing incomplete information in a relational data base

+

T. Imielinski & W. Lipski Jr. (VLDB 1981)

+

+ Incomplete and Probabilistic Databases
have existed since the 1980s +

+
+ +
+ + + + + + + + + + + + Q(D) + + + + + + Q(D) + Q(D) + Q(D) + + + + + + + + + + + ? + + + + + + + Probab. + Cert. A. + + + + + + +

+ We've gotten good at query processing on uncertain data.
+ But not at "sourcing" uncertain data + ... or communicating results. +

+
+ +
+

Challenges

+
    +
  • Where do Probabilities/Possible Worlds Come From?
  • +
  • How do I use the output of a probabilistic DB query?
  • +
  • Probabilistic DB queries are sloooooow.
  • +
+

A small shift in how we think about PDBs addresses all three points.

+
- - +
+

It's not the data that's uncertain,
it's the interpretation

+
+ +
+ + + + + + + + +
Time | Sensor Reading | Temp Around Sensor
1 | 31.6 | Roughly 31.6˚C
2 | -999 | Around 30˚C?
4 | 28.1 | Roughly 28.1˚C?
3 | 32.2 | Roughly 32.2˚C
+

The reading is deterministic

+

... but what we care about is what the reading measures

+
+ +
+ - + + + + + + Q1(D) + Q2(D) + Q3(D) + Q4(D) + + + - - - - Q(D) - - + + - Q(D) - Q(D) - Q(D) - - - - - - - - - - ? - - - - - - - - ? - - -

- We've gotten good at query processing on uncertain data.
- But not at "sourcing" uncertain data - ... or communicating results. -

-
- + +

Insight: Treat data as 100% deterministic.

+

Instead, queries propose alternative interpretations.

+ -
-
-

It's not the data that's uncertain,
it's the interpretation

-
- -
- - - - - - - - -
Time | Sensor Reading | Temp Around Sensor
1 | 31.6 | Roughly 31.6˚C
2 | -999 | Around 30˚C?
4 | 28.1 | Roughly 28.1˚C?
3 | 32.2 | Roughly 32.2˚C
-

The reading is deterministic

-

... but what we care about is what the reading measures

-
- -
- - - - - - - - Q1(D) - Q2(D) - Q3(D) - Q4(D) - - - - - - - - - - - - - - -

Insight 1: Treat data as 100% deterministic.
Instead, queries propose alternative interpretations.

+
+

Effects

+
    +
  1. It's clear where uncertainty comes from.
  2. +
  3. Results can be communicated through provenance.
  4. +
  5. Query evaluation is decoupled from physical layout.
  6. +
+
+
+

Non-Deterministic Queries

+
+ +
+

+
+ +
+ +
+ +
+

Uncertainty as Provenance

+

+ Introduce Best-Guess queries and the idea of explanations. Key points: +

    +
  • Best-guess queries
  • +
  • Generating explanations
  • +
  • Ranking explanations
  • +
+

+
+ +
+ +

Demo

+
+ +
+ +
+ +
+

Virtualized Uncertainty

+

+ Optimizing sampling-based query evaluation +

+
+ +
+ +
+ +
+

Schema-Level Uncertainty

+
+