Update ReadingList Probabilistic DBs

2019-02-21 12:05:00 -05:00 · 2019-02-21 12:05:00 -05:00 · 054370dcc2
parent 24192057e6
commit 054370dcc2
1 changed file with 9 additions and 2 deletions
--- a/ReadingList-Probabilistic-DBs.md
+++ b/ReadingList-Probabilistic-DBs.md
@@ -159,7 +159,6 @@ Jigsaw is a variant of MCDB.  The underlying system implementation basically fol
 * [Jigsaw: Efficient optimization over uncertain enterprise data](http://dl.acm.org/citation.cfm?id=1989410)
 * DEMO: [Fuzzy prophet: parameter exploration in uncertain enterprise scenarios](http://dl.acm.org/citation.cfm?id=1989482)

-
 Querying Machine Learning Models
 -----------------------------------------------------
 Virtually all probabilistic database systems adopt a data model based on tuples.  A number of efforts have come up looking at how to use similar techniques to directly query data defined by a graphical model and/or how to represent graphical models in a database.
@@ -305,4 +304,12 @@ Classically, PDBs assume that you come to them with data already annotated with
 #### BetaPDBs
 A popular model for probabilistic databases is the Tuple-Independent model (TI-PDBs for short).  Tuple-independent probabilistic databases annotate each input tuple with a Bernoulli-distributed random variable.  That is, we assume that each row of the input data is present according to a random coin flip.  In a Beta-PDB, this is instead a Beta-Bernoulli distribution.  It's still a coin flip, but the bias of the coin is learned from training data summarized by two parameters (typically called a, b).  Naively, these parameters represent samples: if you flip a coin a+b times and it comes up heads a times, that corresponds to a Beta distribution with parameters a, b.  Propagating this training data through queries turns out to be surprisingly hard, which is the subject of this paper.

-* http://odin.cse.buffalo.edu/papers/2017/SIGMOD-BetaPDBs-final.pdf
+* http://odin.cse.buffalo.edu/papers/2017/SIGMOD-BetaPDBs-final.pdf
+
+
+
+#### PayGo
+
+A graph-ish database with missing values that prioritizes triples for cleaning based on the anticipated number of additional results each triple could produce.
+
+* [Pay-as-you-go user feedback for dataspace systems](https://dl.acm.org/citation.cfm?id=1376701)