Adding Zack

pull/1/head
Oliver Kennedy 2015-12-23 17:12:34 -05:00
parent 74660e0615
commit 439b8aeb09
2 changed files with 8 additions and 9 deletions

@@ -1,5 +1,6 @@
---
title: Mimir
acronym: Modular Interface for Managing Incomplete Records
pageLinks:
- id: pageTop
title: Mimir

@@ -5,7 +5,8 @@ schedule:
what: Brainstorming Joint Project Ideas
who: The UB Database Group
- when: Feb. 1
what: TBD
who: Zack Ives (UPenn)
what: Title TBD
- when: Feb. 8
what: TBD
- when: Feb. 15
@@ -24,12 +25,9 @@ schedule:
who: Wolfgang Gatterbauer (CMU)
what: Approximate lifted inference with probabilistic databases
abstract: |
Probabilistic inference over large data sets is becoming a central data management problem. Recent large knowledge bases, such as Yago, Nell or DeepDive have millions to billions of uncertain tuples. Yet probabilistic inference is known to be #P-hard in the size of the database, even for some very simple queries. This talk shows a new approach that allows ranking answers to hard probabilistic queries in guaranteed polynomial time, and by using only basic operators of existing database management systems (e.g., no sampling required).
(1) The first part of this talk develops upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, i.e. when the new probabilities are chosen independent of the probabilities of all other variables. Our new bounds shed light on the connection between previous relaxation-based and model-based approximations and unify them as concrete choices in a larger design space.
(2) The second part then draws the connection to lifted inference and shows how application of this theory allows a standard relational database management system to both upper and lower bound hard probabilistic queries in guaranteed polynomial time. We give experimental evidence on synthetic TPC-H data that our approach is by orders of magnitude faster and also more accurate than currently used sampling-based approaches.
Probabilistic inference over large data sets is becoming a central data management problem. Recent large knowledge bases, such as Yago, Nell or DeepDive have millions to billions of uncertain tuples. Yet probabilistic inference is known to be #P-hard in the size of the database, even for some very simple queries. This talk shows a new approach that allows ranking answers to hard probabilistic queries in guaranteed polynomial time, and by using only basic operators of existing database management systems (e.g., no sampling required).<br/>
(1) The first part of this talk develops upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, i.e. when the new probabilities are chosen independent of the probabilities of all other variables. Our new bounds shed light on the connection between previous relaxation-based and model-based approximations and unify them as concrete choices in a larger design space.<br/>
(2) The second part then draws the connection to lifted inference and shows how application of this theory allows a standard relational database management system to both upper and lower bound hard probabilistic queries in guaranteed polynomial time. We give experimental evidence on synthetic TPC-H data that our approach is by orders of magnitude faster and also more accurate than currently used sampling-based approaches.<br/>
(Talk based on joint work with Dan Suciu from TODS 2014 and VLDB 2015: http://arxiv.org/abs/1409.6052, http://arxiv.org/pdf/1412.1069)
bio: |
Wolfgang Gatterbauer is an Assistant Professor in Business Technologies and Computer Science at CMU. His current research focus is on scalable approaches to perform inference over uncertain data. He received degrees in Mechanical Engineering, Electrical Engineering & Computer Science, and Technology & Policy, and then got his PhD in Computer Science from Vienna University of Technology. Prior to joining CMU, he was a Post-Doc in the Database group at University of Washington. In earlier times, he won a Bronze medal at the International Physics Olympiad, worked in the steam turbine development department of ABB Alstom Power, and in the German office of McKinsey & Company.
@@ -77,11 +75,11 @@ The UBDB seminar meets on Mondays at 10:30 AM, typically in Davis 113A. Subscri
{{#if url}}<center><a href="{{url}}" class="paper">{{url}}</a></center>{{/if}}
{{#if abstract}}
<div class="heading">Abstract</div>
{{abstract}}
{{{abstract}}}
{{/if}}
{{#if bio}}
<div class="heading">Bio</div>
{{bio}}
{{{bio}}}
{{/if}}
</div>
</div>