...with Gokhan Kul, Duc Thanh Anh Luong, Ting Xie, Shambhu, Varun, Hung
...with Jerry Ajay, Geoff, Luke
...with Luke
Alice spends weeks cleaning her data before using it.
Can we start with automation and work our way up?
Here's a problem with my data. Fix it.
Each lens implements one automated data repair task with minimal configuration or training.
CREATE LENS PRODUCTS
AS SELECT * FROM PRODUCTS_RAW
USING DOMAIN_REPAIR(DEPARTMENT NOT NULL);
AS clause defines source data.
USING clause requests repairs.
CREATE LENS PRODUCTS
AS SELECT * FROM PRODUCTS_RAW
USING DOMAIN_REPAIR(DEPARTMENT NOT NULL);
CREATE VIEW PRODUCTS
AS SELECT ID, NAME, ...,
CASE WHEN DEPARTMENT IS NOT NULL THEN DEPARTMENT
ELSE VAR('PRODUCTS.DEPARTMENT', ROWID)
END AS DEPARTMENT
FROM PRODUCTS_RAW;
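A minimal sketch of what this rewrite does, assuming SQLite as a stand-in engine. The `var` function below is a hypothetical placeholder for `VAR(...)`: it only returns a symbolic name for the unknown cell. The view's SELECT body is run here as an ordinary query, and the three sample rows mirror the table that follows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for VAR(): instead of a real value, return a
# symbolic placeholder naming the unknown cell.
def var(name, rowid):
    return f"${name}_{rowid}$"

conn.create_function("VAR", 2, var)

conn.executescript("""
CREATE TABLE PRODUCTS_RAW (ID INTEGER, NAME TEXT, DEPARTMENT TEXT);
INSERT INTO PRODUCTS_RAW VALUES
  (123,   'Apple 6s, White',    'Phone'),
  (34234, 'Dell, Intel 4 core', 'Computer'),
  (34235, 'HP, AMD 2 core',     NULL);
""")

# Body of the rewritten view: keep known departments, and replace each
# NULL with a per-row variable.
rewritten = """
SELECT ID, NAME,
       CASE WHEN DEPARTMENT IS NOT NULL THEN DEPARTMENT
            ELSE VAR('PRODUCTS.DEPARTMENT', ROWID)
       END AS DEPARTMENT
FROM PRODUCTS_RAW
ORDER BY ID
"""

rows = conn.execute(rewritten).fetchall()
for row in rows:
    print(row)
```

Only the row with a NULL department receives a placeholder; rows with known values pass through unchanged.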
ID | Name | ... | Department |
---|---|---|---|
123 | Apple 6s, White | ... | Phone |
34234 | Dell, Intel 4 core | ... | Computer |
34235 | HP, AMD 2 core | ... | $Prod.Dept_3$ |
... | ... | ... | ... |
CREATE LENS PRODUCTS
AS SELECT * FROM PRODUCTS_RAW
USING DOMAIN_REPAIR(DEPARTMENT NOT NULL);
SELECT * FROM PRODUCTS_RAW;
An estimator for each $Prod.Dept_{ROWID}$
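One way to picture such an estimator, as a sketch only: the slides do not say which model is used, so the token-voting scheme below is an assumption chosen for brevity. The names `tokens`, `train`, and `estimate` are hypothetical.

```python
import re
from collections import Counter, defaultdict

def tokens(name):
    """Split a product name into lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))

def train(rows):
    """Count how often each name token co-occurs with each known department."""
    token_votes = defaultdict(Counter)
    prior = Counter()
    for name, dept in rows:
        if dept is None:
            continue  # rows with a missing department carry no training signal
        prior[dept] += 1
        for t in tokens(name):
            token_votes[t][dept] += 1
    return token_votes, prior

def estimate(name, model):
    """Return (best_guess, confidence) for one missing DEPARTMENT value."""
    token_votes, prior = model
    votes = Counter()
    for t in tokens(name):
        votes.update(token_votes.get(t, {}))
    if not votes:
        votes = prior  # no shared tokens: fall back to department frequencies
    if not votes:
        return None, 0.0  # nothing to learn from at all
    best, count = votes.most_common(1)[0]
    return best, count / sum(votes.values())

rows = [
    ("Apple 6s, White", "Phone"),
    ("Dell, Intel 4 core", "Computer"),
    ("HP, AMD 2 core", None),
]
model = train(rows)
guess = estimate("HP, AMD 2 core", model)
print(guess)
```

On these three rows the only shared token is `core`, so the missing department is guessed as Computer.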
SELECT NAME, DEPARTMENT FROM PRODUCTS;
Name | Department |
---|---|
Apple 6s, White | Phone |
Dell, Intel 4 core | Computer |
HP, AMD 2 core | Computer |
... | ... |
Simple UI: Highlight values (and rows) based on guesses.
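A rough sketch of the highlighting idea, assuming the system tracks which cells were guessed. The `render` function and the trailing `*` marker are illustrative choices, not the actual UI.

```python
def render(rows, guessed):
    """Format (name, department) rows; mark guessed cells with a trailing '*'.

    `guessed` is a set of (row_index, column_name) pairs for cells whose
    value came from an estimator rather than from the source data.
    """
    lines = []
    for i, (name, dept) in enumerate(rows):
        cell = f"{dept}*" if (i, "DEPARTMENT") in guessed else dept
        lines.append(f"{name} | {cell}")
    return lines

rows = [
    ("Apple 6s, White", "Phone"),
    ("Dell, Intel 4 core", "Computer"),
    ("HP, AMD 2 core", "Computer"),
]
output = render(rows, guessed={(2, "DEPARTMENT")})
print("\n".join(output))
```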
Allow users to EXPLAIN uncertain outputs.
Explanations include reasons given in English
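As an illustration of what an English explanation could look like: the wording, the `explain` function, and its parameters are all assumptions, not the system's actual output.

```python
def explain(column, rowid, guess, confidence, lens="DOMAIN_REPAIR"):
    """Build a plain-English reason for one guessed cell (hypothetical wording)."""
    return (
        f"I guessed that {column} = '{guess}' on row {rowid} because the "
        f"original value was NULL and the {lens} lens was asked to repair it "
        f"(confidence: {confidence:.0%})."
    )

message = explain("PRODUCTS.DEPARTMENT", 3, "Computer", 1.0)
print(message)
```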
UB: Ying Yang, Niccolo Meneghetti, Arindam Nandi, Vinayak Karuppasamy
Oracle: Ronny Fehling, Zhen-Hua Liu, Dieter Gawlick