diff --git a/slides/cse4562sp2018/2018-01-31-SQL+Physical.html b/slides/cse4562sp2018/2018-01-31-SQL+Physical.html index 5845f396..e7237f29 100644 --- a/slides/cse4562sp2018/2018-01-31-SQL+Physical.html +++ b/slides/cse4562sp2018/2018-01-31-SQL+Physical.html @@ -126,13 +126,13 @@
What is the ID, Common Name and Borough of Trees in Brooklyn?
TREE_ID | SPC_COMMON | BORONAME | -
---|---|---|
204026 | 'honeylocust' | 'Brooklyn' | -
204337 | 'honeylocust' | 'Brooklyn' | -
189565 | 'American linden' | 'Brooklyn' | -
192755 | 'London planetree' | 'Brooklyn' | -
189465 | 'London planetree' | 'Brooklyn' | -
... and 177287 more | +||
TREE_ID | SPC_COMMON | BORONAME |
204026 | 'honeylocust' | 'Brooklyn' |
204337 | 'honeylocust' | 'Brooklyn' |
189565 | 'American linden' | 'Brooklyn' |
192755 | 'London planetree' | 'Brooklyn' |
189465 | 'London planetree' | 'Brooklyn' |
... and 177287 more |
First we focus on sets and bags.
+ +Delete rows that fail the condition $c$.
+TREE_ID | SPC_COMMON | BORONAME | ... |
---|---|---|---|
204026 | 'honeylocust' | 'Brooklyn' | ... |
204337 | 'honeylocust' | 'Brooklyn' | ... |
189565 | 'American linden' | 'Brooklyn' | ... |
192755 | 'London planetree' | 'Brooklyn' | ... |
189465 | 'London planetree' | 'Brooklyn' | ... |
... and 177287 more |
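A minimal sketch of selection, assuming a relation is modeled as a Python list of dicts. The `select` helper and the last sample row are hypothetical and only illustrate the idea of dropping rows that fail $c$.

```python
# Sample rows; the first three mirror the table above, the last is made up.
trees = [
    {"TREE_ID": 204026, "SPC_COMMON": "honeylocust", "BORONAME": "Brooklyn"},
    {"TREE_ID": 204337, "SPC_COMMON": "honeylocust", "BORONAME": "Brooklyn"},
    {"TREE_ID": 189565, "SPC_COMMON": "American linden", "BORONAME": "Brooklyn"},
    {"TREE_ID": 999001, "SPC_COMMON": "pin oak", "BORONAME": "Queens"},  # hypothetical
]


def select(relation, condition):
    """sigma_c(R): keep only the rows for which condition(row) is true."""
    return [row for row in relation if condition(row)]


brooklyn = select(trees, lambda row: row["BORONAME"] == "Brooklyn")
```

Selection never changes the schema: every surviving row keeps all of its attributes.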
Delete attributes not in the projection list $A$.
+ +BORONAME |
---|
Queens |
Brooklyn |
Manhattan |
Bronx |
Staten Island |
Only 5 results... not 683788?
+Set and Bag Projection are different
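The difference between the two projections can be sketched as follows, assuming rows are plain tuples and attributes are addressed by position. The helper names `project_bag` and `project_set` are illustrative, not part of any library.

```python
# A few (SPC_COMMON, BORONAME) rows in the style of the examples above.
rows = [
    ("honeylocust", "Brooklyn"),
    ("honeylocust", "Brooklyn"),
    ("American linden", "Brooklyn"),
    ("London planetree", "Manhattan"),
]


def project_bag(relation, positions):
    """Bag projection: one output row per input row, duplicates kept."""
    return [tuple(row[i] for i in positions) for row in relation]


def project_set(relation, positions):
    """Set projection: the same rows with duplicates eliminated."""
    return set(project_bag(relation, positions))


boro_bag = project_bag(rows, [1])  # as many rows as the input
boro_set = project_set(rows, [1])  # one row per distinct borough
```

Under bag semantics the output has as many rows as the input; under set semantics it collapses to the distinct values, which is why the borough query returns only 5 results.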
+What are these queries' schemas?
+Takes two relations that are union-compatible...
+(Both relations have the same number of fields with the same types)
+... and returns all tuples appearing in either relation
+We use $\uplus$ if we explicitly mean bag union
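As a rough illustration of set union versus bag union ($\uplus$), one possible encoding uses `collections.Counter` as a bag that maps each tuple to its multiplicity. The sample data is made up for the example.

```python
from collections import Counter

# Bags as Counters: tuple -> number of copies (hypothetical data).
r = Counter({("honeylocust", "Brooklyn"): 2, ("pin oak", "Queens"): 1})
s = Counter({("honeylocust", "Brooklyn"): 1})

# Bag union: multiplicities add up.
bag_union = r + s

# Set union: a tuple is either present or absent.
set_union = set(r) | set(s)
```

Both inputs must be union-compatible; here each holds 2-tuples of strings.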
+Return all tuples appearing in both
of two union-compatible relations
What is this query asking?
+Return all tuples appearing in the first, but not the second
of two union-compatible relations
What is this query asking?
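Under set semantics, intersection and difference map directly onto Python's built-in set operators. The sketch below assumes two union-compatible, single-attribute relations; the variable names and contents are illustrative.

```python
# Species seen in each borough, as sets of 1-tuples (illustrative data).
brooklyn_species = {("honeylocust",), ("American linden",), ("London planetree",)}
manhattan_species = {("honeylocust",), ("American linden",), ("pin oak",)}

# Intersection: tuples appearing in both relations.
in_both = brooklyn_species & manhattan_species

# Difference: tuples appearing in the first relation but not the second.
only_brooklyn = brooklyn_species - manhattan_species
```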
+What is the schema of the result of any of these operators?
+Create all pairs of tuples.
+ +SPC_COMMON | AVG_HEIGHT |
---|---|
cedar elm | 60 |
lacebark elm | 45 |
... and more |
SPC_COMMON | BORONAME | SPC_COMMON | AVG_HEIGHT |
---|---|---|---|
'honeylocust' | 'Brooklyn' | cedar elm | 60 |
'honeylocust' | 'Brooklyn' | cedar elm | 60 |
'American linden' | 'Brooklyn' | cedar elm | 60 |
'London planetree' | 'Manhattan' | cedar elm | 60 |
'London planetree' | 'Manhattan' | cedar elm | 60 |
... | |||
'honeylocust' | 'Brooklyn' | lacebark elm | 45 |
'honeylocust' | 'Brooklyn' | lacebark elm | 45 |
'American linden' | 'Brooklyn' | lacebark elm | 45 |
'London planetree' | 'Manhattan' | lacebark elm | 45 |
'London planetree' | 'Manhattan' | lacebark elm | 45 |
... and more |
What is the schema of the resulting relation?
+The relation has a naming conflict
(two attributes with the same name)
What is the schema of the resulting relation?
+When writing cross-products on the board,
I will use implicit renaming
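One way to picture the cross-product and the naming conflict is the sketch below. It resolves the clash by prefixing every attribute with a relation name; that prefixing scheme is an assumption made for this example, not necessarily the renaming convention used on the board.

```python
from itertools import product

trees = [
    {"SPC_COMMON": "honeylocust", "BORONAME": "Brooklyn"},
    {"SPC_COMMON": "London planetree", "BORONAME": "Manhattan"},
]
species = [
    {"SPC_COMMON": "cedar elm", "AVG_HEIGHT": 60},
    {"SPC_COMMON": "lacebark elm", "AVG_HEIGHT": 45},
]


def cross(r, r_name, s, s_name):
    """R x S: pair every row of r with every row of s, qualifying names."""
    result = []
    for a, b in product(r, s):
        row = {f"{r_name}.{k}": v for k, v in a.items()}
        row.update({f"{s_name}.{k}": v for k, v in b.items()})
        result.append(row)
    return result


pairs = cross(trees, "T", species, "S")
```

The result has `len(trees) * len(species)` rows, and its schema is the concatenation of both input schemas.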
Pair tuples according to a condition c.
+Equi-joins are joins with only equality tests in the condition.
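A join can be read as a selection applied to a cross-product. The sketch below assumes rows are tuples and that the join condition is an arbitrary Python predicate; the data and the `join` helper are hypothetical.

```python
# (SPC_COMMON, BORONAME) and (SPC_COMMON, AVG_HEIGHT) rows, made up for the example.
trees = [("honeylocust", "Brooklyn"), ("cedar elm", "Manhattan")]
species = [("cedar elm", 60), ("lacebark elm", 45)]


def join(r, s, condition):
    """R join_c S: concatenate each pair (a, b) that satisfies condition."""
    return [a + b for a in r for b in s if condition(a, b)]


# Equi-join: the condition is a single equality test on SPC_COMMON.
matched = join(trees, species, lambda a, b: a[0] == b[0])
```

This nested-loop formulation is the simplest evaluation strategy, not necessarily an efficient one.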
+(Which operators behave differently in Set- and Bag-RA?)
+ +Operator | Symbol | Duplicates? |
---|---|---|
Selection | $\sigma$ | No |
Projection | $\pi$ | Yes |
Cross-product | $\times$ | No |
Set-difference | $-$ | No |
Union | $\cup$ | Yes |
Join | $\bowtie$ | No |
Find the BORONAMEs of all boroughs that have trees with an average height below 45 inches
+ +SPC_COMMON | AVG_HEIGHT |
---|---|
cedar elm | 60 |
lacebark elm | 45 |
... and more |
SPC_COMMON | BORONAME |
---|---|
'honeylocust' | 'Brooklyn' |
'honeylocust' | 'Brooklyn' |
'American linden' | 'Brooklyn' |
'London planetree' | 'Manhattan' |
'London planetree' | 'Manhattan' |
... and more |
Not typically supported as a primitive operator,
but useful for expressing queries like:
Find species that appear in all boroughs
++ $$R / S \equiv \{\; \left<\vec t\right> \;|\; \forall \left<\vec s\right> \in S, \left< \vec t \vec s \right> \in R \;\}$$ +
+BORO | SPC_COMMON |
---|---|
Brooklyn | honeylocust |
Brooklyn | American linden |
Brooklyn | London planetree |
Manhattan | honeylocust |
Manhattan | American linden |
Manhattan | pin oak |
Queens | honeylocust |
Queens | American linden |
Bronx | honeylocust |
/ { honeylocust } | = Brooklyn, Manhattan, Queens, Bronx |
/ { honeylocust, American linden } | = Brooklyn, Manhattan, Queens |
/ { honeylocust, American linden, pin oak } | = Manhattan |
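The three results above can be checked with a small sketch that evaluates division straight from its definition. It assumes $R$ is a set of (BORO, SPC_COMMON) pairs and $S$ is a set of species names; `divide` is a hypothetical helper that deliberately does not build the operator out of other RA operators.

```python
# R: the (BORO, SPC_COMMON) table from above.
r = {
    ("Brooklyn", "honeylocust"),
    ("Brooklyn", "American linden"),
    ("Brooklyn", "London planetree"),
    ("Manhattan", "honeylocust"),
    ("Manhattan", "American linden"),
    ("Manhattan", "pin oak"),
    ("Queens", "honeylocust"),
    ("Queens", "American linden"),
    ("Bronx", "honeylocust"),
}


def divide(r, s):
    """R / S: every t such that (t, x) is in r for all x in s."""
    candidates = {t for (t, _) in r}
    return {t for t in candidates if all((t, x) in r for x in s)}


one = divide(r, {"honeylocust"})
two = divide(r, {"honeylocust", "American linden"})
three = divide(r, {"honeylocust", "American linden", "pin oak"})
```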
If time permits: Implement division using other operators.
+
+ A simple way to think about and work with
+ computations over collections.
+
… simple → easy to evaluate
+… simple → easy to optimize
++ Next time, Optimizing RA +