diff --git a/src/teaching/cse-562/2021sp/index.erb b/src/teaching/cse-562/2021sp/index.erb
index 44de50ae..399edc0a 100644
--- a/src/teaching/cse-562/2021sp/index.erb
+++ b/src/teaching/cse-562/2021sp/index.erb
@@ -87,7 +87,7 @@ In this course, you will learn...
    • TAs:
  - • Course Discussions: Canvas
  + • Course Discussions: Piazza
    • No Required Textbook
    • Optional References:
    Practicum (50% of Grade)
    @@ -294,19 +328,29 @@ textbook: "Ch. 1, 2.1-2.2"
    - + +
    -

    I've torn the guts out of Apache Spark.
    - Your mission: Replace them (sort of).

    -
    @@ -389,7 +437,7 @@ textbook: "Ch. 1, 2.1-2.2"
    @@ -474,6 +522,10 @@ textbook: "Ch. 1, 2.1-2.2"
    +
    +
    +
    +
    @@ -505,17 +557,21 @@ textbook: "Ch. 1, 2.1-2.2"

    Your data is currently an Unordered Set
    - of Tuples with 100 fields each.
    + of Tuples with 100 attributes each.

    Tomorrow, you’ll be repeatedly asked for 1 specific attribute
    - of 5 specific rows identified by the first attribute
    + from 5 specific tuples identified by the first attribute

    Can you do better?

    +
    + <%= sli_do_link %> +
    +

    Better Idea: Rewrite data into a 99-Tuple of Maps keyed on the 1st attribute

    This representation is equivalent and better for your needs.
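The rewrite above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample tuples are made up, they have 4 attributes rather than 100 (so the result is a 3-tuple of maps rather than a 99-tuple), and the first attribute is assumed to be unique. The helper names `to_column_maps`, `from_column_maps`, and `lookup` are hypothetical.

```python
# Hypothetical data: a non-empty, unordered set of tuples whose first
# attribute is assumed to be a unique key.
rows = {
    (1, "a1", "b1", "c1"),
    (2, "a2", "b2", "c2"),
    (3, "a3", "b3", "c3"),
}

def to_column_maps(tuples):
    """Rewrite n-attribute tuples into n-1 maps keyed on the first attribute."""
    width = len(next(iter(tuples)))
    maps = tuple({} for _ in range(width - 1))
    for t in tuples:
        key = t[0]
        for i in range(1, width):
            maps[i - 1][key] = t[i]
    return maps

def from_column_maps(maps):
    """Invert the rewrite, which is what makes the two forms equivalent."""
    return {(key,) + tuple(m[key] for m in maps) for key in maps[0]}

column_maps = to_column_maps(rows)

def lookup(attribute_position, keys):
    """Fetch one attribute (position 1 = first non-key attribute) for some keys."""
    return [column_maps[attribute_position - 1][k] for k in keys]

wanted = lookup(2, [1, 3])
```

Each lookup touches one map and only the requested keys, instead of scanning every attribute of every tuple.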

    @@ -523,3 +579,248 @@ textbook: "Ch. 1, 2.1-2.2"

    Declarative specifications make it easier to find equivalences.

    + + +
    +
    +

    Declarative Languages

    + +
      +
    • Don't need to think about algorithms.
    • Independent of the data representation.
    +
    + +
    +

    SQL

    +
      +
    • Developed by IBM (for System R) in the 1970s.
    • Standard used by many vendors.
      • SQL-86 (original standard)
      • SQL-89 (minor revisions; integrity constraints)
      • SQL-92 (major revision; basis for modern SQL)
      • SQL-99 (XML, window queries, generated default values)
      • SQL 2003 (major revisions to XML support)
      • SQL 2008 (minor extensions)
      • SQL 2011 (minor extensions; temporal databases)
      +
    +
    + +
    +

    A Basic SQL Query

    + +
    + +
    +
    
    +            SELECT  [DISTINCT] targetlist
    +            FROM    relationlist
    +            WHERE   condition
    +    
    +
      +
    1. Compute all combinations of tuples (the cross product) of the relations appearing in relationlist
    2. Discard tuples that fail the condition
    3. Delete attributes not in targetlist
    4. If DISTINCT is specified, eliminate duplicate rows
    +

    This is the least efficient strategy to compute a query! A good optimizer will find more efficient strategies to compute the same answer.
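The conceptual evaluation strategy above can be written out directly. The following is a rough sketch, not how any real engine works: relations are modeled as lists of dicts, the function name `naive_select` and its parameters are hypothetical, and the sample data (including which species smells) is invented for illustration.

```python
from itertools import product

def naive_select(relations, condition, targets, distinct=False):
    """Evaluate SELECT-FROM-WHERE the slow, conceptual way.

    relations: dict mapping relation name -> list of dict rows
    condition: function taking {relation name: row} and returning a bool
    targets:   list of (relation name, attribute) pairs to keep
    """
    names = list(relations)
    result = []
    # Step 1: every combination of one tuple from each relation.
    for combo in product(*(relations[n] for n in names)):
        env = dict(zip(names, combo))
        # Step 2: discard combinations that fail the condition.
        if not condition(env):
            continue
        # Step 3: keep only the attributes in the target list.
        result.append(tuple(env[rel][attr] for rel, attr in targets))
    # Step 4: eliminate duplicates if DISTINCT was requested.
    if distinct:
        result = list(dict.fromkeys(result))
    return result

# Made-up sample data.
rels = {
    "Trees": [
        {"tree_id": 1, "spc_common": "ginkgo"},
        {"tree_id": 2, "spc_common": "red maple"},
        {"tree_id": 3, "spc_common": "ginkgo"},
    ],
    "SpeciesInfo": [
        {"name": "ginkgo", "has_unpleasant_smell": "Yes"},
        {"name": "red maple", "has_unpleasant_smell": "No"},
    ],
}

def cond(env):
    return (env["Trees"]["spc_common"] == env["SpeciesInfo"]["name"]
            and env["SpeciesInfo"]["has_unpleasant_smell"] == "Yes")

with_duplicates = naive_select(rels, cond, [("Trees", "spc_common")])
without_duplicates = naive_select(rels, cond, [("Trees", "spc_common")], distinct=True)
```

The loop visits 3 × 2 = 6 combinations to produce two matches, which is why the number of combinations grows with the product of the relation sizes.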

    +
    + +
    +

    Example Data

    + +
    + +
    +
    SELECT * FROM Trees;
    + +

    Wildcards (*, tablename.*) are special targets that select all attributes.

    + +
    + + + + + + + + +
    CREATED_AT | TREE_ID | BLOCK_ID | THE_GEOM | TREE_DBH | STUMP_DIAM | CURB_LOC | STATUS | HEALTH | SPC_LATIN | SPC_COMMON | STEWARD | GUARDS | SIDEWALK | USER_TYPE | PROBLEMS | ROOT_STONE | ROOT_GRATE | ROOT_OTHER | TRNK_WIRE | TRNK_LIGHT | TRNK_OTHER | BRNCH_LIGH | BRNCH_SHOE | BRNCH_OTHE | ADDRESS | ZIPCODE | ZIP_CITY | CB_NUM | BOROCODE | BORONAME | CNCLDIST | ST_ASSEM | ST_SENATE | NTA | NTA_NAME | BORO_CT | STATE | LATITUDE | LONGITUDE | X_SP | Y_SP
    '08/27/2015' | 180683 | 348711 | 'POINT (-73.84421521958048 40.723091773924274)' | 3 | 0 | 'OnCurb' | 'Alive' | 'Fair' | 'Acer rubrum' | 'red maple' | 'None' | 'None' | 'NoDamage' | 'TreesCount Staff' | 'None' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '108-005 70 AVENUE' | '11375' | 'Forest Hills' | 406 | 4 | 'Queens' | 29 | 28 | 16 | 'QN17' | 'Forest Hills' | 4073900 | 'New York' | 40.72309177 | -73.84421522 | 1027431.14821 | 202756.768749
    '09/03/2015' | 200540 | 315986 | 'POINT (-73.81867945834878 40.79411066708779)' | 21 | 0 | 'OnCurb' | 'Alive' | 'Fair' | 'Quercus palustris' | 'pin oak' | 'None' | 'None' | 'Damage' | 'TreesCount Staff' | 'Stones' | 'Yes' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '147-074 7 AVENUE' | '11357' | 'Whitestone' | 407 | 4 | 'Queens' | 19 | 27 | 11 | 'QN49' | 'Whitestone' | 4097300 | 'New York' | 40.79411067 | -73.81867946 | 1034455.70109 | 228644.837379
    '09/05/2015' | 204026 | 218365 | 'POINT (-73.93660770459083 40.717580740099116)' | 3 | 0 | 'OnCurb' | 'Alive' | 'Good' | 'Gleditsia triacanthos var. inermis' | 'honeylocust' | '1or2' | 'None' | 'Damage' | 'Volunteer' | 'None' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '390 MORGAN AVENUE' | '11211' | 'Brooklyn' | 301 | 3 | 'Brooklyn' | 34 | 50 | 18 | 'BK90' | 'East Williamsburg' | 3044900 | 'New York' | 40.71758074 | -73.9366077 | 1001822.83131 | 200716.891267
    '09/05/2015' | 204337 | 217969 | 'POINT (-73.93445615919741 40.713537494833226)' | 10 | 0 | 'OnCurb' | 'Alive' | 'Good' | 'Gleditsia triacanthos var. inermis' | 'honeylocust' | 'None' | 'None' | 'Damage' | 'Volunteer' | 'Stones' | 'Yes' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '1027 GRAND STREET' | '11211' | 'Brooklyn' | 301 | 3 | 'Brooklyn' | 34 | 53 | 18 | 'BK90' | 'East Williamsburg' | 3044900 | 'New York' | 40.71353749 | -73.93445616 | 1002420.35833 | 199244.253136
    '08/30/2015' | 189565 | 223043 | 'POINT (-73.97597938483258 40.66677775537875)' | 21 | 0 | 'OnCurb' | 'Alive' | 'Good' | 'Tilia americana' | 'American linden' | 'None' | 'None' | 'Damage' | 'Volunteer' | 'Stones' | 'Yes' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '603 6 STREET' | '11215' | 'Brooklyn' | 306 | 3 | 'Brooklyn' | 39 | 44 | 21 | 'BK37' | 'Park Slope-Gowanus' | 3016500 | 'New York' | 40.66677776 | -73.97597938 | 990913.775046 | 182202.425999
    ... and 683783 more
    +
    +
    + +
    +
    
    +            SELECT tree_id, spc_common, boroname
    +            FROM Trees
    +            WHERE boroname = 'Brooklyn'
    +    
    + +

    In English, what does this query compute?

    + + <%= sli_do_link_small %> +
    + +
    +

    What is the ID, Common Name and Borough of Trees in Brooklyn?

    + + + + + + + + + +
    TREE_ID | SPC_COMMON | BORONAME
    204026 | 'honeylocust' | 'Brooklyn'
    204337 | 'honeylocust' | 'Brooklyn'
    189565 | 'American linden' | 'Brooklyn'
    192755 | 'London planetree' | 'Brooklyn'
    189465 | 'London planetree' | 'Brooklyn'
    ... and 177287 more
    +
    + +
    +
    
    +      SELECT latitude, longitude 
    +      FROM Trees, SpeciesInfo
    +      WHERE Trees.spc_common = SpeciesInfo.name
    +        AND SpeciesInfo.has_unpleasant_smell = 'Yes';
    +    
    + +

    In English, what does this query compute?

    + + <%= sli_do_link_small %> + +
    + +
    +

    What are the coordinates of Trees with bad smells?

    + + + + + + + + + +
    LATITUDE | LONGITUDE
    40.59378755 | -73.9915968
    40.69149917 | -73.97258754
    40.74829709 | -73.98065645
    40.68767857 | -73.96764605
    40.739991 | -73.86526993
    ... and more
    +
    + +
    +
    
    +      SELECT Trees.latitude, Trees.longitude 
    +      FROM Trees, SpeciesInfo
    +      WHERE Trees.spc_common = SpeciesInfo.name
    +        AND SpeciesInfo.has_unpleasant_smell = 'Yes';
    +    
    + +

    ... is the same as ...

    + +
    
    +      SELECT T.latitude, T.longitude 
    +      FROM Trees T, SpeciesInfo S
    +      WHERE T.spc_common = S.name
    +        AND S.has_unpleasant_smell = 'Yes';
    +    
    + +

    ... is (usually) the same as ...

    + +
    
    +      SELECT latitude, longitude 
    +      FROM Trees, SpeciesInfo
    +      WHERE spc_common = name
    +        AND has_unpleasant_smell = 'Yes';
    +    
    + +
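The three spellings above can be compared on a toy database. This sketch uses Python's standard `sqlite3` module with an in-memory database; the two-row tables are invented stand-ins for the real Trees and SpeciesInfo data, and the last step shows one reason for the "(usually)": once both tables have a column with the same name, the unqualified form becomes ambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trees (tree_id INTEGER, spc_common TEXT, latitude REAL, longitude REAL);
    CREATE TABLE SpeciesInfo (name TEXT, has_unpleasant_smell TEXT);
    INSERT INTO Trees VALUES (1, 'ginkgo', 40.7, -73.9), (2, 'red maple', 40.6, -74.0);
    INSERT INTO SpeciesInfo VALUES ('ginkgo', 'Yes'), ('red maple', 'No');
""")

# Attributes qualified with the full relation name.
qualified = conn.execute("""
    SELECT Trees.latitude, Trees.longitude
    FROM Trees, SpeciesInfo
    WHERE Trees.spc_common = SpeciesInfo.name
      AND SpeciesInfo.has_unpleasant_smell = 'Yes'
""").fetchall()

# Attributes qualified with range variables T and S.
aliased = conn.execute("""
    SELECT T.latitude, T.longitude
    FROM Trees T, SpeciesInfo S
    WHERE T.spc_common = S.name
      AND S.has_unpleasant_smell = 'Yes'
""").fetchall()

# Unqualified attributes: fine while every name appears in only one relation.
unqualified = conn.execute("""
    SELECT latitude, longitude
    FROM Trees, SpeciesInfo
    WHERE spc_common = name
      AND has_unpleasant_smell = 'Yes'
""").fetchall()

# Give SpeciesInfo its own latitude column; the bare name is now ambiguous.
conn.execute("ALTER TABLE SpeciesInfo ADD COLUMN latitude REAL")
try:
    conn.execute("SELECT latitude FROM Trees, SpeciesInfo")
    ambiguous_rejected = False
except sqlite3.OperationalError:
    ambiguous_rejected = True
```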
    + +
    +

    Expressions

    + +
    
    +            SELECT tree_id, 
    +                   stump_diam / 2 AS stump_radius,
    +                   stump_area = 3.14 * stump_diam * stump_diam / 4
    +            FROM Trees;
    +    
    + +

    Arithmetic expressions can appear in targets or conditions. Use ‘=’ or ‘AS’ to assign names to these attributes. (The behavior of unnamed attributes is unspecified.)
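A small runnable sketch of expressions in both the target list and the condition, again using Python's standard `sqlite3` module on invented data. It names the computed attributes with AS only, on the assumption that AS is the more portable of the two forms. It also divides by `2.0` rather than `2`, because SQLite performs integer division when both operands are integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trees (tree_id INTEGER, stump_diam INTEGER);
    INSERT INTO Trees VALUES (1, 0), (2, 10);
""")

cur = conn.execute("""
    SELECT tree_id,
           stump_diam / 2.0 AS stump_radius,
           3.14 * stump_diam * stump_diam / 4 AS stump_area
    FROM Trees
    WHERE stump_diam * 2 > 10
""")

# cursor.description carries the attribute names assigned with AS.
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()
```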

    +
    + +
    +

    Expressions

    + +
    
    +  SELECT tree_id, spc_common FROM Trees WHERE spc_common LIKE '%maple'
    +    
    + + + + + + + +
    TREE_ID | SPC_COMMON
    180683 | 'red maple'
    204325 | 'sycamore maple'
    205044 | 'Amur maple'
    184031 | 'red maple'
    208974 | 'red maple'
    +

    SQL uses single quotes for ‘string literals’

    +

    LIKE is used for String Matches

    +

    ‘%’ matches 0 or more characters
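Where the ‘%’ sits in the pattern determines what matches. The sketch below uses Python's standard `sqlite3` module with a handful of invented species names to contrast suffix, prefix, and substring matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trees (tree_id INTEGER, spc_common TEXT);
    INSERT INTO Trees VALUES
        (1, 'red maple'),
        (2, 'sycamore maple'),
        (3, 'maple'),
        (4, 'pin oak'),
        (5, 'mapleleaf viburnum');
""")

def matching_ids(pattern):
    """Return the tree_ids whose spc_common matches the LIKE pattern."""
    query = "SELECT tree_id FROM Trees WHERE spc_common LIKE ? ORDER BY tree_id"
    return [row[0] for row in conn.execute(query, (pattern,))]

ends_with = matching_ids("%maple")    # '%' may match zero characters: 'maple' counts
starts_with = matching_ids("maple%")
contains = matching_ids("%maple%")
```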

    +
    + +
    +

    Union

    +
    
    +    SELECT tree_id FROM Trees WHERE spc_common = 'red maple'
    +    UNION [ALL]
    +    SELECT tree_id FROM Trees WHERE spc_common = 'sycamore maple'
    +    
    +

    Computes the set-union of any two union-compatible sets of tuples

    +

    Adding ALL preserves duplicates across the inputs (bag-union).
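The difference between set-union and bag-union only shows up when the inputs share a value. A minimal sketch with Python's standard `sqlite3` module and invented rows, projecting on `spc_common` so that the two inputs overlap:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trees (tree_id INTEGER, spc_common TEXT, boroname TEXT);
    INSERT INTO Trees VALUES
        (1, 'red maple', 'Brooklyn'),
        (2, 'red maple', 'Queens'),
        (3, 'pin oak',   'Brooklyn');
""")

# UNION: 'red maple' appears in both inputs but only once in the output.
set_union = conn.execute("""
    SELECT spc_common FROM Trees WHERE boroname = 'Brooklyn'
    UNION
    SELECT spc_common FROM Trees WHERE boroname = 'Queens'
""").fetchall()

# UNION ALL: every input row is kept, duplicates included.
bag_union = conn.execute("""
    SELECT spc_common FROM Trees WHERE boroname = 'Brooklyn'
    UNION ALL
    SELECT spc_common FROM Trees WHERE boroname = 'Queens'
""").fetchall()
```

Both inputs have the same single text attribute, which is what makes them union-compatible here.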

    +
    + +
    +

    Aggregate Queries

    +
    
    +    SELECT [DISTINCT] targetlist
    +    FROM relationlist
    +    WHERE condition
    +    GROUP BY groupinglist
    +    HAVING groupcondition
    +    
    +
    +

    The targetlist now contains (a) Grouped attributes, and (b) Aggregate expressions.

    +

    Targets of type (a) must be a subset of the grouping-list

    +

    (intuitively each answer tuple corresponds to a single group, and each group must have a single value for each attribute)

    +

    groupcondition is applied after aggregation and may contain aggregate expressions.
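The order of operations described above (WHERE filters tuples, GROUP BY forms groups, HAVING filters groups) can be checked on a toy table. This is a sketch using Python's standard `sqlite3` module; the six rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trees (tree_id INTEGER, spc_common TEXT, boroname TEXT);
    INSERT INTO Trees VALUES
        (1, 'red maple', 'Brooklyn'),
        (2, 'red maple', 'Queens'),
        (3, 'red maple', 'Brooklyn'),
        (4, 'pin oak',   'Brooklyn'),
        (5, 'ginkgo',    'Queens'),
        (6, 'ginkgo',    'Queens');
""")

# One answer tuple per group: the grouped attribute plus an aggregate.
counts = conn.execute("""
    SELECT spc_common, COUNT(*)
    FROM Trees
    GROUP BY spc_common
    ORDER BY spc_common
""").fetchall()

# WHERE runs before grouping (only Brooklyn trees are counted);
# HAVING runs after aggregation (only groups with 2+ trees survive).
frequent_in_brooklyn = conn.execute("""
    SELECT spc_common, COUNT(*)
    FROM Trees
    WHERE boroname = 'Brooklyn'
    GROUP BY spc_common
    HAVING COUNT(*) >= 2
    ORDER BY spc_common
""").fetchall()
```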

    +
    +
    + +
    +

    Aggregate Queries

    +
    
    +    SELECT spc_common, count(*) FROM Trees GROUP BY spc_common
    +    
    + + + + + + + + +
    SPC_COMMON COUNT
    ''Schubert' chokecherry' 4888
    'American beech' 273
    'American elm' 7975
    'American hophornbeam' 1081
    'American hornbeam' 1517
    ... and more
    + +
    + +
    + + +
    +

    Next time...

    + +

    Scala for Java programmers.

    +
diff --git a/src/teaching/cse-562/2021sp/slide/graphics/2021-02-02-parts_of_sql.svg b/src/teaching/cse-562/2021sp/slide/graphics/2021-02-02-parts_of_sql.svg
new file mode 100644
index 00000000..691a0702
--- /dev/null
+++ b/src/teaching/cse-562/2021sp/slide/graphics/2021-02-02-parts_of_sql.svg
@@ -0,0 +1,277 @@
+ image/svg+xml
+ SELECT [DISTINCT] target-list
+ FROM relation-list
+ WHERE condition
+ A list of relation names (possibly with a range-variable after each name)
+ A list of attributes of relations in relation-list
+ Comparisons (‘=’, ‘<>’, ‘<’, ‘>’, ‘<=’, ‘>=’) and other boolean predicates, combined using AND, OR, and NOT (a boolean formula)
+ (optional) keyword indicating that the answer should not contain duplicates
diff --git a/src/teaching/cse-562/2021sp/slide/graphics/slido.png b/src/teaching/cse-562/2021sp/slide/graphics/slido.png
new file mode 100644
index 00000000..74e1b953
Binary files /dev/null and b/src/teaching/cse-562/2021sp/slide/graphics/slido.png differ
diff --git a/src/teaching/cse-562/2021sp/slide/ubodin.css b/src/teaching/cse-562/2021sp/slide/ubodin.css
index 70860477..48b79632 100644
--- a/src/teaching/cse-562/2021sp/slide/ubodin.css
+++ b/src/teaching/cse-562/2021sp/slide/ubodin.css
@@ -2,37 +2,37 @@
   font-family: 'News Cycle';
   font-style: normal;
   font-weight: 400;
-  src: local('News Cycle'), local('NewsCycle'), url(../reveal.js-3.1.0/fonts/9Xe8dq6pQDsPyVH2D3tMQsDdSZkkecOE1hvV7ZHvhyU.ttf) format('truetype');
+  src: local('News Cycle'), local('NewsCycle'), url(../../../slide/reveal.js-3.1.0/fonts/9Xe8dq6pQDsPyVH2D3tMQsDdSZkkecOE1hvV7ZHvhyU.ttf) format('truetype');
 }
 @font-face {
   font-family: 'News Cycle';
   font-style: normal;
   font-weight: 700;
-  src: local('News Cycle Bold'), local('NewsCycle-Bold'), url(../reveal.js-3.1.0/fonts/G28Ny31cr5orMqEQy6ljt8BaWKZ57bY3RXgXH6dOjZ0.ttf) format('truetype');
+  src: local('News Cycle Bold'), local('NewsCycle-Bold'), url(../../../slide/reveal.js-3.1.0/fonts/G28Ny31cr5orMqEQy6ljt8BaWKZ57bY3RXgXH6dOjZ0.ttf) format('truetype');
 }
 @font-face {
   font-family: 'Lato';
   font-style: normal;
   font-weight: 400;
-  src: local('Lato Regular'), local('Lato-Regular'), url(../reveal.js-3.1.0/fonts/1EqTbJWOZQBfhZ0e3RL9uvesZW2xOQ-xsNqO47m55DA.ttf) format('truetype');
+  src: local('Lato Regular'), local('Lato-Regular'), url(../../../slide/reveal.js-3.1.0/fonts/1EqTbJWOZQBfhZ0e3RL9uvesZW2xOQ-xsNqO47m55DA.ttf) format('truetype');
 }
 @font-face {
   font-family: 'Lato';
   font-style: normal;
   font-weight: 700;
-  src: local('Lato Bold'), local('Lato-Bold'), url(../reveal.js-3.1.0/fonts/MZ1aViPqjfvZwVD_tzjjkwLUuEpTyoUstqEm5AMlJo4.ttf) format('truetype');
+  src: local('Lato Bold'), local('Lato-Bold'), url(../../../slide/reveal.js-3.1.0/fonts/MZ1aViPqjfvZwVD_tzjjkwLUuEpTyoUstqEm5AMlJo4.ttf) format('truetype');
 }
 @font-face {
   font-family: 'Lato';
   font-style: italic;
   font-weight: 400;
-  src: local('Lato Italic'), local('Lato-Italic'), url(../reveal.js-3.1.0/fonts/61V2bQZoWB5DkWAUJStypevvDin1pK8aKteLpeZ5c0A.ttf) format('truetype');
+  src: local('Lato Italic'), local('Lato-Italic'), url(../../../slide/reveal.js-3.1.0/fonts/61V2bQZoWB5DkWAUJStypevvDin1pK8aKteLpeZ5c0A.ttf) format('truetype');
 }
 @font-face {
   font-family: 'Lato';
   font-style: italic;
   font-weight: 700;
-  src: local('Lato Bold Italic'), local('Lato-BoldItalic'), url(../reveal.js-3.1.0/fonts/HkF_qI1x_noxlxhrhMQYECZ2oysoEQEeKwjgmXLRnTc.ttf) format('truetype');
+  src: local('Lato Bold Italic'), local('Lato-BoldItalic'), url(../../../slide/reveal.js-3.1.0/fonts/HkF_qI1x_noxlxhrhMQYECZ2oysoEQEeKwjgmXLRnTc.ttf) format('truetype');
 }