---
title: CSE 462 - Spring 2016
classContent:
  - date: Jan. 26
    topic: Introduction and Outline
    meta:
      slides: slides/01-IntroAndStructure.pdf
  - date: Jan. 28
    topic: Relational Algebra
    meta:
      slides: slides/02-RelationalAlgebra.pdf
      video: https://www.youtube.com/watch?v=qj67794J8eg
  - date: Feb. 2
    topic: SQL
    meta:
      slides: slides/03-SQL.pdf
      video: https://www.youtube.com/watch?v=tp2X0NrlulY
  - date: Feb. 4
    topic: SQL to RA and Evaluation
    meta:
      slides: slides/04-SQLToRAAndEval.pdf
      video: https://www.youtube.com/watch?v=T-A9pMY-mPk
  - date: Feb. 9
    topic: Project 1 Overview, Generalized RA
    meta:
      slides: slides/05-ExtendedRAProj1.pdf
      video: https://www.youtube.com/watch?v=uB3wuS_Ykdg
  - date: Feb. 11
    topic: Data Layout (Serialization, Paging, Columnar)
    meta:
      slides: slides/06-Storage.pdf
      video: https://www.youtube.com/watch?v=p8YJYruhNfY
  - date: Feb. 16
    topic: Data Organization (Sorting, Tree Indexes)
    meta:
      slides: slides/07-Indexes.pdf
      video: https://www.youtube.com/watch?v=Ai6C4foE5Kk
  - date: Feb. 18
    topic: Hash Indexes, Joins, and Optimization
    meta:
      slides: slides/08-Indexes2.pdf
  - date: Feb. 23
    topic: Joins and Optimization
    meta:
      slides: slides/09-Optimization.pdf
  - date: Feb. 25
    topic: Midterm 1 Review
    meta:
      slides: slides/10-Review.pdf
      video: https://www.youtube.com/watch?v=qMdfQ65XKas
  - date: Mar. 1
    topic: <b>Midterm 1</b>
  - date: Mar. 3
    topic: External Algorithms
    meta:
      slides: slides/11-ExternalSort.pdf
      video: https://youtu.be/yhdCd_ScMWA
  - date: Mar. 8
    topic: Project 2 Review
    meta:
      code: slides/12-Checkpoint2.zip
      video: https://youtu.be/ksV3QM3EQMo
  - date: Mar. 10
    topic: Data Modeling (E/R and Constraints)
    meta:
      slides: slides/13-DataModeling.pdf
      video: https://youtu.be/c5Sw9ryGcqM
  - date: Mar. 15
    topic: <b>Spring Break!</b>
  - date: Mar. 17
    topic: <b>Spring Break!</b>
  - date: Mar. 22
    topic: Cost-Based Optimization
    meta:
      slides: slides/14-CostBasedOptimization.pdf
      video: https://youtu.be/cOHfnBQ2kNw
  - date: Mar. 24
    topic: Transaction Correctness
    meta:
      slides: slides/15-TransactionCorrectness.pdf
      video: https://youtu.be/NK5oeZY6BmA
  - date: Mar. 29
    topic: Transactions-Locking
    meta:
      slides: slides/16-TransactionsAndLocking.pdf
      video: https://youtu.be/XinEfFjNpow
  - date: Mar. 31
    topic: Transactions-OCC, Versioning
    meta:
      slides: slides/17-TransactionOCC.pdf
      video: https://youtu.be/PYiVR90_2Lc
  - date: Apr. 5
    topic: ARIES (Write-Ahead Logging, Undo-Logging, Recovery)
    meta:
      slides: slides/18-Logging.pdf
      video: https://youtu.be/C-yiTS_uPTI
  - date: Apr. 7
    topic: Midterm 2 Review
    meta:
      slides: slides/19-Review.pdf
      video: https://youtu.be/YNXJzjTis14
  - date: Apr. 12
    topic: <b>Midterm 2</b>
  - date: Apr. 14
    topic: Project 3 Review
    meta:
      video: https://youtu.be/mCqN45CsZ9k
  - date: Apr. 19
    topic: Views
    meta:
      slides: slides/21-Views.pdf
      video: https://youtu.be/hUZvLL6UhFk
  - date: Apr. 21
    topic: Stream Queries
    meta:
      slides: slides/22-Streams.pdf
  - date: Apr. 26
    topic: Parallel Data
    meta:
      slides: slides/23-ParallelQueries.pdf
  - date: Apr. 28
    topic: Parallel Joins
    meta:
      slides: slides/24-ParallelJoins.pdf
  - date: May 3
    topic: Parallel Updates
    meta:
      slides: slides/25-ParallelUpdates.pdf
  - date: May 5
    topic: Final Review
---
<h1>CSE-462 Spring 2016</h1>
<p>Data Management Systems (including Relational Databases, Non-Relational Databases, and NoSQL storage systems) are the basis for any big data project.  A data management system is responsible for storing data, enabling efficient access to that data, as well as mediating concurrent modifications.  This class teaches data management systems both in terms of the basic principles of their design, and the practical challenges of implementing them. The course is built around a term-long programming assignment, in which you will build a system for answering SQL queries efficiently.  Course lectures will focus on the conceptual basis for this system and how the techniques that you implement in the project generalize (e.g., to the use of NoSQL systems).</p>
<p>In this course, you will learn...
<ul>
<li>... how to efficiently store and retrieve data programmatically.</li>
<li>... how to optimize big-data computations.</li>
<li>... how to use index structures to accelerate computations.</li>
<li>... how to safely and efficiently manipulate data concurrently.</li>
<li>... how to recover state after software and hardware failures.</li>
<li>... how to query and update distributed data consistently.</li>
</ul>
</p>
<hr/>
<h2>Course Details</h2>
<ul>
<li><b>Class</b>: T/Th, 12:30-1:50 PM in <a href="http://www.buffalo.edu/buildings/building?id=nsc">NSC 220</a></li>
<li><b>Class Forum</b>: <a href="https://piazza.com/buffalo/spring2016/cse462/home">Piazza</a></li>
<li><b>Textbook</b>: "Database Systems: The Complete Book" 2nd Edition<br/> by Garcia-Molina, Ullman, and Widom.</li>
<li><b>Optional Readings</b>:
<ul>
<li>"<a href="https://infosys.uni-saarland.de/datenbankenlernen/Patterns_In_Data_Management_Preview.pdf">Patterns in Data Management</a>"<br/> by Jens Dittrich</li>
<li>"<a href="http://www.redbook.io/">The Red Book: Readings in Databases</a>"<br/> ed. Bailis, Hellerstein, and Stonebraker</li>
</ul></li>
<li><b>Instructor</b>: <a href="https://odin.cse.buffalo.edu/people/oliver_kennedy.html">Oliver Kennedy</a> (Davis 338H, Office Hours Weds 10 AM-11 AM or by appointment; okennedy at buffalo)</li>
<li><b>TA</b>: Jun Chu (TA Lounge; Office Hours Tu 3:30-5:30; jchu6 at buffalo)
<ul>
<li>Recitations: Tuesdays 9 AM, Wednesdays 12 Noon</li>
</ul></li>
<li><b>Project Submission: </b><a href="http://dubstep.odin.cse.buffalo.edu">http://dubstep.odin.cse.buffalo.edu</a></li>
<li><b>Project Groups</b>: 1-4 people</li>
<li><b>Grading</b>:
<ul>
<li>10% Homework (Lowest 2 grades dropped)
<ul><li>Due ~1/week on Thursdays</li></ul></li>
<li>40% Exams<ul>
<li>10% Midterm 1 on <b>March 1</b> (in class)</li>
<li>10% Midterm 2 on <b>April 12</b> (in class)</li>
<li>20% Comprehensive Final on Thu May 14 (4:00-6:30)</li>
<li>OR 5%/10%/25% OR 5%/5%/30% (whichever is most advantageous to you)</li>
</ul></li>
<li>50% Projects<ul>
<li>5% <a title="Checkpoint 0" href="dubstep/checkpoint0.html">Hello World</a> due on Feb. 8, 11:59 PM</li>
<li>15% <a title="Checkpoint 1" href="dubstep/checkpoint1.html">Project 1</a> due on Mar. 7, 11:59 PM</li>
<li>15% <a title="Checkpoint 2" href="dubstep/checkpoint2.html">Project 2</a> due on Mar. 30, 11:59 PM</li>
<li>15% <a title="Checkpoint 3" href="dubstep/checkpoint3.html">Project 3</a> due on May 8, 11:59 PM</li>
</ul></li>
</ul></li>
</ul>
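<p>The "whichever is most advantageous" exam rule above can be sketched as a small calculation. This is an illustration only, not the official grading script: the method name <code>best_exam_points</code> is hypothetical, and it assumes each weighting is listed in Midterm 1 / Midterm 2 / Final order.</p>

```ruby
# Candidate exam weightings, in percentage points of the course grade.
# ASSUMPTION: each triple is ordered (Midterm 1, Midterm 2, Final).
EXAM_WEIGHTS = [
  [10, 10, 20],
  [5, 10, 25],
  [5, 5, 30]
].freeze

# Returns the exam contribution to the course grade (out of 40 points)
# under whichever weighting gives the highest total.
# Each score is a fraction between 0.0 and 1.0.
def best_exam_points(midterm1, midterm2, final)
  scores = [midterm1, midterm2, final]
  totals = EXAM_WEIGHTS.map do |weights|
    weights.zip(scores).sum { |weight, score| weight * score }
  end
  totals.max
end

# A weak pair of midterms followed by a perfect final:
puts best_exam_points(0.5, 0.5, 1.0)  # prints 35.0 (the 5%/5%/30% split)
```

<p>With 50% on both midterms and 100% on the final, the three weightings yield 30, 32.5, and 35 points respectively, so the 5%/5%/30% split would apply.</p>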
<hr/>
<h2>Library Documentation</h2>
<ul>
<li><b>JSqlParser</b> ( <a href="/software/jsqlparser/jsqlparser.jar">binary</a> | <a href="https://github.com/UBOdin/jsqlparser">source</a> | <a href="/software/jsqlparser/">docs</a> | <a href="https://youtu.be/U4TyaHTJ3Zg">demo</a> )</li>
</ul>
<hr/>
<h2>Lecture Schedule</h2>
<ul>
<% classContent.each do |data| %>
<li><i><%=data["date"]%></i>:&nbsp;&nbsp;&nbsp;<%=data["topic"]%>
<%if data.has_key? "meta" %> ( <%= data["meta"].map { |r,url| "<a href=\"#{url}\">#{r}</a>" }.join(" | ") %> )<% end %></li>
<% end %>
</ul>
<hr />
<h2>Content Outline</h2>
<ul>
<li><a title="Checkpoint 0" href="dubstep/checkpoint0.html">Project 0</a> - Basic Setup</li>
<li><a title="Checkpoint 1" href="dubstep/checkpoint1.html">Project 1</a> - Infrastructure &amp; Query Evaluation<ul>
<li><b>Relational Algebra</b> (Ch 2.4, 5.1)</li>
<li><b>SQL</b> (Ch 2.3, 6.1-6.4 and 16.1)</li>
<li><b>Query Compilers</b> (Ch 15.1-15.3, 16.1, 16.3)</li>
<li><b>Data Modeling</b> (Ch 2.1-2.2)</li>
</ul></li>
<li><a title="Checkpoint 2" href="dubstep/checkpoint2.html">Project 2</a> - Optimization &amp; External Algorithms<ul>
<li><b>Algebraic Query Optimization</b> (Ch 16.2)</li>
<li><b>Join Algorithms</b> (Ch 15.4, 15.5)</li>
<li><b>Extended Relational Algebra</b> (Ch 5.2)</li>
<li><b>Buffering &amp; External Algorithms</b> (Ch 15.7-15.8)</li>
<li><b>Physical Plans</b> (Ch 16.7)</li>
</ul></li>
<li><a title="Checkpoint 3" href="dubstep/checkpoint3.html">Project 3</a> - Indexing &amp; Physical Layout<ul>
<li><b>The Memory Hierarchy</b> (Ch 13.1-13.3)</li>
<li><b>Physical Design</b> (Ch 13.5-13.7)</li>
<li><b>Indexing</b> (Ch 8.3, 14.1-14.4)</li>
<li><b>Materialized Views</b> (Ch 8.1-8.2, 8.5)</li>
<li><b>Cost-Based Optimization</b> (Ch 8.4, 16.4-16.6)</li>
</ul></li>
<li>Concepts (No Project)<ul>
<li><b>Failure Recovery</b> (Ch 13.4, 19.1, 19.3)</li>
<li><b>Updating Data</b> (Ch 6.5, 13.8)</li>
<li><b>Transactions</b> (Ch 6.6, 18.1-18.2)</li>
<li><b>Locking</b> (Ch 18.3-18.7)</li>
<li><b>Deadlocks</b> (Ch 19.2)</li>
<li><b>Lock-free Concurrency</b> (Ch 18.8-18.9)</li>
<li><b>Distributed Data Management</b> (Ch 20)</li>
<li><b>Uncertain Data Management</b></li>
<li>Time permitting, other subjects will also be covered.</li>
</ul></li>
</ul>
<hr/>
<h2>Assignment Submission</h2>
<p>Homeworks will be collected in recitation on the day they are due, or by email to the instructor and TA(s) <b>before</b> class begins if you are unable to attend class. Late homeworks will <b>not</b> be accepted and will receive a grade of 0. Note that your lowest 2 homework grades will be dropped.</p>
<p>Projects are submitted through the online submission system using Git. You may submit your assignments to be graded as many times as you like. Late submissions for Checkpoint 0 will not be accepted. Submissions for Checkpoints 1-3 will receive a 3/15 point penalty per day late. Late penalties are per-submission. Your group's grade is the highest of all grades received by your group for the project, so your grade can never decrease from additional submissions (even if they're late). Don't be afraid to experiment with new ideas once you've gotten a grade you're happy with!</p>
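<p>As a rough sketch of how the per-submission late penalty and the "highest grade wins" rule interact for Checkpoints 1-3: the names <code>Submission</code>, <code>penalized_score</code>, and <code>project_grade</code> below are hypothetical, scores are assumed to be out of 15, and lateness is assumed to be counted in whole days with the penalized score never dropping below zero.</p>

```ruby
# Penalty per day late, in points out of 15 (from the policy above).
POINTS_PER_DAY_LATE = 3

# One graded submission: its raw score out of 15 and how many days late
# it arrived (0 for an on-time submission). Whole days are an ASSUMPTION.
Submission = Struct.new(:raw_score, :days_late)

# Score for a single submission after its own late penalty.
# ASSUMPTION: the penalty cannot push a score below zero.
def penalized_score(submission)
  [submission.raw_score - POINTS_PER_DAY_LATE * submission.days_late, 0].max
end

# The group's project grade is the best penalized score across all of
# its submissions, so an extra submission can never lower the grade.
def project_grade(submissions)
  submissions.map { |s| penalized_score(s) }.max || 0
end

history = [
  Submission.new(9, 0),   # on time:      9
  Submission.new(14, 1),  # one day late: 14 - 3 = 11
  Submission.new(15, 3)   # three days:   15 - 9 = 6
]
puts project_grade(history)  # prints 11
```

<p>In this example the one-day-late submission wins: it scores higher after its penalty than the on-time attempt, while the later, better submission has lost too many points to count.</p>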
<hr/>
<h2>Letter Grades</h2>
<table class="table table-striped">
<tr><th>You will get a(n)...</th>
<th>A</th>
<th>A-</th>
<th>B+</th>
<th>B</th>
<th>B-</th>
<th>C+</th>
<th>C</th>
<th>C-</th>
<th>D</th>
<th>F</th>
</tr>
<tr><th>If your number grade is at least...</th>
<th>93</th>
<th>90</th>
<th>87</th>
<th>83</th>
<th>80</th>
<th>77</th>
<th>73</th>
<th>70</th>
<th>60</th>
<th>$-\infty$</th>
</tr>
<tr><th>But less than...</th>
<th>$\infty$</th>
<th>93</th>
<th>90</th>
<th>87</th>
<th>83</th>
<th>80</th>
<th>77</th>
<th>73</th>
<th>70</th>
<th>60</th>
</tr>
</table>
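<p>The table above amounts to a lookup over descending cutoffs. A minimal sketch, with the hypothetical name <code>letter_grade</code> and the assumption that every number grade below 60 maps to an F:</p>

```ruby
# Minimum number grade for each letter, highest cutoff first.
LETTER_CUTOFFS = [
  [93, "A"], [90, "A-"], [87, "B+"], [83, "B"], [80, "B-"],
  [77, "C+"], [73, "C"], [70, "C-"], [60, "D"]
].freeze

# Returns the letter for a number grade: the first cutoff the grade
# meets or exceeds, or "F" if it is below every cutoff (ASSUMPTION).
def letter_grade(number_grade)
  match = LETTER_CUTOFFS.find { |minimum, _letter| number_grade >= minimum }
  match ? match[1] : "F"
end

puts letter_grade(92.9)  # prints A-
```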
<h2>Attendance</h2>
<p>I see attendance as a privilege rather than a requirement: You (or your parents, or your scholarship, etc...) are paying me so that you can interrupt me with a question <strike>if</strike> when I say something stupid or confusing in class. <em>There is no mandatory attendance.</em> However, if you ask me a question by email or during office hours that I already addressed to the class's satisfaction, I reserve the right to tease you mercilessly.</p>
<p>That all being said, I will <i>try</i> to post lecture videos online so that you can catch up on what you missed.</p>
<hr/>
<h2>Academic Integrity</h2>
<p>Students may discuss and advise one another on their projects, but groups are expected to turn in their own work.  Discussing concepts is permitted.  Referencing code (e.g., from another group, or from Stack Overflow or GitHub) is not. When in doubt, ask a TA or the instructor. Violations typically result in an F grade in the course for all students involved.  The Department's policy on academic integrity can be reviewed at:</p>
<p><a href="http://www.cse.buffalo.edu/undergrad/policy_academic.php">UB-CSE's Academic Integrity Policy</a></p>
<hr/>
<h2>Medical Emergencies</h2>
<p>Accommodations for medical emergencies will be made on a case-by-case basis.  Requests for extensions based on medical emergencies must be accompanied by documentation of the emergency from student health services:</p>
<p><a href="http://www.student-affairs.buffalo.edu/shs/student-health/">Student Health Services</a></p>
<hr/>
<h2>Accessibility Resources</h2>
<p>If you have a diagnosed disability (physical, learning, or psychological) that will make it difficult for you to carry out the course work as outlined, or that requires accommodations such as recruiting note-takers, readers, or extended time on exams or assignments, please advise the instructor and contact the Office of Accessibility Resources during the first two weeks of the course so that we may review possible arrangements for reasonable accommodations. The Office of Accessibility Resources can be reached at:</p>
<p><a href="http://www.student-affairs.buffalo.edu/ods/">The Office of Accessibility Resources</a> |
(716) 645-2608 | <a href="mailto:accessibility@buffalo.edu">accessibility@buffalo.edu</a></p>
<p>Note that OAR gets extremely busy towards the end of the term. If you anticipate needing special testing accommodations, please make sure to contact them 2-4 weeks in advance.</p>