Merge

2024-06-17 10:29:58 -04:00 · 2024-06-17 10:29:58 -04:00 · efd2988601
parent 6d754dc032 ae3d7ff009
commit efd2988601
13 changed files with 54 additions and 2 deletions
--- a/db/cv/okennedy/talks.json
+++ b/db/cv/okennedy/talks.json
@@ -1,4 +1,12 @@
 [
+  { "talk" : "Data Preparation with Vizier", "date" : "Apr. 2024", 
+    "venue" : "New York University" },
+  { "talk" : "Principled management of notebook state in Vizier", "date" : "Apr. 2024", 
+    "venue" : "University of Illinois: Chicago" },
+  { "talk" : "ASTral: A Declarative Compiler Compiler", "date" : "Mar. 2024", 
+    "venue" : "University of Massachusetts: Dartmouth" },
+  { "talk" : "Microkernel Notebooks", "date" : "Feb. 2023", 
+    "venue" : "Cornell University" },
  { "talk" : "Panel: On the Multifaceted Impact of Artificial Intelligence in Healthcare: Past, Present, and Emerging Trends", "date" : "May 2022", 
    "venue" : "UP-STAT 2022" },
  { "talk" : "Caveatting your data: Adding explainability to incomplete datasets", "date" : "Feb 2022", 
--- a/src/news/2024-05-07-HILDA.md
+++ b/src/news/2024-05-07-HILDA.md
@@ -0,0 +1,21 @@
+---
+title: "ODIn @ HILDA '24"
+author: Oliver Kennedy
+---
+
+'Grats to Pratik and Juseung on their #HILDA2024 accept for "Drag, Drop, Merge: A Tool for Streamlining Integration of Longitudinal Survey Instruments", which explores schema integration in longitudinal studies.
+Longitudinal surveys, and specifically social sciences data collected through survey forms, are a really interesting case of schema integration.  
+
+The data being collected is, on the most fundamental level, about only a single class of entity.
+However, each year brings new knowledge and new context to the survey, necessitating changes.
+For example, researchers might learn that the culture of the study population uses different names in different social contexts, necessitating a change to the survey to clarify the social context of the name being recorded.
+Alternatively, researchers might rephrase a question like "how many of your family members live nearby" as "how many people are in your support network" to better capture nuanced situations.
+Even without changes to the survey itself, changing context can result in changing interpretations of participant answers.  
+For example, take a multiple-choice question about income levels.  
+A single answer at the start of a 20-year study may indicate a wildly different socioeconomic status than the exact same answer given in the last year of the study.
+
+The problem of integrating many years of forms is fundamentally similar to data integration, but is in some ways easier (there are few changes between successive years), and in some ways harder (there are *many* such changes over the lifetime of the survey).  Changes are also nuanced, with growing levels of divergence.
+
+The paper lays the groundwork for a tool that helps researchers conducting longitudinal studies prepare their data for publication, and helps researchers using this study data reliably develop derived, 'clean' datasets suited to the needs of their specific study.
+
+**Side Note**: This paper is the result of a massively interdisciplinary collaboration between CS, Linguistics, Medicine, Stats (and soon-to-be Environmental Health).  I'm really excited that we've hit on an opportunity to develop techniques that will benefit such a diverse range of fields of study.
--- a/src/news/2024-05-09-PLDBSummary.md
+++ b/src/news/2024-05-09-PLDBSummary.md
@@ -0,0 +1,10 @@
+---
+title: PL/DB Sp 2024
+author: Oliver Kennedy
+---
+
+With a talk from [Manos Athanassoulis](https://cs-people.bu.edu/mathan/) earlier this week, we've wrapped up another semester of the PL/DB seminar here at UB.  We had a *really* fantastic lineup this year, including five guest speakers ([Jelle Hellings](https://jhellings.nl/), [Hannah Gommerstadt](https://www.cs.vassar.edu/~hgommerstadt/), [Ryan Kavanagh](https://rak.ac/), [Boris Glavic](https://www.cs.uic.edu/~bglavic/dbgroup/members/bglavic.html), and Manos).  
+
+Talks this semester spanned a range of different subjects, from distributed programming models, to indexing and data access methods, query processing, compiler optimization, and provenance.  On the one hand, it's amazing to see such a diverse range of topics represented.  On the other, it was also nifty to see students from across the board engaging with all of the speakers (student or otherwise).
+
+Major props to [Andrew Hirsch](https://akhirsch.science), who is more or less single-handedly responsible for reviving and bringing new life into the seminar.
--- a/src/teaching/cse-350/2024sp/assignments/final.pdf
+++ b/src/teaching/cse-350/2024sp/assignments/final.pdf
--- a/src/teaching/cse-350/2024sp/assignments/midterm.pdf
+++ b/src/teaching/cse-350/2024sp/assignments/midterm.pdf
--- a/src/teaching/cse-350/2024sp/assignments/p3.pdf
+++ b/src/teaching/cse-350/2024sp/assignments/p3.pdf
--- a/src/teaching/cse-350/2024sp/assignments/p4.pdf
+++ b/src/teaching/cse-350/2024sp/assignments/p4.pdf
--- a/src/teaching/cse-350/2024sp/assignments/w3.pdf
+++ b/src/teaching/cse-350/2024sp/assignments/w3.pdf
--- a/src/teaching/cse-350/2024sp/index.erb
+++ b/src/teaching/cse-350/2024sp/index.erb
@@ -93,6 +93,9 @@ schedule:
    detail: The statistical tricks that allow expensive data statistics to be computed using Count-Min and HyperLogLog
    docs:
      slides: slide/15-sketches.html
+      flajolet_martin: papers/flajolet_martin.pdf
+      count: papers/frequent.pdf
+      count_min: papers/count_min.pdf
 deliverables:
  - item: "Project 0: Setup"
    due: Jan 28
@@ -124,20 +127,27 @@ deliverables:
      submit: https://autolab.cse.buffalo.edu/courses/cse410-s24/assessments/P2-B-Trees
  - item: "Written 2: B+ Tree Analysis"
    due: Mar 17
+    links:
+      assignment: assignments/w2.pdf
  - item: "Project 3: Joins"
    due: Apr 9
+    links:
+      assignment: assignments/p3.pdf
  - item: "Written 3: Joins Analysis"
    due: Apr 9
+    links:
+      assignment: assignments/w3.pdf
  - item: "Project 4: MiniDB"
    due: May 5
-  - item: "Written 4: MiniDB Analysis"
-    due: May 12
+    links:
+      assignment: assignments/p4.pdf
 dates:
  - event: Midterm 
    dates: March 4, In Class
    links: 
      review: slide/09-review.pdf
      annotated: slide/09-review-annotated.pdf
+      rubric: assignments/midterm.pdf
  - event: Oliver Traveling, No Class
    dates: March 8
  - event: Spring Break, No Class
@@ -146,6 +156,9 @@ dates:
    dates: April 16
  - event: Final Exam
    dates: May 10, 3:30-6:30
+    links: 
+      review: slide/16-review.pdf
+      rubric: assignments/final.pdf
 ---
 <style>
 p {
--- a/src/teaching/cse-350/2024sp/papers/count_min.pdf
+++ b/src/teaching/cse-350/2024sp/papers/count_min.pdf
--- a/src/teaching/cse-350/2024sp/papers/flajolet_martin.pdf
+++ b/src/teaching/cse-350/2024sp/papers/flajolet_martin.pdf
--- a/src/teaching/cse-350/2024sp/papers/frequent.pdf
+++ b/src/teaching/cse-350/2024sp/papers/frequent.pdf
--- a/src/teaching/cse-350/2024sp/slide/16-review.pdf
+++ b/src/teaching/cse-350/2024sp/slide/16-review.pdf