How to Start Collaborating with CSE to Solve Global Health Problems

Oliver Kennedy

  1. CSE: Research vs Implementation?
  2. Research Highlights.
  3. Behind the Buzzwords.
  4. CSE Resources for Your Benefit.

Computer Science & Engineering

  • How hard is a particular type of problem to solve?
  • Can a particular solution be scaled to bigger problems?
  • How do I solve problems that fit a particular pattern?

Abstraction

CSE research is about creating general solutions

(motivated by specific problems)

CSE Publication

How does the new solution differ? Is it...

  • ... more general?
  • ... more efficient/reliable?
  • ... more scalable?
  • ... easier to use?

Do you need research or implementation?

UB-CSE can help with both,
but each takes a different approach.

Have clear, well-defined parameters for the problem.

Typical Implementation Problems

  • Organizing data into a database.
  • A mobile app to display information.
  • Making an R script run faster.

Implementation Resources

CSE-611

Undergraduate Research

Invenst

But...

Implementation often inspires research topics.
(Ailamaki's 7 month rule)

Implementation may synergize with existing research.
(e.g., Motivating use cases)

(so talking to a friendly CSE faculty member can still be useful)

Research Highlights

  • Reproducible Datasets (Oliver Kennedy)
  • Wireless Sensor Networks (Chang Wen Chen)

Reproducible Datasets

Oliver Kennedy

Data Errors Suck

https://xkcd.com/2239/


Assumption Assumption

freesvg.org
© 20th Century Fox

The Vizier Notebook

If you're using Python, R, SQL, or Jupyter, we can help...

  • ...improve your dataset documentation
  • ...make your workflows more reproducible
  • ...make your code faster

Talk to me afterwards!

Wireless Sensor Networks

Chang Wen Chen

IoT Devices Need To Communicate

  • Cellular or LoRa Networks: Each device talks to a tower
    (reliable, but requires infrastructure)
  • Store and Collect: Each device stores its data and is later physically recovered
    (no infrastructure, but requires physical visits)
  • Mesh Networks: Each device communicates via nearby devices
    (bandwidth/power limited)

UB-CSE is at the forefront of wireless research

Application: Monitoring excessive antibiotic discharge into the Missouri River, where overused agricultural antibiotics end up as runoff

Problem: A 1D "mesh" is even more limited
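
One way to see why a 1D layout is so constrained is a back-of-the-envelope model. The sketch below is an illustration only, under assumptions that are not from the talk: devices sit in a single chain, each can reach only its immediate neighbours, the collection point (the "sink") is at one end, and every device produces one reading per round. The function name `relay_load_1d` is hypothetical.

```python
def relay_load_1d(num_devices):
    """Packets each device must transmit per round in a chain.

    Index 0 is the device next to the sink. Assumes (hypothetically) that
    every device produces one reading per round and can only hand packets
    to its immediate neighbour on the sink side.
    """
    # Device i forwards its own reading plus one from every device
    # farther from the sink than itself.
    return [num_devices - i for i in range(num_devices)]


loads = relay_load_1d(5)   # [5, 4, 3, 2, 1]
total = sum(loads)         # total transmissions per round across the chain
print(loads, total)
```

Under these assumptions the device nearest the sink carries every reading in the network, so its bandwidth and battery bound the whole deployment. A 2D mesh can spread that load over several paths; a chain cannot.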

Buzzwords

Neural Networks / Deep Learning

Linear Regression


Spline Fitting


Graphical Models


Neural Networks

$y=f(x)$ where $f$ has 100s or 1000s (or more) degrees of freedom (DoF)

The Good
• Feasible to fit very complex functions (e.g., recognizing a face)
• Minimal knowledge of problem structure required.
The Bad
• Need huge amounts of training data
• Very easy to overfit
• Not explainable (yet)
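
The "very easy to overfit" point shows up in any model with many degrees of freedom, not just neural networks. The sketch below swaps in polynomial fitting as a stand-in: a degree-1 and a degree-9 polynomial are fit to the same ten noisy points. The data, the noise level, and the helper `fit_and_score` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def truth(x):
    # Hypothetical ground truth: a plain straight line.
    return 2.0 * x + 1.0


# Ten noisy training points, plus fresh noisy points to test on.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = truth(x_train) + rng.normal(scale=0.3, size=x_train.size)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = truth(x_test) + rng.normal(scale=0.3, size=x_test.size)


def fit_and_score(degree):
    """Fit a polynomial with degree + 1 free parameters; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse


small_train, small_test = fit_and_score(1)   # 2 DoF
big_train, big_test = fit_and_score(9)       # 10 DoF, one per training point

print(f"degree 1: train={small_train:.4f} test={small_test:.4f}")
print(f"degree 9: train={big_train:.4f} test={big_test:.4f}")
```

With as many parameters as training points, the degree-9 fit passes through every point, so its training error is essentially zero even though it has only memorized the noise. The error on fresh data is the number to watch.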

Differential Privacy

A way to mathematically prove to yourself how much PII could leak if an aggregate dataset is released.
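
As a rough illustration, one standard way to get this kind of guarantee is the Laplace mechanism: add calibrated random noise to an aggregate before releasing it. The talk does not name a mechanism, so treat the choice as an assumption; the ages, the `epsilon` value, and the helper `private_count` below are likewise made up.

```python
import numpy as np


def private_count(values, predicate, epsilon, rng):
    """Release a noisy count of the values matching predicate (Laplace mechanism)."""
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one individual changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise


rng = np.random.default_rng(42)
ages = [34, 51, 29, 62, 45, 70, 38]  # hypothetical records, one per person

# Smaller epsilon means more noise: stronger privacy, less accuracy.
with_everyone = private_count(ages, lambda a: a >= 50, epsilon=0.5, rng=rng)
without_one = private_count(ages[:-1], lambda a: a >= 50, epsilon=0.5, rng=rng)
print(with_everyone, without_one)
```

Because the noise is comparable in size to one person's contribution, the released numbers look statistically similar whether or not any single individual is in the data. The parameter `epsilon` quantifies how similar.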

"Can I create a statistically significant effect on the dataset by removing N individuals?"

Blockchain

https://xkcd.com/2267/

Oliver's Blockchain PSA

  • It is statistically unlikely that you need a blockchain.
    (I've yet to see a use case apart from cryptocurrency that needs a blockchain)
  • Proof of work is a huge power sink.
    (Bitcoin alone estimated at 77 TWh this year ~= the power consumption of Chile)

Please consult a Doctor (of Philosophy in CS)
before starting a blockchain project

Resources

NSF CS+X Programs

  • CSSI: Cyberinfrastructure for Sustained Scientific Innovation
  • SCH: Smart and Connected Health
  • SCC: Smart and Connected Communities