---
title: CIDR Recap
projects:
- mimir
author: William Spoth
---
How big is BIG and how fast is FAST? This seemed to be a recurring theme of the
CIDR 2017 conference. A general consensus, and a major point of many
presentations, was that RDBMSs were the king of scaling to large data twenty
years ago but, for some inexplicable reason, have lost ground to the
ever-changing scope of BIG and FAST. Multiple papers attempted to address this
problem in different ways, adding to the many tools already on the market for
data stream processing and large calculations, such as Spark, but there seemed
to be no silver bullet. To add to the theme that big data is too big, keynote
talks by Emily Galt and Sam Madden drove this point home and offered different
real-world scenarios and outlooks on the problem.

To break this theme apart, I’ll split the papers into groups and explain the
different outlooks the authors took and how they addressed this common problem.

The papers ‘Prioritizing Attention in Analytic Monitoring’, ‘The Myria Big Data
Management and Analytics System and Cloud Services’, ‘Weld: A Common Runtime
for High Performance Data Analytics’, ‘A Database System with Amnesia’, and
‘Releasing Cloud Databases from the Chains of Performance Prediction Models’
focused on the theme that databases are not keeping pace with the rate at which
data is growing. Sam Madden brought up an interesting point: hardware
components like the bus are not the bottleneck in this system. With advances in
big data computing like Apache Spark, it feels like RDBMSs are the end of the
line, where data goes to die. These papers looked at different ways of
addressing this. ‘A Database System with Amnesia’ looked at throwing out unused
data, since most data in an RDBMS gets put in and never used again. With the
increasing use of data streams, the problem of not being able to process and
store this data fast enough only becomes more pronounced.

The second common problem is that even if you can efficiently store and query
your data lakes, humans often lack the ability to create queries efficiently or
the necessary insight into how the data is formatted. The papers ‘The Data
Civilizer System’, ‘Establishing Common Ground with Data Context’, ‘Adaptive
Schema Databases’, and ‘Combining Design and Performance in a Data
Visualization Management System’ all try to address this problem, but from
slightly different angles. Data Civilizer and Adaptive Schema Databases look at
aiding an analyst in schema and table exploration, helping them discover
unknown or desired qualities of their data source. These papers approach user
insight in a way that would otherwise exist only as internal middleware in
large companies. The problem is that big data and messy data lakes are becoming
more and more prevalent for other users as well. Medium-sized businesses can be
buried in data following user surges or new product upgrades, and government
agencies can have large amounts of uncleaned sensor and user-submitted data
that they do not have the ability or tools to manage.

To me, a large takeaway from this conference was that databases need a better
way to handle big data. Databases are the hero big data needs AND the one it
deserves. To achieve these goals, databases are going to need to relax the
constraints on rigid schemas and ‘perfect’ data, which opens up a large number
of research opportunities, along with the realization that there might not
currently be a ‘right’ answer to this problem. Either way, it should be
interesting to see what sacrifices RDBMSs make to compete with the growing
amount of data, and whether they are able to apply decades’ worth of research
to this hot field that is looking for an answer.