Spark

(NoSQL, but with SQL)

First a little history

Early-Mid 1900s
Computers used for tabulating data
1970s
Relational model, Postgres, System-R, Oracle, DB2
1980s
Lotus, dBase
1990s
Object/Object-Relational Databases, Distributed Databases
2000s
The Dark Ages...

Google: Databases suck! Use Map/Reduce Instead

Yahoo: Our Map/Reduce implementation is open source

The Good

  • Programmer-Friendly Language
  • Distributed-Computing-Friendly Metaphors
  • Extremely Resilient Runtime

The Bad

  • Programmer-Friendly, but Non-Declarative Language
  • Distributed-Computing-Friendly, but Programmer-Hostile Metaphors
  • Extremely Resilient, but Slow Runtime

Key Features

  • High-performance resilience.
  • Use of metaphors to extract parallelism.
  • Lots of metaphors for distributed programming.
  • If you can do it in { Scala, Python, Java, R }, you can do it in Spark.
  • If you know SQL and { Scala, Python, Java, R }, you know Spark.
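For instance, a minimal sketch of the SQL side in Scala (assuming Spark 2.x or later; the "people.json" file is hypothetical):

    import org.apache.spark.sql.SparkSession

    // Build a local session for illustration.
    val spark = SparkSession.builder()
      .appName("SqlDemo")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input; any schema-bearing source works.
    val people = spark.read.json("people.json")
    people.createOrReplaceTempView("people")

    // Plain SQL, executed as a distributed Spark job.
    spark.sql("SELECT name, age FROM people WHERE age > 30").show()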

Resilient Distributed Data Structures (RDDs)

Read-Only
You can't insert, update, or modify rows...
Transformable
... but you can create (cheaply) new RDDs by modifying existing RDDs.
Opaque
Spark just sees a bunch of rows. It doesn't know how to interpret them.
Lazy
Spark saves how to construct an RDD, but waits to actually do so.
Distributed
When Spark constructs an RDD, it automatically assigns rows to workers.
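A quick illustration of laziness and cheap transformation in Scala (assuming the spark-shell's built-in SparkContext, sc):

    val nums   = sc.parallelize(1 to 1000000)  // nothing computed yet
    val evens  = nums.filter(_ % 2 == 0)       // still nothing: Spark only records lineage
    val scaled = evens.map(_ * 10)             // a new RDD, derived cheaply

    // Only an action like count() forces Spark to actually construct
    // the RDDs and distribute the rows across workers.
    val total = scaled.count()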

Where do RDDs come from?

  • Call "parallelize" on a { Scala, Python, Java, R } array/collection
  • Load a text file from disk or HDFS (1 row per line).
  • Load a database table (1 row per row).
  • Transform (map, flatMap, filter) an existing RDD.
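The first, second, and fourth of these, sketched in Scala (paths are hypothetical; assumes a SparkContext, sc):

    val fromArray = sc.parallelize(Seq("a", "b", "c"))    // local collection
    val fromText  = sc.textFile("hdfs:///data/log.txt")   // 1 row per line
    val derived   = fromText.map(_.toUpperCase)           // transform an existing RDD

    // Database tables usually come in through the DataFrame reader
    // (e.g. spark.read.jdbc) and can be converted to an RDD with .rdd.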

FlatMap?

A function that reads in one row and returns any number of rows.
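For example (Scala, assuming a SparkContext, sc):

    val lines = sc.parallelize(Seq("to be or", "not to be"))
    val words = lines.flatMap(_.split(" "))
    // Two input rows become six output rows:
    // "to", "be", "or", "not", "to", "be"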

Map?

A function that reads in one row and returns one row.
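For example:

    val words   = sc.parallelize(Seq("to", "be", "or", "not"))
    val lengths = words.map(w => (w, w.length))
    // One output row per input: ("to",2), ("be",2), ("or",2), ("not",3)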

Filter?

A function that reads in one row and returns true (keep) or false (toss).
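For example:

    val nums  = sc.parallelize(1 to 10)
    val evens = nums.filter(_ % 2 == 0)
    // Keeps 2, 4, 6, 8, 10; tosses the rest.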

DataFrames

RDDs with Schemas: every row has a set of named attributes, and every row in the DataFrame has the same attributes.
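A minimal sketch (Scala, assuming the spark-shell's SparkSession, spark; the data is made up):

    import spark.implicits._

    val df = Seq(("Alice", 34), ("Bob", 29)).toDF("name", "age")
    df.printSchema()              // name: string, age: int
    df.filter($"age" > 30).show() // schema-aware, so Spark can optimize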

Demo