spark-instrumented-optimizer/project
Cheng Lian 1c690ddafa [SPARK-12933][SQL] Initial implementation of Count-Min sketch
This PR adds an initial implementation of Count-Min sketch, contained in a new module, spark-sketch, under `common/sketch`. The implementation is based on the [`CountMinSketch` class in stream-lib][1].

As required by the [design doc][2], spark-sketch should have no external dependency.
Two classes, `Murmur3_x86_32` and `Platform`, are copied to spark-sketch from spark-unsafe for hashing facilities. They'll also be used in the upcoming bloom filter implementation.
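
For readers unfamiliar with the data structure, here is a minimal sketch of how a Count-Min sketch works. It is not the code from this PR: the class name `SimpleCountMinSketch`, its constructor parameters, and the seeded FNV-1a-style hash (a stand-in for `Murmur3_x86_32`) are all assumptions made for illustration. Each item is hashed into one counter per row, and the estimate is the minimum over those counters, so it can overcount but never undercount.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;

/** Illustrative Count-Min sketch; hypothetical, not the spark-sketch class. */
class SimpleCountMinSketch {
    private final int depth;       // number of hash rows
    private final int width;       // counters per row
    private final long[][] table;  // depth x width counter matrix
    private final int[] seeds;     // one hash seed per row

    SimpleCountMinSketch(int depth, int width, long seed) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
        this.seeds = new int[depth];
        Random random = new Random(seed);
        for (int row = 0; row < depth; row++) {
            seeds[row] = random.nextInt();
        }
    }

    // Stand-in hash: seeded FNV-1a with a final mixing step.
    // The PR itself uses Murmur3_x86_32; this is only an assumption-free substitute.
    private int bucket(byte[] item, int row) {
        int h = 0x811C9DC5 ^ seeds[row];
        for (byte b : item) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        h ^= h >>> 16;
        return Math.floorMod(h, width);
    }

    /** Adds `count` occurrences of `item` by incrementing one counter per row. */
    void add(String item, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        byte[] bytes = item.getBytes(StandardCharsets.UTF_8);
        for (int row = 0; row < depth; row++) {
            table[row][bucket(bytes, row)] += count;
        }
    }

    /** Returns the minimum counter across rows: an upper bound on the true count. */
    long estimateCount(String item) {
        byte[] bytes = item.getBytes(StandardCharsets.UTF_8);
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(bytes, row)]);
        }
        return min;
    }
}
```

A larger `width` reduces the chance that two items collide in a row, and a larger `depth` reduces the chance that an item collides in every row at once.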

The following features will be added in future follow-up PRs:

- Serialization support
- DataFrame API integration

[1]: aac6b4d23a/src/main/java/com/clearspring/analytics/stream/frequency/CountMinSketch.java
[2]: https://issues.apache.org/jira/secure/attachment/12782378/BloomFilterandCount-MinSketchinSpark2.0.pdf

Author: Cheng Lian <lian@databricks.com>

Closes #10851 from liancheng/count-min-sketch.
2016-01-23 00:34:55 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| project | [MINOR][BUILD] Changed the comment to reflect the plugin project is there to support SBT pom reader only. | 2015-11-30 09:30:58 +00:00 |
| build.properties | [SPARK-12112][BUILD] Upgrade to SBT 0.13.9 | 2015-12-05 08:15:30 +08:00 |
| MimaBuild.scala | [SPARK-4628][BUILD] Add a resolver to MiMaBuild.scala for mqttv3(1.0.1). | 2016-01-10 23:33:57 -08:00 |
| MimaExcludes.scala | [SPARK-7997][CORE] Remove Akka from Spark Core and Streaming | 2016-01-22 21:20:04 -08:00 |
| plugins.sbt | [SPARK-4628][BUILD] Remove all non-Maven-Central repositories from build | 2016-01-08 20:58:53 -08:00 |
| SparkBuild.scala | [SPARK-12933][SQL] Initial implementation of Count-Min sketch | 2016-01-23 00:34:55 -08:00 |