spark-instrumented-optimizer/project
Davies Liu 0561c45449 [SPARK-5154] [PySpark] [Streaming] Kafka streaming support in Python
This PR brings the Python API for Spark Streaming Kafka data source.

```
    class KafkaUtils(__builtin__.object)
     |  Static methods defined here:
     |
     |  createStream(ssc, zkQuorum, groupId, topics, storageLevel=StorageLevel(True, True, False, False,
2), keyDecoder=<function utf8_decoder>, valueDecoder=<function utf8_decoder>)
     |      Create an input stream that pulls messages from a Kafka Broker.
     |
     |      :param ssc:  StreamingContext object
     |      :param zkQuorum:  Zookeeper quorum (hostname:port,hostname:port,..).
     |      :param groupId:  The group id for this consumer.
     |      :param topics:  Dict of (topic_name -> numPartitions) to consume.
     |                      Each partition is consumed in its own thread.
     |      :param storageLevel:  RDD storage level.
     |      :param keyDecoder:  A function used to decode key
     |      :param valueDecoder:  A function used to decode value
     |      :return: A DStream object
```
run the example:

```
bin/spark-submit --driver-class-path external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar examples/src/main/python/streaming/kafka_wordcount.py localhost:2181 test
```

Author: Davies Liu <davies@databricks.com>
Author: Tathagata Das <tdas@databricks.com>

Closes #3715 from davies/kafka and squashes the following commits:

d93bfe0 [Davies Liu] Update make-distribution.sh
4280d04 [Davies Liu] address comments
e6d0427 [Davies Liu] Merge branch 'master' of github.com:apache/spark into kafka
f257071 [Davies Liu] add tests for null in RDD
23b039a [Davies Liu] address comments
9af51c4 [Davies Liu] Merge branch 'kafka' of github.com:davies/spark into kafka
a74da87 [Davies Liu] address comments
dc1eed0 [Davies Liu] Update kafka_wordcount.py
31e2317 [Davies Liu] Update kafka_wordcount.py
370ba61 [Davies Liu] Update kafka.py
97386b3 [Davies Liu] address comment
2c567a5 [Davies Liu] update logging and comment
33730d1 [Davies Liu] Merge branch 'master' of github.com:apache/spark into kafka
adeeb38 [Davies Liu] Merge pull request #3 from tdas/kafka-python-api
aea8953 [Tathagata Das] Kafka-assembly for Python API
eea16a7 [Davies Liu] refactor
f6ce899 [Davies Liu] add example and fix bugs
98c8d17 [Davies Liu] fix python style
5697a01 [Davies Liu] bypass decoder in scala
048dbe6 [Davies Liu] fix python style
75d485e [Davies Liu] add mqtt
07923c4 [Davies Liu] support kafka in Python
2015-02-02 19:16:27 -08:00
..
project Support cross building for Scala 2.11 2014-11-11 21:36:48 -08:00
spark-style/src/main/scala/org/apache/spark/scalastyle [Core] Upgrading ScalaStyle version to 0.5 and removing SparkSpaceAfterCommentStartChecker. 2014-10-16 02:05:44 -04:00
build.properties [SPARK-5415] bump sbt to version to 0.13.7 2015-01-28 02:13:06 -08:00
MimaBuild.scala Bumping version to 1.3.0-SNAPSHOT. 2014-11-18 21:24:18 -08:00
MimaExcludes.scala [SPARK-5540] hide ALS.solveLeastSquares 2015-02-02 17:10:01 -08:00
plugins.sbt [SPARK-3997][Build]scalastyle should output the error location 2014-10-26 16:24:50 -07:00
SparkBuild.scala [SPARK-5154] [PySpark] [Streaming] Kafka streaming support in Python 2015-02-02 19:16:27 -08:00