spark-instrumented-optimizer/python/pyspark/streaming
Tathagata Das aa63f633d3 [SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for KafkaUtils and improved error message
In short, the problem in SPARK-6027 is that JARs such as kafka-assembly.jar do not work in Python, because the added JAR is not visible to the classloader used by Py4J. Py4J uses Class.forName(), which does not use the system classloader, but the JARs are only visible in the thread's context classloader. So this patch uses the context classloader to create the KafkaUtils DStream object. This works in both cases: when the Kafka libraries are added with --jars spark-streaming-kafka-assembly.jar, and when they are added with --packages spark-streaming-kafka

Also improves the error message.
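
A minimal sketch of the approach described above, not the code from this commit. It assumes `jvm` is a Py4J JVM view (for example the `_jvm` attribute of a SparkContext). The function names, the `hint` parameter, and the wording of the error are hypothetical and only illustrate the idea of loading a class through the thread's context classloader and turning a missing-class failure into a readable message.

```python
def create_via_context_classloader(jvm, class_name):
    """Instantiate a JVM class through the current thread's context classloader.

    Assumption: `jvm` is a Py4J JVM view; any object exposing the same
    attribute chain behaves the same way here.
    """
    loader = jvm.java.lang.Thread.currentThread().getContextClassLoader()
    return loader.loadClass(class_name).newInstance()


def create_helper_or_explain(jvm, class_name, hint):
    """Hypothetical wrapper: report a missing class with an actionable hint."""
    try:
        return create_via_context_classloader(jvm, class_name)
    except Exception as e:
        # Assumption: a missing class surfaces with "ClassNotFoundException"
        # somewhere in the exception text.
        if "ClassNotFoundException" in str(e):
            raise RuntimeError(
                "%s was not found on the JVM classpath. %s" % (class_name, hint))
        raise
```

With a running StreamingContext `ssc`, a call might look like `create_helper_or_explain(ssc._jvm, helper_class_name, "Pass the Kafka assembly with --jars or --packages.")`, where `helper_class_name` is the fully qualified name of the JVM-side helper class.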

davies

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #4779 from tdas/kafka-python-fix and squashes the following commits:

fb16b04 [Tathagata Das] Removed import
c1fdf35 [Tathagata Das] Fixed long line and improved documentation
7b88be8 [Tathagata Das] Fixed --jar not working for KafkaUtils and improved error message
2015-02-26 13:47:07 -08:00
__init__.py [SPARK-2377] Python API for Streaming 2014-10-12 02:46:56 -07:00
context.py [SPARK-5943][Streaming] Update the test to use new API to reduce the warning 2015-02-23 11:27:27 +00:00
dstream.py [SPARK-5785] [PySpark] narrow dependency for cogroup/join in PySpark 2015-02-17 16:54:57 -08:00
kafka.py [SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for KafkaUtils and improved error message 2015-02-26 13:47:07 -08:00
tests.py [SPARK-4969][STREAMING][PYTHON] Add binaryRecords to streaming 2015-02-03 22:24:30 -08:00
util.py [SPARK-2377] Python API for Streaming 2014-10-12 02:46:56 -07:00