spark-instrumented-optimizer/python/pyspark/streaming
David Tolpin 437583f692 [SPARK-11904][PYSPARK] reduceByKeyAndWindow does not require checkpointing when invFunc is None
When `invFunc` is None, `reduceByKeyAndWindow(func, None, winsize, slidesize)` is equivalent to

     reduceByKey(func).window(winsize, slidesize).reduceByKey(func)

and no checkpoint is necessary. The corresponding Scala code does exactly that, but the Python code always creates a windowed stream that requires checkpointing. The patch fixes this.
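
The equivalence can be illustrated without Spark. The following is a minimal plain-Python sketch, not the PySpark implementation: batches are modeled as lists of `(key, value)` pairs, the window size and slide are counted in batches rather than durations, and the helper names (`reduce_by_key`, `window`, `reduce_then_window`, `window_then_reduce`) are hypothetical. It assumes `func` is associative and commutative, as a reduce function normally is.

```python
from operator import add


def reduce_by_key(func, pairs):
    """Combine values that share a key, like reduceByKey applied to one batch."""
    out = {}
    for key, value in pairs:
        out[key] = func(out[key], value) if key in out else value
    return out


def window(batches, win_size, slide):
    """Every `slide` batches, yield the union of the last `win_size` batches."""
    for end in range(slide, len(batches) + 1, slide):
        start = max(0, end - win_size)
        yield [pair for batch in batches[start:end] for pair in batch]


def reduce_then_window(func, batches, win_size, slide):
    # Models: pre-reduce each batch, window, then reduce by key again.
    pre_reduced = [list(reduce_by_key(func, b).items()) for b in batches]
    return [reduce_by_key(func, w) for w in window(pre_reduced, win_size, slide)]


def window_then_reduce(func, batches, win_size, slide):
    # Reference result: reduce every raw record in each window directly.
    return [reduce_by_key(func, w) for w in window(batches, win_size, slide)]


# Made-up sample data: four batches of (key, value) pairs.
batches = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4)],
    [("b", 5), ("c", 6)],
    [("a", 7), ("c", 8)],
]

two_stage = reduce_then_window(add, batches, 3, 1)
direct = window_then_reduce(add, batches, 3, 1)
print(two_stage == direct)
```

Neither path keeps state from one window to the next: each window is recomputed from the batches it covers, which is why this form needs no checkpoint. An inverse function, by contrast, updates the previous window's result incrementally, so that result has to be retained.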

I do not know how to unit-test this.

Author: David Tolpin <david.tolpin@gmail.com>

Closes #9888 from dtolpin/master.
2015-12-16 22:10:24 -08:00
__init__.py [SPARK-6328][PYTHON] Python API for StreamingListener 2015-11-16 11:29:27 -08:00
context.py [SPARK-6328][PYTHON] Python API for StreamingListener 2015-11-16 11:29:27 -08:00
dstream.py [SPARK-11904][PYSPARK] reduceByKeyAndWindow does not require checkpointing when invFunc is None 2015-12-16 22:10:24 -08:00
flume.py [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 2015-10-20 10:52:49 -07:00
kafka.py [SPARK-9065][STREAMING][PYSPARK] Add MessageHandler for Kafka Python API 2015-11-17 16:57:52 -08:00
kinesis.py [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 2015-10-20 10:52:49 -07:00
listener.py [SPARK-6328][PYTHON] Python API for StreamingListener 2015-11-16 11:29:27 -08:00
mqtt.py [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 2015-10-20 10:52:49 -07:00
tests.py [SPARK-11713] [PYSPARK] [STREAMING] Initial RDD updateStateByKey for PySpark 2015-12-10 14:21:15 -08:00
util.py [SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery issue 2015-12-01 15:26:10 -08:00