spark-instrumented-optimizer/python/pyspark/streaming
Shixiong Zhu 928d631625 [SPARK-11740][STREAMING] Fix the race condition of two checkpoints in a batch
We checkpoint both when generating a batch and when completing a batch. When the processing time of a batch is greater than the batch interval, the checkpoint for completing an old batch may run after the checkpoint for generating a new batch. If this happens, the checkpoint of the old batch actually has the latest information, so we want to recover from it. This PR uses the latest checkpoint time as the file name, so that we can always recover from the latest checkpoint file.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #9707 from zsxwing/fix-checkpoint.
2015-11-17 14:48:29 -08:00
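
The naming scheme described in the commit message above can be illustrated with a small toy model. This is only a sketch, not Spark's implementation: the class `CheckpointWriter`, its methods, the in-memory `files` dict, and the `checkpoint-<time>` file name format are all hypothetical names chosen for illustration. The point is that a late write for an older batch is stored under the latest checkpoint time seen so far, so recovery by newest file name picks up the most recent state.

```python
class CheckpointWriter:
    """Toy model of the idea in the commit message (hypothetical, not Spark code)."""

    def __init__(self):
        # Largest checkpoint time seen so far; None until the first write.
        self.latest_checkpoint_time = None
        # Maps file name to payload; stands in for the checkpoint directory.
        self.files = {}

    def write(self, checkpoint_time, payload):
        # Track the latest checkpoint time across all writes.
        if (self.latest_checkpoint_time is None
                or checkpoint_time > self.latest_checkpoint_time):
            self.latest_checkpoint_time = checkpoint_time
        # Name the file after the latest time, not this write's own time,
        # so a late write for an old batch overwrites the newest file.
        name = "checkpoint-%d" % self.latest_checkpoint_time
        self.files[name] = payload
        return name

    def recover(self):
        # Recovery reads the file whose name carries the largest time.
        if not self.files:
            return None
        newest = max(self.files, key=lambda n: int(n.split("-")[1]))
        return self.files[newest]


writer = CheckpointWriter()
# Checkpoint taken when batch 2000 is generated.
writer.write(2000, "state at generation of batch 2000")
# Checkpoint for completing the older batch 1000 runs afterwards,
# but it carries the most recent information.
writer.write(1000, "state at completion of batch 1000")
print(writer.recover())
```

In this model, naming each file after its own batch time would leave the stale payload under the newest file name; using the latest time seen avoids that.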
..
__init__.py [SPARK-6328][PYTHON] Python API for StreamingListener 2015-11-16 11:29:27 -08:00
context.py [SPARK-6328][PYTHON] Python API for StreamingListener 2015-11-16 11:29:27 -08:00
dstream.py [SPARK-10122] [PYSPARK] [STREAMING] Fix getOffsetRanges bug in PySpark-Streaming transform function 2015-08-21 13:15:35 -07:00
flume.py [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 2015-10-20 10:52:49 -07:00
kafka.py [SPARK-11270][STREAMING] Add improved equality testing for TopicAndPartition from the Kafka Streaming API 2015-10-27 01:29:06 -07:00
kinesis.py [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 2015-10-20 10:52:49 -07:00
listener.py [SPARK-6328][PYTHON] Python API for StreamingListener 2015-11-16 11:29:27 -08:00
mqtt.py [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 2015-10-20 10:52:49 -07:00
tests.py [SPARK-11740][STREAMING] Fix the race condition of two checkpoints in a batch 2015-11-17 14:48:29 -08:00
util.py [SPARK-8389] [STREAMING] [PYSPARK] Expose KafkaRDDs offsetRange in Python 2015-07-09 13:54:44 -07:00