spark-instrumented-optimizer/external
Hari Shreedharan f0500f9fa3 [SPARK-4707][STREAMING] Reliable Kafka Receiver can lose data if the block generator fails to store data.

The Reliable Kafka Receiver commits offsets only when events are actually stored, which ensures that on restart we will actually start where we left off. But if the failure happens in the store() call and the block generator reports an error, the receiver does nothing and continues reading from the current offset rather than from the last commit. This means that messages between the last commit and the current offset are lost.

This PR retries the store call four times; if all attempts fail, it stops the receiver with an error message and the last exception received from the store.
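The retry-then-stop behavior described above can be sketched roughly as follows. This is a minimal illustration, not Spark's actual ReliableKafkaReceiver code; names such as `RetrySketch`, `storeWithRetry`, and `MaxAttempts` are hypothetical:

```scala
// Hedged sketch: retry a store operation a fixed number of times and
// surface the last failure so the caller can stop the receiver with it.
object RetrySketch {
  // The PR retries the store call four times before giving up.
  val MaxAttempts = 4

  // Attempts `store` up to MaxAttempts times; returns Right(result) on
  // the first success, or Left(lastException) if every attempt fails.
  def storeWithRetry[T](store: () => T): Either[Throwable, T] = {
    var last: Throwable = null
    var attempt = 0
    while (attempt < MaxAttempts) {
      try {
        return Right(store())
      } catch {
        case e: Exception =>
          last = e
          attempt += 1
      }
    }
    Left(last) // caller would stop the receiver, reporting this exception
  }
}
```

On `Left`, the receiver would be stopped with an error message wrapping the last exception, rather than silently continuing from the current offset.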

Author: Hari Shreedharan <hshreedharan@apache.org>

Closes #3655 from harishreedharan/kafka-failure-fix and squashes the following commits:

5e2e7ad [Hari Shreedharan] [SPARK-4704][STREAMING] Reliable Kafka Receiver can lose data if the block generator fails to store data.
2015-02-04 14:20:44 -08:00
flume [SPARK-5006][Deploy]spark.port.maxRetries doesn't work 2015-01-13 09:29:25 -08:00
flume-sink [SPARK-4048] Enhance and extend hadoop-provided profile. 2015-01-08 17:15:13 -08:00
kafka [SPARK-4707][STREAMING] Reliable Kafka Receiver can lose data if the block generator fails to store data. 2015-02-04 14:20:44 -08:00
kafka-assembly [SPARK-5154] [PySpark] [Streaming] Kafka streaming support in Python 2015-02-02 19:16:27 -08:00
mqtt [SPARK-4631][streaming][FIX] Wait for a receiver to start before publishing test data. 2015-02-02 14:00:33 -08:00
twitter SPARK-4159 [CORE] Maven build doesn't run JUnit test suites 2015-01-06 12:02:08 -08:00
zeromq [SPARK-4048] Enhance and extend hadoop-provided profile. 2015-01-08 17:15:13 -08:00