spark-instrumented-optimizer/external
Shixiong Zhu 2fd101b2f0 [SPARK-18373][SPARK-18529][SS][KAFKA] Make failOnDataLoss=false work with Spark jobs
## What changes were proposed in this pull request?

This PR adds `CachedKafkaConsumer.getAndIgnoreLostData` to handle corner cases of `failOnDataLoss=false`.

It also resolves [SPARK-18529](https://issues.apache.org/jira/browse/SPARK-18529) after refactoring codes: Timeout will throw a TimeoutException.

## How was this patch tested?

Because I cannot find any way to manually control the Kafka server to clean up logs, it's impossible to write unit tests for each corner case. Therefore, I just created `test("stress test for failOnDataLoss=false")` which should cover most of corner cases.

I also modified some existing tests to test for both `failOnDataLoss=false` and `failOnDataLoss=true` to make sure it doesn't break existing logic.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #15820 from zsxwing/failOnDataLoss.
2016-11-22 14:15:57 -08:00
..
docker [SPARK-13595][BUILD] Move docker, extras modules into external 2016-03-09 18:27:44 +00:00
docker-integration-tests [SPARK-17803][TESTS] Upgrade docker-client dependency 2016-10-06 14:28:49 -07:00
flume [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent 2016-07-19 11:59:46 +01:00
flume-assembly [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent 2016-07-19 11:59:46 +01:00
flume-sink [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent 2016-07-19 11:59:46 +01:00
java8-tests [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent 2016-07-19 11:59:46 +01:00
kafka-0-8 [SPARK-18445][BUILD][DOCS] Fix the markdown for Note:/NOTE:/Note that/'''Note:''' across Scala/Java API documentation 2016-11-19 11:24:15 +00:00
kafka-0-8-assembly [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent 2016-07-19 11:59:46 +01:00
kafka-0-10 [SPARK-17510][STREAMING][KAFKA] config max rate on a per-partition basis 2016-11-14 11:10:37 -08:00
kafka-0-10-assembly [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent 2016-07-19 11:59:46 +01:00
kafka-0-10-sql [SPARK-18373][SPARK-18529][SS][KAFKA] Make failOnDataLoss=false work with Spark jobs 2016-11-22 14:15:57 -08:00
kinesis-asl [SPARK-18445][BUILD][DOCS] Fix the markdown for Note:/NOTE:/Note that/'''Note:''' across Scala/Java API documentation 2016-11-19 11:24:15 +00:00
kinesis-asl-assembly [SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published 2016-09-21 11:38:10 -07:00
spark-ganglia-lgpl [SPARK-13238][CORE] Add ganglia dmax parameter 2016-08-05 13:07:52 -07:00