spark-instrumented-optimizer/external
Shixiong Zhu 8bb9414aaf
[SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false
## What changes were proposed in this pull request?

When there are missing offsets, Kafka v2 source may return duplicated records when `failOnDataLoss=false` because it doesn't skip missing offsets.

This PR fixes the issue and also adds regression tests for all Kafka readers.

## How was this patch tested?

New tests.

Closes #22207 from zsxwing/SPARK-25214.

Authored-by: Shixiong Zhu <zsxwing@gmail.com>
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
2018-08-24 12:00:34 -07:00
..
avro [SPARK-24811][FOLLOWUP][SQL] Revise package of AvroDataToCatalyst and CatalystDataToAvro 2018-08-23 15:08:46 +08:00
docker [SPARK-23038][TEST] Update docker/spark-test (JDK/OS) 2018-01-13 23:26:12 -08:00
docker-integration-tests [SPARK-22814][SQL] Support Date/Timestamp in a JDBC partition column 2018-07-30 07:42:00 -07:00
flume [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00
flume-assembly [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00
flume-sink [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00
kafka-0-8 [SPARK-21168] KafkaRDD should always set kafka clientId. 2018-04-23 13:56:11 -05:00
kafka-0-8-assembly [SPARK-23654][BUILD] remove jets3t as a dependency of spark 2018-08-16 12:34:23 -07:00
kafka-0-10 [SPARK-25116][TESTS] Fix the Kafka cluster leak and clean up cached producers 2018-08-17 14:21:08 -07:00
kafka-0-10-assembly [SPARK-23654][BUILD] remove jets3t as a dependency of spark 2018-08-16 12:34:23 -07:00
kafka-0-10-sql [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false 2018-08-24 12:00:34 -07:00
kinesis-asl [SPARK-20168][STREAMING KINESIS] Setting the timestamp directly would cause exception on … 2018-07-12 10:04:47 -07:00
kinesis-asl-assembly [SPARK-23654][BUILD] remove jets3t as a dependency of spark 2018-08-16 12:34:23 -07:00
spark-ganglia-lgpl [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00