spark-instrumented-optimizer/external
Arun Mahadevan 14d7c1c3e9 [SPARK-24863][SS] Report Kafka offset lag as a custom metrics
## What changes were proposed in this pull request?

This builds on top of SPARK-24748 to report 'offset lag' as a custom metric for the Kafka structured streaming source.

This lag is the difference between the latest offsets in Kafka at the time the metric is reported (just after a micro-batch completes) and the latest offsets Spark has processed. It can be 0 (or close to 0) if Spark keeps up with the rate at which messages are ingested into the Kafka topics in steady state. It measures how far the Spark source has fallen behind (per partition) and can aid in tuning the application.
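
As a rough illustration of the definition above, the per-partition lag can be sketched as a plain subtraction. This is not the code from the patch: the function name `offset_lag`, the `(topic, partition)` keys, and the sample offsets are all hypothetical, chosen only to show the arithmetic.

```python
from typing import Dict, Tuple

# Hypothetical key type: (topic name, partition number).
TopicPartition = Tuple[str, int]


def offset_lag(
    latest: Dict[TopicPartition, int],
    processed: Dict[TopicPartition, int],
) -> Dict[TopicPartition, int]:
    """Return, per partition, latest offset in Kafka minus latest offset processed.

    Assumes every partition in `latest` also appears in `processed`.
    """
    return {tp: latest[tp] - processed[tp] for tp in latest}


# Made-up offsets sampled just after a micro-batch completes.
latest_offsets = {("events", 0): 120, ("events", 1): 75}
processed_offsets = {("events", 0): 100, ("events", 1): 75}

lags = offset_lag(latest_offsets, processed_offsets)
print(lags)
```

In this sketch a partition with lag 0 has been fully consumed, while a lag that keeps growing across batches suggests the query is not keeping up with ingestion on that partition.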

## How was this patch tested?

Existing and new unit tests


Closes #21819 from arunmahadevan/SPARK-24863.

Authored-by: Arun Mahadevan <arunm@apache.org>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
2018-08-18 17:31:52 +08:00
..
avro [SPARK-25104][SQL] Avro: Validate user specified output schema 2018-08-14 04:43:14 +00:00
docker [SPARK-23038][TEST] Update docker/spark-test (JDK/OS) 2018-01-13 23:26:12 -08:00
docker-integration-tests [SPARK-22814][SQL] Support Date/Timestamp in a JDBC partition column 2018-07-30 07:42:00 -07:00
flume [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00
flume-assembly [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00
flume-sink [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00
kafka-0-8 [SPARK-21168] KafkaRDD should always set kafka clientId. 2018-04-23 13:56:11 -05:00
kafka-0-8-assembly [SPARK-23654][BUILD] remove jets3t as a dependency of spark 2018-08-16 12:34:23 -07:00
kafka-0-10 [SPARK-25116][TESTS] Fix the Kafka cluster leak and clean up cached producers 2018-08-17 14:21:08 -07:00
kafka-0-10-assembly [SPARK-23654][BUILD] remove jets3t as a dependency of spark 2018-08-16 12:34:23 -07:00
kafka-0-10-sql [SPARK-24863][SS] Report Kafka offset lag as a custom metrics 2018-08-18 17:31:52 +08:00
kinesis-asl [SPARK-20168][STREAMING KINESIS] Setting the timestamp directly would cause exception on … 2018-07-12 10:04:47 -07:00
kinesis-asl-assembly [SPARK-23654][BUILD] remove jets3t as a dependency of spark 2018-08-16 12:34:23 -07:00
spark-ganglia-lgpl [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00