spark-instrumented-optimizer/external
Kengo Seki 1f056eb313 [SPARK-27420][DSTREAMS][KINESIS] KinesisInputDStream should expose a way to configure CloudWatch metrics
## What changes were proposed in this pull request?

KinesisInputDStream currently does not provide a way to disable
CloudWatch metrics push. Its default level is "DETAILED" which pushes
10s of metrics every 10 seconds. When dealing with multiple streaming
jobs this add up pretty quickly, leading to thousands of dollars in cost.
To address this problem, this PR adds interfaces for accessing
KinesisClientLibConfiguration's `withMetrics` and
`withMetricsEnabledDimensions` methods to KinesisInputDStream
so that users can configure KCL's metrics levels and dimensions.

## How was this patch tested?

By running updated unit tests in KinesisInputDStreamBuilderSuite.
In addition, I ran a Streaming job with MetricsLevel.NONE and confirmed:

* there's no data point for the "Operation", "Operation, ShardId" and "WorkerIdentifier" dimensions on the AWS management console
* there's no DEBUG level message from Amazon KCL, such as "Successfully published xx datums."

Please review http://spark.apache.org/contributing.html before opening a pull request.

Closes #24651 from sekikn/SPARK-27420.

Authored-by: Kengo Seki <sekikn@apache.org>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-08 19:48:53 -05:00
..
avro [SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations 2019-09-01 10:15:00 -05:00
docker [SPARK-27794][R][DOCS] Use https URL for CRAN repo 2019-05-22 14:28:21 -07:00
docker-integration-tests [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession 2019-08-19 19:01:56 +08:00
kafka-0-10 [SPARK-28875][DSTREAMS][SS][TESTS] Add Task retry tests to make sure new consumer used 2019-08-26 13:12:14 -07:00
kafka-0-10-assembly [SPARK-25956] Make Scala 2.12 as default Scala version in Spark 3.0 2018-11-14 16:22:23 -08:00
kafka-0-10-sql [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer 2019-09-04 10:17:38 -07:00
kafka-0-10-token-provider [SPARK-28922][SS] Safe Kafka parameter redaction 2019-08-29 19:17:48 -07:00
kinesis-asl [SPARK-27420][DSTREAMS][KINESIS] KinesisInputDStream should expose a way to configure CloudWatch metrics 2019-09-08 19:48:53 -05:00
kinesis-asl-assembly [SPARK-25956] Make Scala 2.12 as default Scala version in Spark 3.0 2018-11-14 16:22:23 -08:00
spark-ganglia-lgpl [SPARK-25956] Make Scala 2.12 as default Scala version in Spark 3.0 2018-11-14 16:22:23 -08:00