spark-instrumented-optimizer

Tathagata Das c3d08e2f29 [SPARK-18516][SQL] Split state and progress in streaming

This PR separates the status of a `StreamingQuery` into two separate APIs:
- `status` - describes the status of a `StreamingQuery` at this moment, including what phase of processing is currently happening and if data is available.
- `recentProgress` - an array of statistics about the most recent microbatches that have executed.

A recent progress contains the following information:
```
{
  "id" : "2be8670a-fce1-4859-a530-748f29553bb6",
  "name" : "query-29",
  "timestamp" : 1479705392724,
  "inputRowsPerSecond" : 230.76923076923077,
  "processedRowsPerSecond" : 10.869565217391303,
  "durationMs" : {
    "triggerExecution" : 276,
    "queryPlanning" : 3,
    "getBatch" : 5,
    "getOffset" : 3,
    "addBatch" : 234,
    "walCommit" : 30
  },
  "currentWatermark" : 0,
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "KafkaSource[Subscribe[topic-14]]",
    "startOffset" : {
      "topic-14" : { "2" : 0, "4" : 1, "1" : 0, "3" : 0, "0" : 0 }
    },
    "endOffset" : {
      "topic-14" : { "2" : 1, "4" : 2, "1" : 0, "3" : 0, "0" : 1 }
    },
    "numRecords" : 3,
    "inputRowsPerSecond" : 230.76923076923077,
    "processedRowsPerSecond" : 10.869565217391303
  } ]
}
```

Additionally, in order to make it possible to correlate progress updates across restarts, we change the `id` field from an integer that is unique within the JVM to a `UUID` that is globally unique.

Author: Tathagata Das <tathagata.das1565@gmail.com>
Author: Michael Armbrust <michael@databricks.com>

Closes #15954 from marmbrus/queryProgress.

2016-11-29 17:24:17 -08:00
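As a rough illustration of how the sample progress record in the commit message above might be read, the sketch below parses a trimmed copy of that JSON with plain Python and checks that the per-partition offset deltas add up to `numRecords`. It does not use Spark. The helper name `records_per_partition` is hypothetical, and treating "end offset minus start offset" as the record count is an assumption that happens to hold for this sample.

```python
import json

# Trimmed copy of the sample progress record from the commit message.
progress_json = """
{
  "id": "2be8670a-fce1-4859-a530-748f29553bb6",
  "name": "query-29",
  "sources": [
    {
      "description": "KafkaSource[Subscribe[topic-14]]",
      "startOffset": {"topic-14": {"2": 0, "4": 1, "1": 0, "3": 0, "0": 0}},
      "endOffset": {"topic-14": {"2": 1, "4": 2, "1": 0, "3": 0, "0": 1}},
      "numRecords": 3
    }
  ]
}
"""


def records_per_partition(source):
    """Hypothetical helper: end offset minus start offset per (topic, partition).

    Assumption: a partition missing from startOffset is treated as starting at 0.
    """
    deltas = {}
    for topic, end_parts in source["endOffset"].items():
        start_parts = source["startOffset"].get(topic, {})
        for partition, end in end_parts.items():
            deltas[(topic, int(partition))] = end - start_parts.get(partition, 0)
    return deltas


progress = json.loads(progress_json)
source = progress["sources"][0]
deltas = records_per_partition(source)
total = sum(deltas.values())

print(deltas)
print(total == source["numRecords"])
```

For this sample, partitions 0, 2, and 4 each advance by one offset and partitions 1 and 3 do not move, which accounts for the three records reported in `numRecords`.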
docker	[SPARK-13595][BUILD] Move docker, extras modules into external	2016-03-09 18:27:44 +00:00
docker-integration-tests	[SPARK-17803][TESTS] Upgrade docker-client dependency	2016-10-06 14:28:49 -07:00
flume	[SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation	2016-11-29 09:41:32 +00:00
flume-assembly	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00
flume-sink	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00
java8-tests	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00
kafka-0-8	[SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation	2016-11-29 09:41:32 +00:00
kafka-0-8-assembly	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00
kafka-0-10	[SPARK-17510][STREAMING][KAFKA] config max rate on a per-partition basis	2016-11-14 11:10:37 -08:00
kafka-0-10-assembly	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00
kafka-0-10-sql	[SPARK-18516][SQL] Split state and progress in streaming	2016-11-29 17:24:17 -08:00
kinesis-asl	[SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation	2016-11-19 11:24:15 +00:00
kinesis-asl-assembly	[SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published	2016-09-21 11:38:10 -07:00
spark-ganglia-lgpl	[SPARK-13238][CORE] Add ganglia dmax parameter	2016-08-05 13:07:52 -07:00