[SPARK-7598] [DEPLOY] Add aliveWorkers metrics in Master
In a Spark Standalone deployment, DEAD workers remain in the master's worker list for a while.
The master.workers metric only reports the total number of workers; we need to monitor how many workers are actually ALIVE to ensure the cluster is healthy.
Author: Rex Xiong <pengx@microsoft.com>
Closes #6117 from twilightgod/add-aliveWorker-metrics and squashes the following commits:
6be69a5 [Rex Xiong] Fix comment for aliveWorkers metrics
a882f39 [Rex Xiong] Fix style for aliveWorkers metrics
38ce955 [Rex Xiong] Add aliveWorkers metrics in Master
(cherry picked from commit 93dbb3ad83)
Signed-off-by: Andrew Or <andrew@databricks.com>
This commit is contained in:
parent fceaffc49b
commit 894214f9ea
@@ -30,6 +30,11 @@ private[spark] class MasterSource(val master: Master) extends Source {
     override def getValue: Int = master.workers.size
   })
 
+  // Gauge for alive worker numbers in cluster
+  metricRegistry.register(MetricRegistry.name("aliveWorkers"), new Gauge[Int] {
+    override def getValue: Int = master.workers.filter(_.state == WorkerState.ALIVE).size
+  })
+
   // Gauge for application numbers in cluster
   metricRegistry.register(MetricRegistry.name("apps"), new Gauge[Int] {
     override def getValue: Int = master.apps.size
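The gauge added above boils down to counting the workers whose state is ALIVE. A minimal, self-contained sketch of that logic (using simplified stand-ins for Spark's `WorkerInfo` and `WorkerState` -- these stand-ins are assumptions, not Spark's real classes):

```scala
// Simplified stand-in for Spark's WorkerState enumeration (illustrative only).
object WorkerState extends Enumeration {
  val ALIVE, DEAD = Value
}

// Simplified stand-in for Spark's WorkerInfo (illustrative only).
case class WorkerInfo(id: String, state: WorkerState.Value)

object AliveWorkersDemo {
  // Same computation as the gauge body; count(...) is a slightly more
  // idiomatic equivalent of filter(...).size.
  def aliveWorkers(workers: Seq[WorkerInfo]): Int =
    workers.count(_.state == WorkerState.ALIVE)

  def main(args: Array[String]): Unit = {
    val workers = Seq(
      WorkerInfo("w1", WorkerState.ALIVE),
      WorkerInfo("w2", WorkerState.DEAD),
      WorkerInfo("w3", WorkerState.ALIVE)
    )
    println(aliveWorkers(workers)) // prints 2
  }
}
```

Registering this count as a Dropwizard `Gauge[Int]`, as the patch does, means the value is recomputed on every metrics poll, so the reported number tracks the current worker states rather than a cached snapshot.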