Executor metrics in spark application

Issac Buenrostro
Hello,

I am trying to implement metrics in a Spark application (using the Codahale MetricRegistry). I have been able to see metrics for the driver using code similar to this:

import com.codahale.metrics.{Gauge, MetricRegistry}
import org.apache.spark.SparkEnv
import org.apache.spark.metrics.source.Source

class MyMetricSource extends Source {
  val metricRegistry = new MetricRegistry
  val sourceName = "example.metrics"

  // `counter` is an accumulator defined elsewhere in the application
  metricRegistry.register(MetricRegistry.name("example", "metric"),
    new Gauge[Int] {
      override def getValue: Int = counter.value
    })
}

val myMetrics = new MyMetricSource
SparkEnv.get.metricsSystem.registerSource(myMetrics)

I am also trying to get metrics for the executors (for example, the rate at which they process records). I would like to use something like this:

class ExecutorSource extends Source {
  val metricRegistry = new MetricRegistry
  val sourceName = "executor.metrics"

  val lograte = metricRegistry.meter("rate")
}

val executorMetrics = new ExecutorSource
SparkEnv.get.metricsSystem.registerSource(executorMetrics)

rdd.map { x =>
  executorMetrics.lograte.mark()
  x
}.map(...)

I have tried a few things so far, but have not been able to get these metrics working anywhere: I can see the executor.metrics.rate metric on the driver, but the rate is always zero, and I cannot see any such metrics on the worker machines. I suspect the source instance captured in the closure gets serialized to the executors, so the copy the tasks mark is not the one registered with the driver's MetricsSystem.

Does anyone have any pointers on how to produce executor metrics?

Thank you!
Issac

--
Issac Buenrostro
Software Engineer | [hidden email]
www.ooyala.com | blog | @ooyala

Re: Executor metrics in spark application

Denes
I'm also pretty interested in how to create custom Sinks in Spark. I'm using it with Ganglia, and the normal metrics from the JVM source do show up. I tried to create my own metric based on Issac's code, but it does not show up in Ganglia. Does anyone know where the problem is?
Here's the code snippet:

import com.codahale.metrics.{Gauge, MetricRegistry}
import org.apache.spark.Accumulator
import org.apache.spark.metrics.source.Source

class AccumulatorSource(accumulator: Accumulator[Long], name: String) extends Source {
  val sourceName = "accumulator.metrics"
  val metricRegistry = new MetricRegistry()

  metricRegistry.register(MetricRegistry.name("accumulator", name),
    new Gauge[Long] {
      override def getValue: Long = accumulator.value
    })
}

and then in the main:

val longAccumulator = sc.accumulator[Long](0)
val accumulatorMetrics = new AccumulatorSource(longAccumulator, "counters.accumulator")
SparkEnv.get.metricsSystem.registerSource(accumulatorMetrics)

Re: Executor metrics in spark application

Denes
I meant custom Sources, sorry.

RE: Executor metrics in spark application

Shao, Saisai
In reply to this post by Denes
Hi Denes,

I think you can register your customized metrics source with the metrics system through metrics.properties; you can take metrics.properties.template as a reference.

Basically, you can do the following if you want to monitor the executors:

executor.source.accumulator.class=xx.xx.xx.your-customized-metrics-source

I think the code below only registers the metrics source on the driver side:

SparkEnv.get.metricsSystem.registerSource(accumulatorMetrics)

BTW, it's not a good choice to register through the MetricsSystem directly; it would be better to register through configuration. You can also enable the console sink to verify whether the source is registered.
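
For illustration, a minimal metrics.properties sketch along these lines might look like the following. The source class name is a placeholder for your own, and note that, as far as I can tell, a source registered this way needs a no-arg constructor, since the metrics system instantiates it by reflection:

# Register the custom source on every executor (class name is a placeholder)
executor.source.accumulator.class=com.example.metrics.MyExecutorSource

# Console sink for quick verification; output goes to the process logs
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds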

Thanks
Jerry



RE: Executor metrics in spark application

Denes
Hi Jerry,

I know that way of registering a metrics source, but it seems to defeat the whole purpose. I'd like to define a source whose value is set from within the application, for example the number of parsed messages.
If I register it in metrics.properties, how can I obtain the instance (or instances)? How can I set its value? Is there a way to read an accumulator's value from a Source?
Reply | Threaded
Open this post in threaded view
|

RE: Executor metrics in spark application

Shao, Saisai
Yeah, I'm starting to see your purpose. The original design of customized metrics sources focused on self-contained sources; since you need to rely on an outer variable, the way you mentioned may be the only way to register.
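
To illustrate what a self-contained source could look like, here is a sketch under the assumption that the metric state lives in a JVM-local singleton instead of an outer variable; all names are hypothetical:

import java.util.concurrent.atomic.AtomicLong

import com.codahale.metrics.{Gauge, MetricRegistry}
import org.apache.spark.metrics.source.Source

// JVM-local state; every executor gets its own copy of this object
object ParsedMessages {
  val count = new AtomicLong(0)
}

// No-arg constructor, so it could be registered via metrics.properties:
//   executor.source.parsed.class=com.example.ParsedMessagesSource
class ParsedMessagesSource extends Source {
  val sourceName = "parsed.messages"
  val metricRegistry = new MetricRegistry

  metricRegistry.register(MetricRegistry.name("parsed", "count"),
    new Gauge[Long] {
      override def getValue: Long = ParsedMessages.count.get
    })
}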

Besides, as you cannot see the source in Ganglia, I think you can enable the console sink to verify the outputs. Also, since you want to register this source on the driver, you need to enable the Ganglia sink on the driver side and make sure the Ganglia client can connect to your driver.
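
A sketch of what enabling the Ganglia sink on the driver side might look like in metrics.properties (host and port are placeholders; GangliaSink ships in the separate spark-ganglia-lgpl module):

driver.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
driver.sink.ganglia.host=<gmond-host>
driver.sink.ganglia.port=8649
driver.sink.ganglia.period=10
driver.sink.ganglia.unit=seconds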

Thanks
Jerry


RE: Executor metrics in spark application

Denes
As far as I understand, even if I could register the custom source, there is no way to have a cluster-wide variable to pass to it: an accumulator can be modified by tasks but read only by the driver, and a broadcast value is constant.
So it seems this custom metrics/sinks functionality was not really thought through by the developers.
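
That said, one pattern that might sidestep the cluster-wide-variable problem (continuing the hypothetical ParsedMessagesSource sketched above) is to keep the counter JVM-local and let each executor report its own slice; the sink then receives one stream per executor rather than a cluster-wide total:

// Tasks bump the executor-local singleton; no accumulator or broadcast needed
rdd.map { record =>
  ParsedMessages.count.incrementAndGet()
  parse(record)  // `parse` stands in for the real per-record work
}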

RE: Executor metrics in spark application

SRK
Hi,

Were you able to set up custom metrics in GangliaSink? If so, how did you register the custom metrics?

Thanks!