Spark Profiler


Spark Profiler

Jack Kolokasis
Hello all,

     I am looking for a Spark profiler to trace my application and find
its bottlenecks. I need to trace CPU usage, memory usage, and I/O usage.

I look forward to your reply.

--Iacovos


---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Spark Profiler

manish ranjan
I have found Ganglia very helpful in understanding network I/O, CPU, and memory usage for a given Spark cluster.
I have not used Dr. Elephant myself, but I have heard good things about it (I believe it was contributed by LinkedIn, though I am not 100% sure).



RE: Spark Profiler

Luca Canali

I find the Spark metrics system quite useful for gathering resource utilization metrics of Spark applications, including CPU, memory, and I/O.

If you are interested, there is an example of how this works for us at: https://db-blog.web.cern.ch/blog/luca-canali/2019-02-performance-dashboard-apache-spark
If you are instead looking for ways to instrument your Spark code with performance metrics, Spark task metrics and event listeners are quite useful for that. See also https://github.com/apache/spark/blob/master/docs/monitoring.md and https://github.com/LucaCanali/sparkMeasure
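As an illustrative sketch (not from the thread itself), the same resource numbers can also be pulled programmatically from Spark's REST monitoring API (the `/api/v1/applications/<app-id>/executors` endpoint described in monitoring.md). The field names below follow the ExecutorSummary JSON but should be checked against your Spark version:

```python
import json

def summarize_executors(executors):
    """Aggregate CPU (GC share), memory, and I/O figures across executors.

    `executors` is the parsed JSON list returned by the
    /api/v1/applications/<app-id>/executors endpoint.
    """
    totals = {
        "duration_ms": 0, "gc_time_ms": 0,
        "memory_used": 0, "disk_used": 0,
        "input_bytes": 0, "shuffle_read": 0, "shuffle_write": 0,
    }
    for e in executors:
        totals["duration_ms"] += e.get("totalDuration", 0)
        totals["gc_time_ms"] += e.get("totalGCTime", 0)
        totals["memory_used"] += e.get("memoryUsed", 0)
        totals["disk_used"] += e.get("diskUsed", 0)
        totals["input_bytes"] += e.get("totalInputBytes", 0)
        totals["shuffle_read"] += e.get("totalShuffleRead", 0)
        totals["shuffle_write"] += e.get("totalShuffleWrite", 0)
    return totals

# A hypothetical payload of the kind the executors endpoint returns:
sample = json.loads("""[
  {"id": "driver", "totalDuration": 0, "totalGCTime": 0,
   "memoryUsed": 52428800, "diskUsed": 0,
   "totalInputBytes": 0, "totalShuffleRead": 0, "totalShuffleWrite": 0},
  {"id": "1", "totalDuration": 120000, "totalGCTime": 3000,
   "memoryUsed": 104857600, "diskUsed": 0,
   "totalInputBytes": 1073741824, "totalShuffleRead": 0,
   "totalShuffleWrite": 268435456}
]""")

print(summarize_executors(sample))
```

In a live cluster you would fetch the JSON from the driver's UI port (4040 by default) or from the history server instead of using a hard-coded sample.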

 

Regards,

Luca

 


Re: Spark Profiler

Jack Kolokasis

Thanks for your reply. Your help is very valuable, and all these links are helpful (especially your example).

Best Regards

--Iacovos


Re: Spark Profiler

bo yang
Yeah, these options are very valuable. Let me add another option :) We built a JVM profiler (https://github.com/uber-common/jvm-profiler) to monitor and profile Spark applications at large scale (e.g., sending metrics to Kafka/Hive for batch analysis). People are welcome to try it as well.
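For context, jvm-profiler is attached as a Java agent on the executors. A hedged sketch of what the spark-submit flags might look like, following the project's README (the jar version, Kafka broker address, and topic prefix here are placeholders, not values from the thread):

```shell
# Ship the profiler jar and attach it as a -javaagent on each executor,
# reporting metrics to a Kafka topic for later batch analysis.
spark-submit \
  --conf spark.jars=hdfs:///lib/jvm-profiler-1.0.0.jar \
  --conf "spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,brokerList=localhost:9092,topicPrefix=profiling_" \
  my_app.jar
```

Check the jvm-profiler README for the current jar coordinates and the full list of reporters before relying on these exact options.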



Re: Spark Profiler

Hariharan
Hi Jack,

You can try sparklens (https://github.com/qubole/sparklens). I don't think it gives details at as low a level as you're looking for, but it can help you identify and remove performance bottlenecks.
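Sparklens runs as an extra Spark listener, so enabling it is a configuration change rather than a code change. A sketch based on the sparklens README (the package version and Scala suffix below are assumptions; check the project page for current coordinates):

```shell
# Enable sparklens for a single run: pull the package and register its
# job listener, which prints a scheduling/efficiency report at the end.
spark-submit \
  --packages qubole:sparklens:0.3.2-s_2.11 \
  --conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener \
  my_app.jar
```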

~ Hariharan


Re: Spark Profiler

jcdauchy
Hello Jack,

You can also have a look at “Babar”; it has a nice “flame graph” feature
too. I haven’t had the time to test it out yet.

https://github.com/criteo/babar

JC




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
