Re: How to address seemingly low core utilization on a spark workload?



Vitaliy Pisarev
That is precisely my question: what leads can I look at to get a hint of where the inefficiencies lie?
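One concrete lead is the Spark event log (enabled with `spark.eventLog.enabled`): scanning it for intervals in which no task was running pinpoints exactly when, and for how long, the cluster sat idle. A minimal sketch, assuming the standard JSON event-log field names (`Event`, `Task Info`, `Launch Time`, `Finish Time`); the threshold is a hypothetical tuning knob:

```python
import json

def idle_gaps(event_log_lines, min_gap_ms=1000):
    """Return (start, end) intervals, in epoch millis, during which no
    task was running, based on SparkListenerTaskEnd events."""
    intervals = []
    for line in event_log_lines:
        ev = json.loads(line)
        if ev.get("Event") == "SparkListenerTaskEnd":
            info = ev["Task Info"]
            intervals.append((info["Launch Time"], info["Finish Time"]))
    intervals.sort()
    gaps = []
    busy_until = None
    for start, end in intervals:
        # A gap exists if this task launched well after all earlier tasks finished.
        if busy_until is not None and start - busy_until >= min_gap_ms:
            gaps.append((busy_until, start))
        busy_until = end if busy_until is None else max(busy_until, end)
    return gaps
```

Correlating each gap's timestamps with the driver log usually reveals what the driver was doing in the meantime (planning, a `collect`, a non-Spark call, etc.).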

On Thu, Nov 15, 2018 at 4:56 PM David Markovitz <[hidden email]> wrote:

It seems it is almost fully utilized – when it is active.

What happens in the gaps, where there is no spark activity?

 

Best regards,

 

David (דודו) Markovitz

Technology Solutions Professional, Data Platform

Microsoft Israel

 

Mobile: +972-525-834-304

Office: +972-747-119-274

 

 

From: Vitaliy Pisarev <[hidden email]>
Sent: Thursday, November 15, 2018 4:51 PM
To: user <[hidden email]>
Cc: David Markovitz <[hidden email]>
Subject: How to address seemingly low core utilization on a spark workload?

 

I have a workload that runs on a cluster of 300 cores. 

Below is a plot of the number of active tasks over time during the execution of this workload:

[inline image: image002.png — active tasks over time]

What I deduce is that there are substantial intervals where the cores are heavily under-utilised. 

 

What actions can I take to:

  • Increase the efficiency (== core utilisation) of the cluster?
  • Understand the root causes behind the drops in core utilisation?
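One common cause of a plateau below the core count is too few partitions in a stage: with fewer (or barely more) partitions than cores, any skew leaves cores idle. A frequently cited rule of thumb, not something established in this thread, is to aim for roughly 2-4 tasks per core, e.g. via `spark.sql.shuffle.partitions` or `spark.default.parallelism`. A minimal sketch of that arithmetic (the factor of 3 is a hypothetical choice):

```python
TOTAL_CORES = 300      # cluster size from the thread
TASKS_PER_CORE = 3     # hypothetical tuning factor, typically 2-4

def suggested_shuffle_partitions(cores, factor=TASKS_PER_CORE):
    """Partition count that keeps every core busy through minor skew."""
    return cores * factor

# Would then be applied as, e.g.:
#   spark.conf.set("spark.sql.shuffle.partitions",
#                  suggested_shuffle_partitions(TOTAL_CORES))
print(suggested_shuffle_partitions(TOTAL_CORES))  # 900
```

Whether this helps depends on where the drops actually occur; if the gaps are driver-side (planning, collects, external calls) rather than stage-side, adding partitions will not change the picture.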

Attachments: image001.png (4K), image002.png (48K)