Tuning Resource Allocation during runtime


Tuning Resource Allocation during runtime

Donni Khan
Hi All,

Is there any way to change the number of executors/cores while a Spark job is running?
I have a Spark job containing two tasks: the first task needs many executors to run fast; the second task has many input and output operations and shuffling, so it needs only a few executors, otherwise it takes a long time to finish.
Does anyone know if that is possible in YARN?


Thank you.
Donni

Re: Tuning Resource Allocation during runtime

jogesh anand
Hi Donni, 

Please check Spark dynamic allocation and the external shuffle service.
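
A rough sketch of what that setup might look like in Scala (the min/max/idle-timeout values below are placeholders you would tune for your job, and the external shuffle service has to be running on the YARN NodeManagers):

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Sketch: enable dynamic allocation so Spark adds executors while tasks
// queue up and releases them once they sit idle.
// The external shuffle service (yarn.nodemanager.aux-services = spark_shuffle)
// must be set up on the NodeManagers so shuffle data survives executor removal.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")          // placeholder
  .set("spark.dynamicAllocation.maxExecutors", "50")         // placeholder
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // placeholder

val spark = SparkSession.builder().config(conf).getOrCreate()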

--
Regards,
Jogesh Anand

Re: Tuning Resource Allocation during runtime

Vadim Semenov-2
In reply to this post by Donni Khan
You cannot dynamically change the number of cores per executor or per task, but you can change the number of executors.

In one of my jobs I have something like this: when I know that I don't need more than 4 executors, I kill all the other executors (assuming they don't hold any cached data), and they become available to other jobs (thanks to dynamic allocation).


// At this point we have 1500 parquet files
// But we want 100 files, which means about 4 executors can process everything
// assuming that they can process 30 tasks each
// So we can let other executors leave the job
val executors = SparkContextUtil.getExecutorIds(sc)
executors.take(executors.size - 4).foreach(sc.killExecutor)


package org.apache.spark

/**
 * `SparkContextUtil` lives in the `org.apache.spark` package so it can call
 * methods on `SparkContext` that are private[spark], such as `getExecutorIds`.
 */
object SparkContextUtil {
  def getExecutorIds(sc: SparkContext): Seq[String] =
    sc.getExecutorIds.filter(_ != SparkContext.DRIVER_IDENTIFIER)
}
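
If you later need to scale back up for the executor-hungry part of the job, SparkContext also exposes a requestExecutors developer API; a minimal sketch (the 20 extra executors is just an illustrative number, and whether they actually arrive depends on YARN having capacity):

// Sketch: ask the cluster manager for additional executors again.
// requestExecutors is a @DeveloperApi on SparkContext; the return value
// only says whether the cluster manager acknowledged the request.
val acknowledged: Boolean = sc.requestExecutors(20)
if (!acknowledged) {
  println("executor request was not acknowledged by the cluster manager")
}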







--
Sent from my iPhone