spark job delay when starting

Bulldog20630405
When running Spark jobs, we see a delay when the job starts. During the delay, running the following command:
top -H -i -p <pid>
showed that a single thread labeled "map-output-disp" was running at 99.7% CPU for most of the delay period. The delay gets progressively worse as the partition count increases.
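
For anyone wanting to confirm which JVM thread this is: a minimal sketch of matching the thread ID shown by top -H against a jstack dump. It assumes the dump prints native thread IDs in hex as nid=0x..., which is the format older JDKs use; newer JDKs may differ. The function name find_thread_by_tid and the sample dump are made up for illustration, not output from our cluster. top shortens thread names to 15 characters, so the full name in the dump may be longer than "map-output-disp".

```python
import re

# Matches a jstack thread header line and captures the thread name and hex nid.
# Assumption: the dump uses the 'nid=0x...' hex format.
THREAD_LINE = re.compile(r'^"(?P<name>[^"]+)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)\b')


def find_thread_by_tid(jstack_text, tid):
    """Return the name of the Java thread whose native id equals tid, or None."""
    for line in jstack_text.splitlines():
        m = THREAD_LINE.match(line)
        if m and int(m.group("nid"), 16) == tid:
            return m.group("name")
    return None


# Hypothetical jstack excerpt, invented for this example only.
SAMPLE_DUMP = '''
"map-output-dispatcher-0" #45 daemon prio=5 os_prio=0 tid=0x00007f1c2c0a1000 nid=0x1a2b runnable [0x00007f1bd0f0e000]
"main" #1 prio=5 os_prio=0 tid=0x00007f1c2c00a800 nid=0x1a01 waiting on condition [0x00007f1c33b4d000]
'''

if __name__ == "__main__":
    # 0x1a2b stands in for the decimal thread ID that top -H would show in its PID column.
    print(find_thread_by_tid(SAMPLE_DUMP, 0x1A2B))
```

In practice the dump text would come from running jstack against the same <pid>, and tid would be the decimal ID of the busy thread from top.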

It seems the delay comes from the class org.apache.spark.MapOutputTracker in the Spark core code.

Is there any way to speed this up?