Spark stalling during shuffle (maybe a memory issue)

jonathan.keebler
Has anyone observed Spark worker threads stalling during a shuffle phase with the following message (one per worker host) being echoed to the terminal on the driver thread?

INFO spark.MapOutputTrackerActor: Asked to send map output locations for shuffle 0 to [worker host]...


At this point Spark-related activity on the Hadoop cluster halts completely: there is no network activity, disk I/O, or CPU activity, individual tasks stop completing, and the job just sits in this state. At that point we kill the job, and a restart of the Spark service is required.

With identical jobs we were able to get past this halt point by increasing the heap memory available to the workers, but it's odd that we don't get an out-of-memory error, or any error at all. Simply adding memory isn't a very satisfying answer to what may be going on :)

We're running Spark 0.9.0 on CDH 5.0 in standalone mode.

Thanks for any help or ideas you may have!

Cheers,
Jonathan
Re: Spark stalling during shuffle (maybe a memory issue)

Aaron Davidson
This is very likely because the serialized map output locations buffer exceeds the Akka frame size. Please try setting "spark.akka.frameSize" (default 10 MB) to some higher number, like 64 or 128.

In the newest version of Spark, this would throw a better error, for what it's worth.
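A minimal sketch of applying that setting in a Spark 0.9-era job, assuming the job builds its own SparkConf (the app name and the value 128 are illustrative; the frame size is given in MB):

import org.apache.spark.{SparkConf, SparkContext}

// Raise the Akka frame size so that large serialized map output
// location payloads fit in a single message. The value is in MB.
val conf = new SparkConf()
  .setAppName("shuffle-heavy-job")      // illustrative name
  .set("spark.akka.frameSize", "128")   // default is 10
val sc = new SparkContext(conf)

The same property can also be passed as a JVM system property (-Dspark.akka.frameSize=128) when launching the driver and workers.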



Re: Spark stalling during shuffle (maybe a memory issue)

jonathan.keebler
Thanks for the response, Aaron!  We'll give it a try tomorrow.


Re: Spark stalling during shuffle (maybe a memory issue)

jonathan.keebler
In reply to this post by Aaron Davidson
So we upped the spark.akka.frameSize value to 128 MB and still observed the same behavior. It's happening not necessarily when data is being sent back to the driver, but when data is shuffled between nodes within the cluster, for example during a groupByKey.

Should we focus instead on tuning these parameters: spark.storage.memoryFraction and spark.shuffle.memoryFraction?
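For reference, a hedged sketch of what shifting those fractions might look like (the numbers are illustrative, not a recommendation; the defaults in this era were roughly 0.6 for storage and 0.3 for shuffle):

import org.apache.spark.SparkConf

// Give shuffle aggregation buffers a larger share of the executor
// heap at the expense of the RDD block cache. Both values are
// fractions of the heap and should sum to well under 1.0.
val conf = new SparkConf()
  .set("spark.storage.memoryFraction", "0.4")   // default ~0.6
  .set("spark.shuffle.memoryFraction", "0.5")   // default ~0.3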


Re: Spark stalling during shuffle (maybe a memory issue)

Andrew Ash

If the distribution of the keys in your groupByKey is skewed (some keys appear far more often than others), you should consider modifying your job to use reduceByKey instead wherever possible.
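To make the difference concrete, a hedged sketch with made-up data (the aggregation shown is a simple per-key count; sc is an existing SparkContext):

import org.apache.spark.SparkContext._   // pair-RDD functions in the 0.9 API

val pairs = sc.parallelize(Seq(("a", 1), ("a", 1), ("b", 1)))

// groupByKey ships every value for a key to a single reducer, so one
// hot key can pile all of its values onto one task:
val countsViaGroup = pairs.groupByKey().mapValues(_.sum)

// reduceByKey combines map-side before shuffling, so only one partial
// sum per key per map partition crosses the network:
val countsViaReduce = pairs.reduceByKey(_ + _)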

Re: Spark stalling during shuffle (maybe a memory issue)

jonathan.keebler
Thanks for the suggestion, Andrew. We have also implemented our solution using reduceByKey, but we observe the same behavior. For example, if we do the following:

map1
groupByKey
map2
saveAsTextFile

Then the stalling will occur during the map1+groupByKey execution.

If we do 

map1
reduceByKey
map2
saveAsTextFile

Then the reduceByKey finishes successfully, but the stalling will occur during the map2+saveAsTextFile execution.
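For concreteness, a sketch of the shape of the second pipeline (the paths, tab-separated parsing, and count aggregation are all placeholders, not details of our actual job):

import org.apache.spark.SparkContext._   // pair-RDD functions

sc.textFile("hdfs:///input")
  .map { line => val f = line.split("\t"); (f(0), 1L) }   // "map1": key the records
  .reduceByKey(_ + _)                                     // or groupByKey in the first variant
  .map { case (k, n) => k + "\t" + n }                    // "map2"
  .saveAsTextFile("hdfs:///output")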


Re: Spark stalling during shuffle (maybe a memory issue)

Aaron Davidson
So the current stalling is simply sitting there with no log output? Have you jstack'd an Executor to see where it may be hanging? Are you observing memory or disk pressure ("df" for space, "df -i" for inodes)?
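Something like the following on a worker host would show where the executor is stuck (jps is the standard JDK process lister; the PID and output path are placeholders):

jps -lm | grep -i executor            # find the executor JVM's PID
jstack <pid> > /tmp/executor.jstack   # dump its thread stacks
df -h                                 # disk space pressure
df -i                                 # inode pressure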


Re: Spark stalling during shuffle (maybe a memory issue)

bogdanbaraila
In reply to this post by jonathan.keebler
Hello Jonathan

Did you find a working solution for your issue? If so, could you please share it?

Thanks
Re: Spark stalling during shuffle (maybe a memory issue)

vinodep
We have a similar situation: we are trying to do matrix-vector multiplication using BlockMatrix, with a sparse matrix on the order of a million rows. In the final stage, which contains a reduceByKey and a collect, progress just stalls and we have to kill the job. We also observe that a huge amount of shuffle data has already been exchanged. Are those shuffles causing the stall, and is there anything we can try to get past this? Is the external shuffle service relevant here?
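For reference, a minimal sketch of the kind of multiplication being described, using the BlockMatrix API (the entries are tiny illustrative data, and the vector is represented as an n x 1 BlockMatrix):

import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}

// Sparse matrix and vector as coordinate entries (illustrative values).
val matEntries = sc.parallelize(Seq(MatrixEntry(0, 0, 1.0), MatrixEntry(1, 2, 3.0)))
val vecEntries = sc.parallelize(Seq(MatrixEntry(0, 0, 2.0), MatrixEntry(2, 0, 5.0)))

val mat = new CoordinateMatrix(matEntries).toBlockMatrix().cache()
val vec = new CoordinateMatrix(vecEntries).toBlockMatrix().cache()

// multiply pairs up blocks via a shuffle; with a million-row sparse
// matrix this shuffle is the expensive step before the final collect.
val product = mat.multiply(vec)
product.blocks.count()   // force evaluation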
Re: Spark stalling during shuffle (maybe a memory issue)

bogdanbaraila
The issue was fixed for me by allocating just one core per executor. If I give executors more than one core, the issue appears again. I haven't yet understood why this happens, but anyone hitting a similar issue can try this.
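For anyone who wants to try the same workaround, a hedged sketch of the configuration (these property names are from later Spark releases than the 0.9 setup at the top of the thread; the totals are illustrative):

import org.apache.spark.SparkConf

// One core per executor, with the total core count capped so the
// job keeps roughly the same overall parallelism.
val conf = new SparkConf()
  .set("spark.executor.cores", "1")
  .set("spark.cores.max", "16")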
Re: Spark stalling during shuffle (maybe a memory issue)

fpopic
Hmm, I hit the same problem.

Maybe your executor-memory setting should be divided by executor-cores, since all of an executor's concurrent tasks share the same heap? For example, an 8 GB executor running 4 tasks at once leaves roughly 2 GB per task.