spark exception

Amit Sharma
Hi All, sometimes I get this error in the Spark logs. I notice a few executors are shown as dead in the executor tab when this error occurs, although my job still succeeds. Please help me find the root cause of this issue. I have 3 workers with 30 cores and 64 GB RAM each. My job uses 3 cores per executor and 4 GB RAM per executor, for a total of 63 cores.

Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages

Re: spark exception

Russell Spitzer
Usually this is just a sign that one of the executors quit unexpectedly, which explains the dead executors you see in the UI. The next step is usually to go look at those executor logs and see if there's any reason for the termination. If you see an abrupt truncation of the log, that usually means the out-of-memory (OOM) killer shut down the process.
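A quick way to confirm the OOM-killer theory is to check the kernel log on the worker that hosted the dead executor. A minimal sketch, assuming shell access to a Linux worker node (the exact log wording varies slightly by kernel version):

```shell
# Search the kernel ring buffer for OOM-killer activity
# (typical Linux wording: "Out of memory: Killed process ...").
dmesg -T | grep -iE "out of memory|killed process"

# On systemd hosts, the kernel journal works too:
# journalctl -k | grep -iE "out of memory|oom"
```

If a `java` process ID from one of your executors shows up there, the OS killed it, and the truncated executor log is a symptom rather than the cause.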

At that point, even though you set the executor RAM to a very high level, the operating system was unable to service a malloc call when it mattered. This means you probably need to run with a smaller heap size, because there wasn't enough working RAM on the machine to back the heap you requested.
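One way to act on that advice is to lower the per-executor memory request at submit time while keeping the same core layout. A minimal sketch; the master URL, class name, and JAR are placeholders, not the original poster's actual job:

```shell
# Hypothetical submit command for a standalone cluster.
# Keeps the 3-cores-per-executor layout and 63 total cores from the
# original job, but trims the executor heap from 4g to 3g so each
# worker's OS has more free working memory.
spark-submit \
  --master spark://master-host:7077 \
  --class com.example.MyJob \
  --executor-cores 3 \
  --total-executor-cores 63 \
  --executor-memory 3g \
  my-job.jar

# On YARN or Kubernetes, you can instead reserve explicit off-heap headroom:
#   --conf spark.executor.memoryOverhead=1g
```

With 7 executors per worker, dropping the heap from 4g to 3g frees roughly 7 GB per 64 GB worker for the OS and off-heap allocations.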

If the log ends with some other kind of exception, then you need to look into why that occurred.

On Fri, Jul 24, 2020, 7:42 AM Amit Sharma <[hidden email]> wrote:
> Hi All, sometimes i get this error in spark logs. I notice few executors are shown as dead in the executor tab during this error. Although my job get success. Please help me out the root cause of this issue. I have 3 workers with 30 cores each and 64 GB RAM each. My job uses 3 cores per executor and uses a total of 63 cores and 4GB RAM per executor.
>
> Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages