Garbage collection issue

Amit Sharma
Hi all, I am running the same batch job on two separate Spark clusters. On one of the clusters the Spark UI shows a GC warning under the Executors tab. Garbage collection is taking longer there, around 20%, while on the other cluster it is under 10%. I am using the same configuration in my spark-submit on both clusters, and I am using G1GC.

Please let me know what could be the reason for GC slowness.


Thanks
Amit

Re: Garbage collection issue

Amit Sharma
Please help on this.


Thanks
Amit

On Fri, Jul 17, 2020 at 2:34 PM Amit Sharma <[hidden email]> wrote:
Hi all, I am running the same batch job on two separate Spark clusters. On one of the clusters the Spark UI shows a GC warning under the Executors tab. Garbage collection is taking longer there, around 20%, while on the other cluster it is under 10%. I am using the same configuration in my spark-submit on both clusters, and I am using G1GC.

Please let me know what could be the reason for GC slowness.


Thanks
Amit

Re: Garbage collection issue

Jeff Evans
What is your heap size, and which JVM vendor and version are you running? Generally, G1 only outperforms CMS on large heaps (e.g. 31 GB or larger).
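
One way to collect those details is to run a small program with the same JVM binary and options as the executors on each cluster and compare the output. The following is only a sketch using the standard `java.lang.management` API; the class name `JvmInfo` is made up for illustration, and it reports on whichever JVM runs it, not on a remote executor.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints the facts asked about above for the JVM it runs in.
public class JvmInfo {
    /** Names of the garbage collectors the running JVM actually selected. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("vendor:         " + System.getProperty("java.vendor"));
        System.out.println("version:        " + System.getProperty("java.version"));
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("collectors:     " + collectorNames());
    }
}
```

If the two clusters print different vendors, versions, heap sizes, or collector names, that difference is a reasonable first suspect.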

On Mon, Jul 20, 2020 at 1:22 PM Amit Sharma <[hidden email]> wrote:
Please help on this.


Thanks
Amit


Re: Garbage collection issue

Russell Spitzer
In reply to this post by Amit Sharma
High GC is relatively hard to debug in general, but I can give you a few pointers. It basically means that the time spent cleaning up unused objects is high, which usually means memory is being used and thrown away rapidly. It can also mean that GC is ineffective and is being run many times in an attempt to find things to free up. Since each run is not very effective (because many things are still in use and cannot be thrown out), it has to run more often.
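
To tell those two cases apart, it can help to look at both how much time GC takes and how many times it runs. Here is a rough sketch using the standard `GarbageCollectorMXBean` counters; the class and method names are made up for illustration. Note that it divides by JVM uptime, which is not the same denominator as the task time shown in the Spark UI, so the percentages will not match exactly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: summarizes GC activity of the JVM it runs in.
public class GcShare {
    /** Fraction of elapsed time spent in GC; 0 when elapsed is not positive. */
    static double gcFraction(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / elapsedMillis;
    }

    /** Accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Number of collections over all collectors. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when the collector does not report it
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcMs = totalGcMillis();
        System.out.printf("GC runs: %d, GC time: %d ms, share of uptime: %.1f%%%n",
                totalGcCount(), gcMs, 100.0 * gcFraction(gcMs, uptime));
    }
}
```

Many runs with a small time per run points toward rapid allocation; fewer but long runs that free little points toward a heap that is mostly full of live objects.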

So usually the easiest thing to do, if possible, is to increase the heap size and hope that you are just seeing GC pressure because you need more free memory than the JVM has. As a first step, I would recommend increasing the executor heap.

The longer and harder thing to do is to see exactly where object allocation is taking place and attempt to minimize it. This requires walking through your code, looking for long-lived allocations, and minimizing them if possible.
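
As a generic illustration of what minimizing allocation can look like (this is a made-up hot loop, not taken from the job in question), compare a sum kept in a boxed `Long` with one kept in a primitive `long`. Both return the same value, but the boxed version can create a new object on almost every iteration, which is exactly the kind of garbage the collector then has to clean up.

```java
// Hypothetical example: same result, very different allocation behavior.
public class AllocationSketch {
    /** Allocation-heavy: each += unboxes, adds, and boxes the result into a new Long. */
    static Long sumBoxed(long[] values) {
        Long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    /** Allocation-free loop: the running total stays in a primitive. */
    static long sumPrimitive(long[] values) {
        long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] values = {1L, 2L, 3L, 1000L};
        System.out.println(sumBoxed(values));
        System.out.println(sumPrimitive(values));
    }
}
```

The JIT compiler may remove some of the boxing on its own, so a profiler or allocation sampling is still the way to confirm where the garbage actually comes from.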
