Facing memory leak with Pyarrow enabled and toPandas()


Divyanshu Kumar
Hi, I am facing this issue while using toPandas() and Pyarrow simultaneously. 

Re: Facing memory leak with Pyarrow enabled and toPandas()

Gourav Sengupta
Hi,
Can you please mention the Spark version, share the code that sets up the Spark session, and describe the operation you are running? It would also help to know how much memory your system has and how many executors you are using per system.
In general, I have faced issues when doing a group by or running aggregates over datasets larger than 2 GB on a system with limited RAM.
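
For reference, the details requested above could be collected with something like the sketch below. It is only an illustration: the memory sizes, executor count, and app name are placeholder assumptions rather than recommendations, and the helper names build_session and describe are hypothetical. The Arrow key shown is the Spark 3.x name; Spark 2.x uses spark.sql.execution.arrow.enabled instead.

```python
# Settings worth reporting when asking about Arrow-backed toPandas().
# All values here are illustrative assumptions, not tuning advice.
SESSION_CONF = {
    "spark.sql.execution.arrow.pyspark.enabled": "true",  # Spark 3.x key name
    "spark.driver.memory": "4g",       # toPandas() collects rows on the driver
    "spark.executor.memory": "4g",
    "spark.executor.instances": "2",
}


def build_session(app_name="arrow-topandas-report"):
    """Create a SparkSession with SESSION_CONF applied (hypothetical helper)."""
    # Imported lazily so the rest of the module works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in SESSION_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def describe(version, conf_get):
    """Format the Spark version and the settings above as plain text.

    conf_get is any callable taking (key, default), such as spark.conf.get.
    """
    lines = [f"Spark version: {version}"]
    for key in SESSION_CONF:
        lines.append(f"{key} = {conf_get(key, 'not set')}")
    return "\n".join(lines)


# Usage (needs pyspark installed):
#   spark = build_session()
#   print(describe(spark.version, spark.conf.get))
#   spark.stop()
```

Pasting the output of describe together with the exact operation that precedes toPandas() would cover most of the questions above. The driver memory setting matters most here, since toPandas() brings the whole result to the driver.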

Regards 
Gourav 

On Thu, 21 Jan 2021, 12:24 Divyanshu Kumar, <[hidden email]> wrote:
Hi, I am facing this issue while using toPandas() and Pyarrow simultaneously.