pySpark driver memory limit


pySpark driver memory limit

Nicolas Paris
hi there


Can anyone clarify how driver memory works in pySpark?
According to [1], spark.driver.memory limits the JVM plus the Python memory.

For example, with
spark.driver.memory=2G
does that mean the user cannot use more than 2G in total, regardless of
the Python code and the RDD operations they run?
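For reference, a minimal sketch of how that setting is usually supplied (the
2g value, app name, and script name below are only examples); in client mode
it has to be given before the driver JVM starts, e.g. via spark-submit or
spark-defaults.conf rather than from inside the script:

    # Sketch: limiting the driver JVM heap to 2g.
    # Normally supplied at launch, e.g.:
    #   spark-submit --driver-memory 2g my_job.py
    # or in conf/spark-defaults.conf:
    #   spark.driver.memory  2g
    #
    # Setting it from the SparkSession builder only takes effect
    # if the driver JVM has not been started yet:
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("driver-memory-example")      # hypothetical app name
        .config("spark.driver.memory", "2g")   # JVM heap of the driver only
        .getOrCreate()
    )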

Thanks,

[1]: http://apache-spark-user-list.1001560.n3.nabble.com/spark-is-running-extremely-slow-with-larger-data-set-like-2G-td17152.html





Re: pySpark driver memory limit

Nicolas Paris
On 06 Nov 2017 at 19:56, Nicolas Paris wrote:

> Can anyone clarify how driver memory works in pySpark?
> According to [1], spark.driver.memory limits the JVM plus the Python memory.
> [...]


After some testing, it turns out the Python driver process is not limited by
spark.driver.memory; there is no limit at all on its memory use. It could,
however, be constrained externally, for instance with cgroups.
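For what it is worth, one way to put a cap on the Python driver process
itself, independent of Spark (a minimal sketch, Unix-only, using the
standard-library resource module; the 2 GB figure is just an example):

    import resource

    # Sketch: cap the virtual address space of the Python driver process.
    # This only constrains the Python side; the JVM heap is still governed
    # by spark.driver.memory. Unix-only.
    cap = 2 * 1024 ** 3                                 # 2 GB, example value
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    # Allocations beyond the cap now fail with MemoryError instead of
    # growing without bound.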



Re: pySpark driver memory limit

sebastian.piu

This is my experience too, at least when running under YARN.


On Thu, 9 Nov 2017, 07:11 Nicolas Paris, <[hidden email]> wrote:

> After some testing, it turns out the Python driver process is not limited by
> spark.driver.memory; there is no limit at all on its memory use. It could,
> however, be constrained externally, for instance with cgroups.
