driver memory


mrm
Hi,

How do I increase the driver memory? These are my configs right now:

# Reduce log verbosity from INFO to ERROR
sed 's/INFO/ERROR/' spark/conf/log4j.properties.template > ./ephemeral-hdfs/conf/log4j.properties
sed 's/INFO/ERROR/' spark/conf/log4j.properties.template > spark/conf/log4j.properties
# Environment variables and Spark properties
export SPARK_WORKER_MEMORY="30g" # Total memory per worker node, independent of application (default: total memory on worker node minus 1 GB)
# SPARK_WORKER_CORES = total number of cores an application can use on a machine
# SPARK_WORKER_INSTANCES = how many workers per machine? Limit the number of cores per worker if more than one worker on a machine
export SPARK_JAVA_OPTS=" -Dspark.executor.memory=30g -Dspark.speculation.quantile=0.5 -Dspark.speculation=true -Dspark.cores.max=80 -Dspark.akka.frameSize=1000 -Dspark.rdd.compress=true" # spark.executor.memory = heap size used by each Spark executor
export SPARK_DAEMON_MEMORY="2g"

In the application UI, it says my driver has 295 MB of memory. I am trying to broadcast a variable that is about 0.15 GB (150 MB), and it is throwing OutOfMemory errors, so I am trying to see whether increasing the driver memory will fix this.
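
(For reference, one way to check what heap the driver JVM actually got, on the machine running the driver, is something like the following sketch; <driver-pid> is a placeholder for the driver's process id, not a real value from this thread:)

# Find the driver process (in client mode the driver runs as SparkSubmit),
# then print its maximum heap size. <driver-pid> is a placeholder.
jps | grep SparkSubmit
jinfo -flag MaxHeapSize <driver-pid>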

Thanks!

Re: driver memory

mrm
Hi,

I figured out my problem, so I wanted to share my findings. I was basically trying to broadcast an array with 4 million elements and a size of approximately 150 MB. Every time I tried to broadcast it, I got an OutOfMemory error. I fixed the problem by increasing the driver memory using:
export SPARK_MEM="2g"

Using SPARK_DAEMON_MEMORY or spark.executor.memory did not help in this case! I don't have a good understanding of all these settings, and I have the feeling many people are in the same situation.

Re: driver memory

Andrew Or-2
Hi Maria,

SPARK_MEM is actually deprecated because it was too general; the reason it worked is that SPARK_MEM applies to everything (drivers, executors, masters, workers, history servers...). In favor of more specific configs, we broke it down into SPARK_DRIVER_MEMORY, SPARK_EXECUTOR_MEMORY, and other environment variables and configs. Note that while "spark.executor.memory" is an equivalent config, "spark.driver.memory" is only used for YARN.
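
In conf/spark-env.sh, that breakdown looks something like the following sketch (the 2g / 30g values just mirror the ones earlier in this thread, not recommendations):

# Sketch of conf/spark-env.sh using the per-role variables named above;
# the sizes are illustrative only.
export SPARK_DRIVER_MEMORY="2g"     # heap for the driver JVM
export SPARK_EXECUTOR_MEMORY="30g"  # heap for each executor JVM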

If you are using Spark 1.0+, the recommended way of specifying driver memory is through the "--driver-memory" command line argument of spark-submit. The equivalent also holds for executor memory (i.e. "--executor-memory"). That way you don't have to wrangle with the millions of overlapping configs / environment variables for all the deploy modes.
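
Concretely, a submission would look something like the following sketch (the class name, master URL, and jar file are placeholders, not anything from this thread):

# Sketch of a Spark 1.0+ spark-submit invocation; --class, --master,
# and the jar name below are hypothetical placeholders.
spark-submit \
  --class com.example.MyApp \
  --master spark://master-host:7077 \
  --driver-memory 2g \
  --executor-memory 30g \
  my-app.jar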

-Andrew




Re: driver memory

tgbaggio
Hi,

I am sorry for disturbing you, and thank you for your explanation. However, I find that "spark.driver.memory" is also used for standalone mode (I set it in spark/conf/spark-defaults.conf).
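
For reference, the relevant lines in spark/conf/spark-defaults.conf look something like this sketch (the sizes are examples only, taken from this thread):

# Sketch of spark/conf/spark-defaults.conf entries; values are illustrative.
spark.driver.memory    2g
spark.executor.memory  30g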

Cheers
Gen
