Memory used in Spark-0.9.0-incubating

王晓雨
Environment:
Spark: 0.9.0-incubating
Hadoop: 2.3.0

I run a Spark job on YARN and see the following in the NodeManager log:
2014-09-25 17:43:34,141 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 549 for container-id container_1411635522254_0001_01_000005: 4.5 GB of 5 GB physical memory used; 5.0 GB of 10.5 GB virtual memory used
2014-09-25 17:43:37,171 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 549 for container-id container_1411635522254_0001_01_000005: 4.5 GB of 5 GB physical memory used; 5.0 GB of 10.5 GB virtual memory used
2014-09-25 17:43:40,210 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 549 for container-id container_1411635522254_0001_01_000005: 4.5 GB of 5 GB physical memory used; 5.0 GB of 10.5 GB virtual memory used
2014-09-25 17:43:43,239 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 549 for container-id container_1411635522254_0001_01_000005: 4.5 GB of 5 GB physical memory used; 5.0 GB of 10.5 GB virtual memory used
2014-09-25 17:43:46,269 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 549 for container-id container_1411635522254_0001_01_000005: 4.5 GB of 5 GB physical memory used; 5.0 GB of 10.5 GB virtual memory used

My job parameters are:
--num-workers 4 --master-memory 2g --worker-memory 4g --worker-cores 4
As I understand it, "--worker-memory 4g" means that 4 GB is the maximum memory for a container.
So why does the log say "4.5 GB of 5 GB physical memory used"?
And where is the 5 GB maximum memory for the container configured?

-- 

WangXiaoyu
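
To compare the limit reported by the NodeManager with the 4g that was requested, the monitor lines quoted above can be parsed. This is only a minimal sketch and not part of the original post: the name parse_usage and the regular expression are assumptions derived solely from the log format shown above, and the pattern handles only values reported in GB.

```python
import re

# Assumed pattern, derived only from the NodeManager lines quoted above.
USAGE_RE = re.compile(
    r"container-id (?P<container>\S+): "
    r"(?P<pmem_used>[\d.]+) GB of (?P<pmem_limit>[\d.]+) GB physical memory used; "
    r"(?P<vmem_used>[\d.]+) GB of (?P<vmem_limit>[\d.]+) GB virtual memory used"
)


def parse_usage(line):
    """Return the container id and memory figures (in GB) from one monitor line, or None."""
    match = USAGE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    container = fields.pop("container")
    # Every remaining field is a number of gigabytes.
    result = {name + "_gb": float(value) for name, value in fields.items()}
    result["container"] = container
    return result


LINE = (
    "2014-09-25 17:43:34,141 INFO "
    "org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: "
    "Memory usage of ProcessTree 549 for container-id "
    "container_1411635522254_0001_01_000005: 4.5 GB of 5 GB physical memory used; "
    "5.0 GB of 10.5 GB virtual memory used"
)

usage = parse_usage(LINE)
requested_gb = 4.0  # from --worker-memory 4g
# Difference between the enforced physical limit and the requested worker memory.
print(usage["container"], usage["pmem_limit_gb"] - requested_gb)
```

For the line above, the physical limit is 1 GB larger than the requested worker memory, which is the gap the question is about.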

Re: Memory used in Spark-0.9.0-incubating

王晓雨
My yarn-site.xml config:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>16384</value>
</property>


Re: Memory used in Spark-0.9.0-incubating

Yi Tian
You should check the resource manager log from the time you submitted this job to YARN.

It records how much memory your Spark application actually requested from the resource manager for each container.

Are you using the fair scheduler?

The fair scheduler has a configuration parameter, “yarn.scheduler.increment-allocation-mb”, whose default is 1024.

It means that if you request 4097 MB of memory for a container, the resource manager will create a container with 5120 MB of memory.
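
The rounding described above can be sketched as follows. This is an illustration only, not the scheduler's actual code: the function name round_up_to_increment is hypothetical, and the sketch ignores any other limits the scheduler may apply (such as minimum and maximum allocation).

```python
def round_up_to_increment(requested_mb, increment_mb=1024):
    """Round a memory request (in MB) up to the next multiple of increment_mb.

    increment_mb stands in for yarn.scheduler.increment-allocation-mb.
    """
    if requested_mb <= 0 or increment_mb <= 0:
        raise ValueError("requested_mb and increment_mb must be positive")
    # Integer ceiling division, then scale back to MB.
    return ((requested_mb + increment_mb - 1) // increment_mb) * increment_mb


for requested in (4096, 4097):
    print(requested, "->", round_up_to_increment(requested))
```

Under this model a request of exactly 4096 MB stays at 4096 MB, while anything from 4097 MB to 5120 MB becomes a 5120 MB container.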

But I can’t figure out where the 5 GB comes from.

Maybe some code confuses 1024 with 1000?

Best Regards,

Yi Tian

Re: Memory used in Spark-0.9.0-incubating

王晓雨
Thanks Yi Tian!

Yes, I use the fair scheduler.
In the resource manager log, I found the container's launch script:
/home/export/Data/hadoop/tmp/nm-local-dir/usercache/hpc/appcache/application_1411693809133_0002/container_1411693809133_0002_01_000002/launch_container.sh
It ends with:
exec /bin/bash -c "$JAVA_HOME/bin/java -server  -XX:OnOutOfMemoryError='kill %p' -Xms4096m -Xmx4096m  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Djava.io.tmpdir=$PWD/tmp  org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://spark@node99:6177/user/CoarseGrainedScheduler 2 node99 4 1> /home/export/Logs/yarn/application_1411693809133_0002/container_1411693809133_0002_01_000002/stdout 2> /home/export/Logs/yarn/application_1411693809133_0002/container_1411693809133_0002_01_000002/stderr"

So the JVM in the container is started with a maximum heap of 4096 MB (-Xmx4096m).

I also looked at the source of ContainerImpl.java. The "5 GB" printed by the monitor comes from:
long pmemBytes = container.getResource().getMemory() * 1024 * 1024L;

The monitor formats that byte value for the log, so it can print "5 GB" only when container.getResource().getMemory() returns 5120.
I don't know where the extra 1024 MB comes from!
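
The arithmetic behind the two limits in the log can be checked with a short sketch. It is not taken from the Hadoop source: the function names are hypothetical, and the 2.1 factor is an assumption that yarn.nodemanager.vmem-pmem-ratio is left at its default, which is not stated anywhere in this thread.

```python
def container_limits(memory_mb, vmem_pmem_ratio=2.1):
    """Return (physical, virtual) memory limits in bytes for a container of memory_mb.

    vmem_pmem_ratio is assumed to be the default of yarn.nodemanager.vmem-pmem-ratio.
    """
    pmem_bytes = memory_mb * 1024 * 1024  # same conversion as the quoted Java line
    vmem_bytes = int(pmem_bytes * vmem_pmem_ratio)
    return pmem_bytes, vmem_bytes


def to_gb(num_bytes):
    """Convert bytes to gigabytes (1 GB = 1024**3 bytes), rounded to one decimal."""
    return round(num_bytes / 1024 ** 3, 1)


for memory_mb in (4096, 5120):
    pmem, vmem = container_limits(memory_mb)
    print(memory_mb, "MB ->", to_gb(pmem), "GB physical,", to_gb(vmem), "GB virtual")
```

Only a 5120 MB container yields the 5 GB physical and 10.5 GB virtual limits seen in the NodeManager log; a 4096 MB container would give 4 GB and 8.4 GB instead.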

