Spark on Yarn - A small issue !


Sai Prasanna
Hi All,

I wanted to launch Spark on YARN interactively, in yarn-client mode.

With the default settings in yarn-site.xml and spark-env.sh, I followed the given link.

I get the correct value of pi when I run the example without launching the shell.

When I launch the shell with the following command,
SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.3.0.jar \
SPARK_YARN_APP_JAR=examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
MASTER=yarn-client ./spark-shell
and try to create RDDs and run an action on them, nothing happens. After some time, the tasks fail.

Spark log file:

519095 14/05/12 13:30:40 INFO YarnClientClusterScheduler: YarnClientClusterScheduler.postStartHook done
519096 14/05/12 13:30:40 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager s1:38355 with 324.4 MB RAM
519097 14/05/12 13:31:38 INFO MemoryStore: ensureFreeSpace(202584) called with curMem=0, maxMem=340147568
519098 14/05/12 13:31:38 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 197.8 KB, free 324.2 MB)
519099 14/05/12 13:31:49 INFO FileInputFormat: Total input paths to process : 1
519100 14/05/12 13:31:49 INFO NetworkTopology: Adding a new node: /default-rack/192.168.1.100:50010
519101 14/05/12 13:31:49 INFO SparkContext: Starting job: top at <console>:15
519102 14/05/12 13:31:49 INFO DAGScheduler: Got job 0 (top at <console>:15) with 4 output partitions (allowLocal=false)
519103 14/05/12 13:31:49 INFO DAGScheduler: Final stage: Stage 0 (top at <console>:15)
519104 14/05/12 13:31:49 INFO DAGScheduler: Parents of final stage: List()
519105 14/05/12 13:31:49 INFO DAGScheduler: Missing parents: List()
519106 14/05/12 13:31:49 INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[2] at top at <console>:15), which has no missing parents
519107 14/05/12 13:31:49 INFO DAGScheduler: Submitting 4 missing tasks from Stage 0 (MapPartitionsRDD[2] at top at <console>:15)
519108 14/05/12 13:31:49 INFO YarnClientClusterScheduler: Adding task set 0.0 with 4 tasks
519109 14/05/12 13:31:49 INFO RackResolver: Resolved s1 to /default-rack
519110 14/05/12 13:31:49 INFO ClusterTaskSetManager: Starting task 0.0:3 as TID 0 on executor 1: s1 (PROCESS_LOCAL)
519111 14/05/12 13:31:49 INFO ClusterTaskSetManager: Serialized task 0.0:3 as 1811 bytes in 4 ms
519112 14/05/12 13:31:49 INFO ClusterTaskSetManager: Starting task 0.0:0 as TID 1 on executor 1: s1 (NODE_LOCAL)
519113 14/05/12 13:31:49 INFO ClusterTaskSetManager: Serialized task 0.0:0 as 1811 bytes in 1 ms
519114 14/05/12 13:32:18 INFO YarnClientSchedulerBackend: Executor 1 disconnected, so removing it
519115 14/05/12 13:32:18 ERROR YarnClientClusterScheduler: Lost executor 1 on s1: remote Akka client shutdown
519116 14/05/12 13:32:18 INFO ClusterTaskSetManager: Re-queueing tasks for 1 from TaskSet 0.0
519117 14/05/12 13:32:18 WARN ClusterTaskSetManager: Lost TID 1 (task 0.0:0)
519118 14/05/12 13:32:18 WARN ClusterTaskSetManager: Lost TID 0 (task 0.0:3)
519119 14/05/12 13:32:18 INFO DAGScheduler: Executor lost: 1 (epoch 0)
519120 14/05/12 13:32:18 INFO BlockManagerMasterActor: Trying to remove executor 1 from BlockManagerMaster.
519121 14/05/12 13:32:18 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor



Do I need to set any other environment variables specifically for Spark on YARN? What could be the issue?

Can anyone please help me with this?

Thanks in advance!





Re: Spark on Yarn - A small issue !

Tom Graves
You need to look at the YARN log files. Generally this can be done with "yarn logs -applicationId <your_app_id>", but that only works if you have log aggregation enabled. You should be able to see at least the application master logs through the YARN ResourceManager web UI. I would try that first.
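
A minimal sketch of that first step, assuming the yarn CLI is on the PATH and log aggregation is enabled. The helper name fetch_yarn_logs and the application ID in the comment are placeholders for illustration, not something from the original report:

```shell
#!/bin/sh
# Sketch: fetch the aggregated container logs for one YARN application.
# fetch_yarn_logs is a hypothetical helper; the application ID is whatever
# `yarn application -list` or the ResourceManager web UI reports for the job.
fetch_yarn_logs() {
    app_id="$1"
    # Fail early with a readable message if the yarn CLI is not installed.
    if ! command -v yarn >/dev/null 2>&1; then
        echo "yarn CLI not found on PATH" >&2
        return 1
    fi
    yarn logs -applicationId "$app_id"
}

# Example (placeholder ID), filtering for likely failure lines:
#   fetch_yarn_logs application_0000000000000_0001 | grep -i -E 'error|exception'
```

The executor-side stderr in that output is usually where the reason for a "remote Akka client shutdown" on the driver would show up, if the executor logged one before exiting.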

If that doesn't work, you can turn on debugging in the NodeManager:

To review per-container launch environment, increase yarn.nodemanager.delete.debug-delay-sec to a large value (e.g. 36000), and then access the application cache through yarn.nodemanager.local-dirs on the nodes on which containers are launched. This directory contains the launch script, jars, and all environment variables used for launching each container. This process is useful for debugging classpath problems in particular. (Note that enabling this requires admin privileges on cluster settings and a restart of all node managers. Thus, this is not applicable to hosted clusters).
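
Once the delay is in effect, the retained launch scripts can be located with something like the sketch below. It assumes the usual NodeManager layout of `<local-dir>/usercache/<user>/appcache/<application_id>/<container_id>/launch_container.sh`; that layout, the helper name find_launch_scripts, and the example arguments are assumptions for illustration, so check them against your own nodes:

```shell
#!/bin/sh
# Sketch: list the per-container launch scripts kept on a node while
# yarn.nodemanager.delete.debug-delay-sec is set to a large value.
# Assumed layout (verify on your cluster):
#   <local-dir>/usercache/<user>/appcache/<application_id>/<container_id>/launch_container.sh
find_launch_scripts() {
    local_dir="$1"   # one entry of yarn.nodemanager.local-dirs
    app_id="$2"      # application ID of the failing job (placeholder in the example)
    find "$local_dir/usercache" -type f -name launch_container.sh \
        -path "*/appcache/$app_id/*" 2>/dev/null
}

# Example (placeholder path and ID):
#   find_launch_scripts /path/to/nm-local-dir application_0000000000000_0001
```

Reading the script that matches the lost executor's container shows the classpath and environment variables the container was actually started with.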


Tom


In reply to Sai Prasanna's message of Monday, May 12, 2014, 9:38 AM.


Re: Spark on Yarn - A small issue !

martinxu
This post has NOT been accepted by the mailing list yet.
In reply to this post by Sai Prasanna
I encountered the same issue when running Shark 0.9 on YARN: about 1/3 of the executors are lost. Is there any way to fix this?