spark-ec2 - HDFS doesn't start on AWS EC2 cluster

Jan Warchoł
Hi all,

I'm trying to launch an EC2 cluster using the spark-ec2 script, and it seems that the script fails to configure HDFS properly. What's most puzzling is that it worked perfectly on Sunday. Here's the command I'm using:

./spark-1.1.0-bin-hadoop2.4/ec2/spark-ec2 \
-k xxxxxxxxxxxxxx \
-i xxxxxxxxxxxxxx \
-s 5 --instance-type c3.xlarge --spot-price=0.15 \
--spark-version=1.0.0 --region us-west-2 \
launch xxxxxxxxxxxxxx

When I used this command on Sunday (the only difference was the number of slaves), it ran for a few minutes without any intervention and created a cluster with a working HDFS set up across all nodes. Today the results are quite different:

First, I can see an error about JAVA_HOME in the script output when launching ephemeral HDFS. Here's the relevant part of the output:

RSYNC'ing /root/ephemeral-hdfs/conf to slaves...
xxxxxxxxxxxxxx
xxxxxxxxxxxxxx
Formatting ephemeral HDFS namenode...
14/10/08 09:48:09 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ip-xxxxxxxxxxxxxx
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
14/10/08 09:48:09 INFO util.GSet: VM type       = 64-bit
14/10/08 09:48:09 INFO util.GSet: 2% max memory = 17.78 MB
14/10/08 09:48:09 INFO util.GSet: capacity      = 2^21 = 2097152 entries
14/10/08 09:48:09 INFO util.GSet: recommended=2097152, actual=2097152
14/10/08 09:48:09 INFO namenode.FSNamesystem: fsOwner=root
14/10/08 09:48:09 INFO namenode.FSNamesystem: supergroup=supergroup
14/10/08 09:48:09 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/10/08 09:48:09 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/10/08 09:48:09 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/10/08 09:48:09 INFO namenode.NameNode: Caching file names occuring more than 10 times 
14/10/08 09:48:09 INFO common.Storage: Image file of size 110 saved in 0 seconds.
14/10/08 09:48:09 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
14/10/08 09:48:09 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-xxxxxxxxxxxxxx
************************************************************/
Starting ephemeral HDFS...
./ephemeral-hdfs/setup.sh: line 31: /root/ephemeral-hdfs/sbin/start-dfs.sh: No such file or directory
starting namenode, logging to /root/ephemeral-hdfs/libexec/../logs/hadoop-root-namenode-.out
localhost: starting datanode, logging to /root/ephemeral-hdfs/libexec/../logs/hadoop-root-datanode-ip-xxxxxxxxxxxxxx.out
localhost: Error: JAVA_HOME is not set.
localhost: starting secondarynamenode, logging to /root/ephemeral-hdfs/libexec/../logs/hadoop-root-secondarynamenode-ip-xxxxxxxxxxxxxx.out
localhost: Error: JAVA_HOME is not set.

(Unfortunately, I don't have the log from Sunday to compare.)
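
One detail in the output above may help narrow this down: the NameNode reports version 1.0.4, yet setup.sh looks for sbin/start-dfs.sh. Hadoop 1.x ships its start scripts under bin/, while Hadoop 2.x moved them to sbin/, so the setup script and the installed Hadoop may not match. Below is a minimal sketch for checking this on the master. It assumes the /root/ephemeral-hdfs path from the log and that JAVA_HOME is expected to be exported in conf/hadoop-env.sh; the function name check_hdfs_layout is made up for illustration.

```shell
# Sketch: report which start-dfs.sh layout is present and whether
# conf/hadoop-env.sh exports JAVA_HOME. Returns non-zero if it does not.
# The default directory is the one shown in the launch output; pass another
# directory as the first argument if yours differs.
check_hdfs_layout() {
    hdfs_dir="${1:-/root/ephemeral-hdfs}"
    problems=0

    # Hadoop 1.x keeps start-dfs.sh in bin/, Hadoop 2.x in sbin/.
    for script in sbin/start-dfs.sh bin/start-dfs.sh; do
        if [ -f "$hdfs_dir/$script" ]; then
            echo "found:   $hdfs_dir/$script"
        else
            echo "missing: $hdfs_dir/$script"
        fi
    done

    # The daemons are started over ssh, so JAVA_HOME from an interactive
    # shell may not reach them; look for an uncommented export in the
    # env file instead (assumption: that is where it should be set).
    env_file="$hdfs_dir/conf/hadoop-env.sh"
    if [ -f "$env_file" ] &&
        grep -Eq '^[[:space:]]*export[[:space:]]+JAVA_HOME=' "$env_file"; then
        echo "JAVA_HOME is exported in $env_file"
    else
        echo "JAVA_HOME is not exported in $env_file"
        problems=1
    fi
    return "$problems"
}
```

Running `check_hdfs_layout` with no arguments on the master inspects /root/ephemeral-hdfs; `check_hdfs_layout /root/persistent-hdfs` does the same for the persistent install.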

Second, right after that I'm asked about formatting persistent HDFS:

Setting up persistent-hdfs
~/spark-ec2/persistent-hdfs ~/spark-ec2
Pseudo-terminal will not be allocated because stdin is not a terminal.
Pseudo-terminal will not be allocated because stdin is not a terminal.
RSYNC'ing /root/persistent-hdfs/conf to slaves...
Formatting persistent HDFS namenode...
14/10/08 10:11:05 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ip-172-31-5-156/172.31.5.156
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N)

I wasn't asked this question at all when I ran this script earlier!
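
Both format runs above name the same directory, /tmp/hadoop-root/dfs/name, which is where Hadoop 1.x puts the NameNode image for the root user when dfs.name.dir is left at its default. If both excerpts come from the same launch, that may explain the prompt: the ephemeral format had just created that directory. Below is a small sketch that pulls the format targets out of a saved copy of the launch output so they can be compared. The function name format_targets and the file name launch.log are illustrative assumptions.

```shell
# Sketch: read spark-ec2 launch output on stdin and print one line per
# NameNode format event, tagged with how the directory was reported.
format_targets() {
    sed -n \
        -e 's/^.*Storage directory \(.*\) has been successfully formatted\.$/formatted \1/p' \
        -e 's/^Re-format filesystem in \(.*\) ? (Y or N).*$/prompted \1/p'
}
```

For example, `format_targets < launch.log` prints a `formatted` line for each completed format and a `prompted` line for each re-format question. The same path appearing under both tags suggests the two HDFS installs are sharing one name directory.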

Regardless of what I answer, the end result is that HDFS in the cluster is not working correctly: instead of being set up across all nodes, it seems to point to a local directory on the master node. For example, when I ssh to the cluster master and run `/root/ephemeral-hdfs/bin/hadoop fs -du .` I should see paths like

hdfs://namenode-ip:9000/folder-on-hdfs
hdfs://namenode-ip:9000/file-on-hdfs
etc.

but I see

file:/root/....

and it seems that the entries listed there are identical to the contents of the /root directory on the local filesystem.
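
In Hadoop 1.x the default filesystem is taken from fs.default.name in core-site.xml and falls back to file:/// when that property is not set, so file: paths here are consistent with the configuration not having been applied. Below is a minimal sketch for summarising which schemes a listing contains. The helper name du_schemes is hypothetical, and it assumes the path is the last whitespace-separated field of each output line.

```shell
# Sketch: read `hadoop fs -du` style output on stdin and print the
# distinct URI schemes found in the last column, one per line.
# Lines whose last field has no "scheme:" prefix (for example the
# "Found N items" header) are skipped.
du_schemes() {
    awk '{
        path = $NF
        if (match(path, /^[a-z][a-z0-9+.-]*:/))
            print substr(path, 1, RLENGTH - 1)
    }' | sort -u
}
```

On the master this could be used as `/root/ephemeral-hdfs/bin/hadoop fs -du . | du_schemes`. An output of `hdfs` would indicate the listing comes from the NameNode, while `file` would indicate the client is reading the local disk.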

Why might the script be behaving differently than before, and what can I do to fix this?

best,
Jan Warchoł

--
Jan Warchoł
Data Engineer


-----------------------------------------
M: +48 509 078 203
E: [hidden email]
-----------------------------------------

CodiLime Sp. z o.o. - Ltd. company with its registered office in Poland, 01-167 Warsaw, ul. Zawiszy 14/97. Registered by The District Court for the Capital City of Warsaw, XII Commercial Department of the National Court Register. Entered into National Court Register under No. KRS 0000388871. Tax identification number (NIP) 5272657478. Statistical number (REGON) 142974628.

-----------------------------------------

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

mrm
I have the same problem. I get asked the questions about HDFS, and then the cluster finishes launching. However, when I run an application later, the web UI shows only one executor (the driver node). Does anybody know what is happening?

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

in4maniac
In reply to this post by Jan Warchoł
I have the same problem too. I can't reach any of the slaves :(

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

mrm
In reply to this post by Jan Warchoł
Has anybody found a workaround for this? It would be great if you could share it!

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

pierrelanvin
I think I found something:
https://github.com/mesos/spark-ec2/commit/de9ed51c359926ce32b337e35374b3c23cbcb9fc
This was committed 21 hours ago.
I cannot test right now, but maybe someone could test again, setting the S3 key and secret before launching the cluster? I will mention this post on the GitHub pull request so that people know something is broken (GitHub pull request: https://github.com/mesos/spark-ec2/pull/58)

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

mrm
They reverted to a previous version of the spark-ec2 script and things are working again!

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

Nick Chammas

Yup, though to be clear, Josh reverted a change to a hosted script that spark-ec2 references. The spark-ec2 script y’all are running locally hasn’t changed, obviously.


On Wed, Oct 8, 2014 at 12:20 PM, mrm <[hidden email]> wrote:
They reverted to a previous version of the spark-ec2 script and things are
working again!



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-HDFS-doesn-t-start-on-AWS-EC2-cluster-tp15921p15945.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

Jan Warchoł
Thanks for the explanation, I was going to ask exactly about this :)

On Wed, Oct 8, 2014 at 6:23 PM, Nicholas Chammas <[hidden email]> wrote:

Yup, though to be clear, Josh reverted a change to a hosted script that spark-ec2 references. The spark-ec2 script y’all are running locally hasn’t changed, obviously.



Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

Akhil
Revert the script to an older version.

Thanks
Best Regards

On Wed, Oct 8, 2014 at 9:57 PM, Jan Warchoł <[hidden email]> wrote:
Thanks for explanation, i was going to ask exactly about this :)
