Troubles with the Spark-EC2 stuff

Troubles with the Spark-EC2 stuff

Guillaume Pitel
Hi,

I'm taking my first steps on EC2 (using the 0.8.1 binary for CDH4) and ran into some problems. The first is that once the cluster is created, the script cannot find it again for login, destroy, and so on. Not a big deal, since I can do that manually, but it's annoying.

The second problem is not really related to Spark but to HDFS/MapReduce. I want to run a hadoop distcp from S3 to the local ephemeral HDFS. The distcp fails because there's no MapReduce running.

Questions:

- Does anyone have advice about a better way to copy from S3 to HDFS, or a way to make distcp work?
- Any idea why spark-ec2 cannot find the cluster again?

Thanks in advance for any experience and advice!

Guillaume
--
eXenSa
Guillaume PITEL, Président
+33(0)6 25 48 86 80 / +33(0)9 70 44 67 53

eXenSa S.A.S.
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05

Re: Troubles with the Spark-EC2 stuff

Josh Rosen
For the second problem, just start Hadoop MapReduce before running distcp:

/root/ephemeral-hadoop/bin/start-all.sh
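
A minimal sketch of the full sequence, assuming the Hadoop install lives at the path above and that the bundled `bin/hadoop` launcher sits next to `start-all.sh`. The `distcp_plan` helper, the `HADOOP_HOME_DIR` variable, the bucket name, and the target path are hypothetical. The function only prints the commands, so they can be reviewed before being run on the master:

```shell
#!/bin/sh
# Sketch only: paths and URIs are placeholders, not taken from a real cluster.

# Directory from the reply above; adjust if your AMI lays things out differently.
HADOOP_HOME_DIR="${HADOOP_HOME_DIR:-/root/ephemeral-hadoop}"

# Print the two commands to run, in order, for copying SRC to DST.
# Printing instead of executing lets you check them first.
distcp_plan() {
    src="$1"
    dst="$2"
    # Step 1: start the Hadoop daemons (including MapReduce).
    printf '%s\n' "$HADOOP_HOME_DIR/bin/start-all.sh"
    # Step 2: submit the distributed copy job.
    printf '%s\n' "$HADOOP_HOME_DIR/bin/hadoop distcp $src $dst"
}

# Example (hypothetical bucket and target directory):
# distcp_plan s3n://my-bucket/input hdfs:///input
```

Running the printed lines in order starts the daemons first, so that distcp has a MapReduce service to submit its job to.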




On Sat, Jan 4, 2014 at 12:54 PM, Guillaume Pitel wrote:
Hi,

I'm taking my first steps on EC2 (using the 0.8.1 binary for CDH4) and ran into some problems. The first is that once the cluster is created, the script cannot find it again for login, destroy, and so on. Not a big deal, since I can do that manually, but it's annoying.

The second problem is not really related to Spark but to HDFS/MapReduce. I want to run a hadoop distcp from S3 to the local ephemeral HDFS. The distcp fails because there's no MapReduce running.

Questions:

- Does anyone have advice about a better way to copy from S3 to HDFS, or a way to make distcp work?
- Any idea why spark-ec2 cannot find the cluster again?

Thanks in advance for any experience and advice!

Guillaume

Re: Troubles with the Spark-EC2 stuff

Guillaume Pitel
Hi,

Thanks, that wasn't actually the problem, but your suggestion helped me find it. I started the cluster with Hadoop v2, which doesn't seem to include MapReduce (though I think it's still possible to have it). Now I'm wondering whether there's a distcp that uses YARN...

And by the way, one just needs to run bin/start-mapred.sh

Thanks again

Guillaume




Re: Troubles with the Spark-EC2 stuff

Patrick Wendell
Look in /root/mapreduce. The layout is different for Hadoop 2 clusters because MapReduce is now distributed as a separate project.
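
A rough sketch of how one might locate the start script on either kind of cluster. It assumes `start-mapred.sh` sits under a `bin/` directory, as mentioned earlier in the thread; the `find_start_mapred` helper and the order of the candidate directories are hypothetical, not a documented layout:

```shell
#!/bin/sh
# Sketch only: candidate directories are assumptions pieced together from
# this thread.

# Print the first existing start-mapred.sh under the given root directories.
# Returns non-zero if none of them contains one.
find_start_mapred() {
    for dir in "$@"; do
        if [ -f "$dir/bin/start-mapred.sh" ]; then
            printf '%s\n' "$dir/bin/start-mapred.sh"
            return 0
        fi
    done
    return 1
}

# Example: try /root/mapreduce first, then the ephemeral Hadoop directory.
# script="$(find_start_mapred /root/mapreduce /root/ephemeral-hadoop)" && sh "$script"
```

Checking for the file before running it avoids guessing which layout a given cluster uses.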


On Sat, Jan 4, 2014 at 2:04 PM, Guillaume Pitel wrote:
Hi,

Thanks, that wasn't actually the problem, but your suggestion helped me find it. I started the cluster with Hadoop v2, which doesn't seem to include MapReduce (though I think it's still possible to have it). Now I'm wondering whether there's a distcp that uses YARN...

And by the way, one just needs to run bin/start-mapred.sh

Thanks again

Guillaume


