Applications for Spark on HDFS

Paul Schooss
Hello Folks, 

I was wondering if anyone has experience placing application jars for Spark onto HDFS. Currently I am distributing the jars manually and would love to source the jar via HDFS, a la distributed caching with MR. Any ideas?

Regards, 

Paul

Re: Applications for Spark on HDFS

Sandy Ryza
Hi Paul,

What do you mean by distributing the jars manually? If you register jars that are local to the client with SparkContext.addJar, Spark should handle distributing them to the workers. Are you taking advantage of this?

-Sandy
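
For illustration, here is a minimal sketch of how the suggestion above might look in application code. It assumes Spark 0.9 or later (for SparkConf) with Spark on the classpath. The master URL, jar paths, and NameNode address are hypothetical placeholders, not values from this thread. Per the Spark API docs, SparkContext.addJar also accepts a path on HDFS, so a jar already stored there can be registered directly instead of being copied to each worker by hand.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HdfsJarExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical values: replace with your own master URL and jar locations.
    val conf = new SparkConf()
      .setMaster("spark://master-host:7077")
      .setAppName("HdfsJarExample")
      // Jars local to the client; Spark makes these available to the workers.
      .setJars(Seq("/path/to/app.jar"))

    val sc = new SparkContext(conf)

    // A jar that already lives on HDFS can be registered by URI
    // (hypothetical NameNode address and path).
    sc.addJar("hdfs://namenode:8020/apps/spark/app-deps.jar")

    // Trivial job to confirm that tasks run on the cluster.
    val total = sc.parallelize(1 to 100).reduce(_ + _)
    println(total)

    sc.stop()
  }
}
```

Whether to list a jar in setJars or add it later with addJar is mostly a matter of when the path is known; both register the jar for tasks run on that SparkContext.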


On Tue, Mar 11, 2014 at 3:09 PM, Paul Schooss <[hidden email]> wrote:
Hello Folks, 

I was wondering if anyone has experience placing application jars for Spark onto HDFS. Currently I am distributing the jars manually and would love to source the jar via HDFS, a la distributed caching with MR. Any ideas?

Regards, 

Paul


Re: Applications for Spark on HDFS

Paul Schooss
Thanks Sandy, 

I have not taken advantage of that yet, but I will research how to invoke that option when submitting the application to the Spark master. Currently I am running a standalone Spark master and using the run-class script to invoke the application we crafted as a test.


On Tue, Mar 11, 2014 at 5:09 PM, Sandy Ryza <[hidden email]> wrote:
Hi Paul,

What do you mean by distributing the jars manually? If you register jars that are local to the client with SparkContext.addJar, Spark should handle distributing them to the workers. Are you taking advantage of this?

-Sandy

