ADD_JARS doubt.!!!!!

Archit Thakur
Hi,

What exactly does the ADD_JARS parameter in the SparkContext (sc) constructor do?
Does it add all the listed files to the classpath of the worker JVMs?

I have some text files that I read data from during processing.
Can I add them via ADD_JARS so that they don't have to be read from HDFS again and can be read from local disk instead (something like the Distributed Cache in Hadoop MapReduce)? If so, what path would I read them from?

Thanks and Regards,
Archit Thakur.

Re: ADD_JARS doubt.!!!!!

Gary Malouf
I would not recommend shipping your text files via ADD_JARS. A better approach is to keep those files in HDFS or locally on your driver server, load them into memory on the driver, and then use Spark's broadcast variables to distribute the data across the cluster.
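
A minimal sketch of that approach, in case it helps. The file path, the tab-separated key/value format, the app name, and the sample keys are all made up for illustration, and it assumes a Spark version that has SparkConf (older releases build the SparkContext from a master URL and app name directly):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.io.Source

object BroadcastLookupSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local master and app name chosen only for this example.
    val conf = new SparkConf().setAppName("broadcast-lookup-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Hypothetical file on the driver, one "key<TAB>value" pair per line.
    val source = Source.fromFile("/path/to/lookup.txt")
    val lookup: Map[String, String] =
      try {
        source.getLines()
          .map(_.split("\t", 2))
          .collect { case Array(k, v) => k -> v } // skip lines without a tab
          .toMap
      } finally source.close()

    // Ship the in-memory map to the executors once, instead of re-reading the file in every task.
    val lookupBc = sc.broadcast(lookup)

    // Placeholder input; in a real job this would be the RDD being processed.
    val records = sc.parallelize(Seq("a", "b", "c"))
    val enriched = records.map(key => key -> lookupBc.value.getOrElse(key, "unknown"))

    enriched.collect().foreach(println)
    sc.stop()
  }
}
```

Tasks read the data through lookupBc.value, so there is no local file path to resolve on the workers. This only makes sense if the files fit comfortably in memory on the driver and on each executor.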


On Mon, Dec 23, 2013 at 1:57 AM, Archit Thakur <[hidden email]> wrote:
Hi,

What does the parameter add_jars in the sc constructor exactly do?
Does it add all the files to the classpath of worker JVM?

I have some text files that I read data from while processing.
Can I add it in add jars so that it doesn't have to read it again from HDFS and read from local (Something like Distributed Cache in Hadoop Mapreduce). What path would I read it from?

Thanks and Regards,
Archit Thakur.
