Spark application development workflow in Scala

Spark application development workflow in Scala

Aureliano Buendia
Hi,

What's a typical workflow for Spark application development in Scala?

One option is to write a Scala application with a main function and re-execute the app after every development change. Given the overhead of loading even a moderately sized development dataset on each run, this can mean slow iterations.
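For concreteness, that first option looks roughly like this (the object name and input path are made up):

    import org.apache.spark.SparkContext

    object DevApp {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[*]", "DevApp")
        // "dev-data.txt" is a placeholder; cache() helps within a single
        // run, but every restart of the JVM pays the load cost again.
        val data = sc.textFile("dev-data.txt").cache()
        println(data.count())
        sc.stop()
      }
    }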

Another option is to somehow initialize the data in the REPL and keep the development inside the REPL. That would mean faster iterations; however, it's not clear to me how to keep the code in sync with the REPL. Do you just copy/paste the code into the REPL, or is it possible to compile the code into a jar and keep reloading that jar in the REPL?
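The closest mechanism I'm aware of is the REPL's :load command, which re-interprets a source file so the file on disk stays the source of truth (the path and object name below are made up), but that still leaves the jar-reloading question open:

    // Inside spark-shell, assuming Analysis.scala defines something like
    //   object Analysis { def run(sc: org.apache.spark.SparkContext): Long = ... }
    :load /path/to/Analysis.scala

    // Re-run after each edit-and-:load round trip; sc is the
    // SparkContext that spark-shell already provides.
    Analysis.run(sc)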

Any other ways of doing this?

Re: Spark application development workflow in Scala

Mayur Rustagi
I typically use the main method and a test-driven approach; for most simple applications that works out pretty well. Another technique is to create a jar containing the complex functionality and test it, create another jar just for the streaming/processing that hooks into it and handles all the data flow, and then integrate the two. None of this feels like a production development process :)
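As a rough sketch of the test-driven part (all names are hypothetical; this assumes ScalaTest and a local master):

    import org.apache.spark.SparkContext
    import org.scalatest.funsuite.AnyFunSuite

    // The core logic is a plain function over in-memory data, so a local
    // SparkContext is enough to exercise it without loading real data.
    object WordCount {
      def counts(sc: SparkContext, lines: Seq[String]): Map[String, Long] =
        sc.parallelize(lines)
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1L))
          .reduceByKey(_ + _)
          .collect()
          .toMap
    }

    class WordCountSuite extends AnyFunSuite {
      test("counts words in a tiny in-memory dataset") {
        val sc = new SparkContext("local[2]", "WordCountSuite")
        try {
          assert(WordCount.counts(sc, Seq("a b", "b c")) ==
            Map("a" -> 1L, "b" -> 2L, "c" -> 1L))
        } finally sc.stop()
      }
    }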
Regards
Mayur

Re: Spark application development workflow in Scala

Hossein
In reply to this post by Aureliano Buendia

I have been using Vim + Conque Shell for this purpose. With this plugin you can split your Vim window in two and run the Spark shell in one pane. You write your code in the other pane and execute it line by line.

I am sure there are equivalents in Eclipse.
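The effect is essentially a spark-shell session like this, with each line sent over from the editor pane (the path is made up; sc is provided by the shell):

    scala> val data = sc.textFile("dev-data.txt").cache()
    scala> data.filter(_.nonEmpty).count()  // tweak and re-send just this line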
 