RDD.saveAs...

Koert Kuipers

I find the current design for writing RDDs to disk (or a database, etc.) kind of ugly. It will lead to a proliferation of saveAs methods. A better abstraction would be nice (perhaps a Sink trait to write to).
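
To make the Sink idea concrete, here is a minimal sketch of what such a trait might look like. Everything in it is an assumption made for illustration: `Sink`, `TextFileSink`, `SinkDemo`, and `saveTo` are hypothetical names, not Spark APIs, and partitions are modeled as plain Scala `Seq`s so the snippet runs without Spark on the classpath. With a real RDD, the loop in `saveTo` would presumably be replaced by something built on `rdd.foreachPartition`.

```scala
import java.io.{File, PrintWriter}

// Hypothetical abstraction (not part of Spark): a destination that knows
// how to write the records of one partition.
trait Sink[T] extends Serializable {
  def writePartition(partitionId: Int, records: Iterator[T]): Unit
}

// Example sink: writes one text file per partition under `dir`,
// one record per line, using each record's toString.
class TextFileSink[T](dir: String) extends Sink[T] {
  def writePartition(partitionId: Int, records: Iterator[T]): Unit = {
    new File(dir).mkdirs()
    val out = new PrintWriter(new File(dir, f"part-$partitionId%05d"))
    try records.foreach(r => out.println(r.toString))
    finally out.close()
  }
}

object SinkDemo {
  // Stand-in for a single generic save method on RDD.
  // Assumption: each inner Seq plays the role of one partition.
  def saveTo[T](partitions: Seq[Seq[T]], sink: Sink[T]): Unit =
    partitions.zipWithIndex.foreach { case (records, id) =>
      sink.writePartition(id, records.iterator)
    }
}
```

Under this sketch, supporting a new storage system means adding a new `Sink` implementation rather than another saveAs method on the RDD itself.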

Re: RDD.saveAs...

Matei Zaharia
Administrator
I agree that we can’t keep adding these to the core API, partly because it will get unwieldy to maintain and partly just because each storage system will bring in lots of dependencies. We can simply have helper classes in different modules for each storage system. There’s some discussion on this at https://spark-project.atlassian.net/browse/SPARK-1127.
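
One way the "helper classes in different modules" approach could look is sketched below, using a Scala implicit class to add a save method from outside the core API. This is only an illustration under stated assumptions: `CoreModule`, `MiniRDD`, `KVStoreModule`, `KVClient`, and `saveToKVStore` are all hypothetical stand-ins (an in-memory map plays the storage system), and none of this is taken from SPARK-1127. In a real build, only the storage-specific module would depend on that system's client library.

```scala
import scala.collection.mutable

// Stand-in for the core module. In Spark this role would be played by RDD[T];
// MiniRDD is a hypothetical simplification so the sketch runs on its own.
object CoreModule {
  class MiniRDD[T](val partitions: Seq[Seq[T]]) {
    def foreachPartition(f: Iterator[T] => Unit): Unit =
      partitions.foreach(p => f(p.iterator))
  }
}

// Hypothetical separate module for one storage system.
object KVStoreModule {
  import CoreModule.MiniRDD

  // In-memory stand-in for a real storage client.
  class KVClient {
    val data: mutable.Map[String, String] = mutable.Map.empty
    def put(key: String, value: String): Unit = data(key) = value
  }

  // Helper class: adds saveToKVStore to key/value datasets
  // without changing anything in CoreModule.
  implicit class KVStoreFunctions(rdd: MiniRDD[(String, String)]) {
    def saveToKVStore(client: KVClient): Unit =
      rdd.foreachPartition(_.foreach { case (k, v) => client.put(k, v) })
  }
}
```

A caller opts in with `import KVStoreModule._`, after which `rdd.saveToKVStore(client)` is available on a `MiniRDD[(String, String)]`; code that never imports the module never sees the method or its dependencies.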

Matei

On Mar 11, 2014, at 9:06 AM, Koert Kuipers <[hidden email]> wrote:

I find the current design for writing RDDs to disk (or a database, etc.) kind of ugly. It will lead to a proliferation of saveAs methods. A better abstraction would be nice (perhaps a Sink trait to write to).