Does saveAsTextFile in Java use buffered I/O streams?

Hussam_Jarada

 

Can someone give me details on how the Spark Java implementation of the saveAsTextFile API writes its output? Does it use buffered I/O streams, and if so, at what point does it flush its buffers?

I remember from Spark Summit presentations that the current Spark release still uses buffered I/O streams, and that an option to support unbuffered I/O streams when writing data to a local file or to HDFS storage is upcoming.

Thanks,

Hussam


Re: Does saveAsTextFile in Java use buffered I/O streams?

Matei Zaharia
Administrator
It just uses the Hadoop FileSystem API; I don’t think Spark adds any extra buffering. That API itself may buffer writes in the HDFS case, though newer versions of HDFS fix that.

Matei
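
To illustrate the flush question in general terms, here is a minimal sketch that uses only java.io. It is not Spark or Hadoop code. It assumes an in-memory ByteArrayOutputStream as a stand-in for a file or HDFS stream, and the names BufferedWriteDemo and visibleBytes are hypothetical. It records how many bytes have reached the underlying sink after write(), after flush(), and after close().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class BufferedWriteDemo {

    /**
     * Writes one line through a BufferedOutputStream and returns the number of
     * bytes visible in the underlying sink after write(), flush(), and close().
     */
    public static int[] visibleBytes(String line, int bufferSize) throws IOException {
        // Assumption: an in-memory sink stands in for a local file or HDFS stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(sink, bufferSize);
        byte[] data = line.getBytes(StandardCharsets.UTF_8);

        out.write(data);
        int afterWrite = sink.size();  // bytes that reached the sink so far

        out.flush();
        int afterFlush = sink.size();  // flush() pushes any buffered bytes down

        out.close();
        int afterClose = sink.size();  // close() flushes as well

        return new int[] {afterWrite, afterFlush, afterClose};
    }

    public static void main(String[] args) throws IOException {
        int[] large = visibleBytes("hello\n", 8192); // buffer larger than the data
        int[] small = visibleBytes("hello\n", 4);    // buffer smaller than the data
        System.out.println("8192-byte buffer: " + large[0] + ", " + large[1] + ", " + large[2]);
        System.out.println("4-byte buffer:    " + small[0] + ", " + small[1] + ", " + small[2]);
    }
}
```

When the buffer is larger than the data, nothing reaches the sink until flush() or close(). When a single write is at least as large as the buffer, BufferedOutputStream passes it straight through. Whether and where Spark or a particular HDFS client buffers is a separate matter that this sketch does not settle.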

On Jan 9, 2014, at 2:54 PM, [hidden email] wrote:

 
Can someone give me details on how the Spark Java implementation of the saveAsTextFile API writes its output? Does it use buffered I/O streams, and if so, at what point does it flush its buffers?

I remember from Spark Summit presentations that the current Spark release still uses buffered I/O streams, and that an option to support unbuffered I/O streams when writing data to a local file or to HDFS storage is upcoming.

Thanks,
Hussam