Implementing .zip file codec

hemant
This post has NOT been accepted by the mailing list yet.
Hi,

I am able to read and write .gz files through spark-csv using the available codecs, and I get the expected results. But when I try to read and write a .zip file, Spark produces unexpected output such as wV�J�.f�T n .


I looked through https://github.com/apache/hadoop/tree/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress, but did not find a compression codec for .zip files.

I also searched Stack Overflow but did not find a satisfactory answer.

I have also tried the solution from http://stackoverflow.com/questions/28569788/how-to-open-stream-zip-files-through-spark

But my requirement is to read and write .zip files the same way we read CSV files, by specifying a codec.
For example:

     sc.read.option("","").schema("userdefinedschema").format("customfomat").load("abc.zip")

     dataframe.write().option("codec", "customzipcodec").format("customfomat").save("outputpath")

Please share more information if anyone has faced the same issue or has a solution for it.
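
For context on the workaround linked above: ZIP is an archive container that can hold several entries, not a single compressed stream like gzip, which is presumably why it does not map directly onto a stream codec. A minimal sketch of the per-archive decoding step, in plain Java with only java.util.zip and no Spark dependency, might look like the following. The class name ZipLinesSketch and its two methods are hypothetical names made up for illustration, not part of Spark, Hadoop, or spark-csv, and the sketch assumes every entry is UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not an existing Spark or Hadoop class.
class ZipLinesSketch {

    // Build a ZIP archive in memory with one entry per (name, content) pair.
    static byte[] zip(Map<String, String> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the archive's central directory is written.
        return buffer.toByteArray();
    }

    // Walk every entry of the archive and collect its text lines in order.
    static List<String> readLines(byte[] archive) {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Deliberately not closed: closing the reader would close the whole archive stream.
                BufferedReader reader =
                        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("part1.csv", "id,name\n1,x\n");
        files.put("part2.csv", "2,y\n");
        for (String line : readLines(zip(files))) {
            System.out.println(line);
        }
    }
}
```

In a Spark job, a method like readLines could be applied to the bytes of each archive after loading the files as binary data, but that wiring is left out here because it depends on the Spark version in use. It also does not give the codec-style option on read and write that the post asks for.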