Encoding not working when using a map / mapPartitions call

ccaspanello
Attached you will find a project with unit tests showing the issue at hand.

If I read in an ISO-8859-1 encoded file and simply write out what was read,
the contents of the part file match the input, which is great.

However, as soon as I use a map / mapPartitions function, the encoding of
the output is no longer correct.  In addition, calling collectAsList and
writing the resulting list of strings to a file produces incorrectly encoded
output as well.  I don't think I'm doing anything wrong.  Can someone please
investigate?  I think this is a bug.
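
To illustrate the kind of mismatch I suspect, here is a minimal plain-JDK
sketch. It is not taken from the attached project and does not use Spark. It
assumes (unconfirmed) that somewhere along the map / collectAsList path the
ISO-8859-1 bytes are decoded into a String as UTF-8. The class name
EncodingRoundTrip and the helper roundTrip are hypothetical names used only
for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingRoundTrip {

    // Hypothetical helper: decode raw bytes into a String with one charset,
    // then encode that String back to bytes with another.
    static byte[] roundTrip(byte[] raw, Charset readAs, Charset writeAs) {
        String text = new String(raw, readAs);
        return text.getBytes(writeAs);
    }

    public static void main(String[] args) {
        // "café" in ISO-8859-1: the 'é' is the single byte 0xE9.
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};

        // Decoding and encoding with the file's real charset keeps the bytes intact.
        byte[] same = roundTrip(latin1,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);

        // Assumption for illustration: the bytes are treated as UTF-8 instead.
        // 0xE9 alone is not valid UTF-8, so it is replaced during decoding
        // and the original byte cannot be recovered.
        byte[] viaUtf8 = roundTrip(latin1,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 round trip preserved bytes: "
                + Arrays.equals(latin1, same));
        System.out.println("UTF-8 round trip preserved bytes:      "
                + Arrays.equals(latin1, viaUtf8));
    }
}
```

The first line prints true and the second prints false. If something similar
happens inside map / mapPartitions, it would explain why passing the data
straight through works while anything that materializes Java Strings does not.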

spark-sandbox.zip
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t7751/spark-sandbox.zip>  


