Why do Chinese characters appear garbled when I use Spark textFile?


JoneZhang
This post has NOT been accepted by the mailing list yet.
var textFile = sc.textFile("xxx");
textFile.first();
res1: String = 1.0      862910025238798 100733314       18_?????:100733314      8919173c6d49abfab02853458247e584        1:129:18_?????:1.0


hadoop fs -cat xxx
1.0     862910025238798 100733314       18_百度输入法:100733314 8919173c6d49abfab02853458247e584        1:129:18_百度输入法:1.0

Why do the Chinese characters appear garbled when I use Spark's textFile?
The HDFS file is encoded in UTF-8.
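
One observation that may help narrow this down: the output above shows exactly five `?` where the file has five Chinese characters. That is consistent with a correctly decoded string being encoded for output into a charset that cannot represent those characters, although it does not prove where the loss happens in this setup. Below is a minimal sketch of that effect in plain Scala, without Spark. `EncodingCheck` and `renderWith` are hypothetical names used only for illustration, and US-ASCII is just a stand-in for whatever non-UTF-8 default charset the JVM or terminal might be using.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object EncodingCheck {
  // "18_百度输入法", written with Unicode escapes so the example does not
  // depend on the encoding of this source file or of the shell it is pasted into.
  val sample: String = "18_\u767e\u5ea6\u8f93\u5165\u6cd5"

  // Encode the string into the given charset and decode it back with the same
  // charset. String.getBytes(Charset) substitutes the charset's replacement
  // byte ('?' for US-ASCII) for each character it cannot represent.
  def renderWith(s: String, cs: Charset): String =
    new String(s.getBytes(cs), cs)

  def main(args: Array[String]): Unit = {
    // UTF-8 can represent every character, so the round trip is lossless.
    println(renderWith(sample, StandardCharsets.UTF_8) == sample)
    // US-ASCII cannot, so each of the five Chinese characters becomes '?'.
    println(renderWith(sample, StandardCharsets.US_ASCII))
    // The charset this JVM uses by default when none is given explicitly.
    println(Charset.defaultCharset())
  }
}
```

If the last line prints something other than UTF-8 on the machine running the Spark shell, the default charset and the terminal encoding would be the first things to look at.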


Thanks