Spark not working with Hadoop 4mc compression


Abhijeet Kumar
Hello,

I’m using 4mc compression in my Hadoop cluster, and when I read a file from HDFS, Spark throws an error.


I’m running a simple query in spark-shell:

sc.textFile("/store.csv").getNumPartitions

Error:
java.lang.RuntimeException: Error in configuring object
  at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
  at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
  at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:187)
  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at org.apache.spark.rdd.RDD.getNumPartitions(RDD.scala:267)
  ... 49 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
  ... 63 more
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:180)
  at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
  ... 68 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
  at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
  ... 70 more


Thank you,
Abhijeet Kumar

Re: Spark not working with Hadoop 4mc compression

Jiaan Geng
I think com.hadoop.compression.lzo.LzoCodec is not on Spark's classpath. Please
put a suitable hadoop-lzo.jar into the directory SPARK_HOME/jars/.
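The stack trace supports this: Hadoop's CompressionCodecFactory reads the io.compression.codecs list from core-site.xml, and if that list names com.hadoop.compression.lzo.LzoCodec but the jar providing it is not visible to Spark, every TextInputFormat job fails with the ClassNotFoundException above, even for files that are not LZO-compressed. A minimal sketch of the fix (the jar paths and version numbers here are assumptions; adjust them to your installation):

```shell
# Option 1: copy the codec jars into Spark's jars directory so both
# driver and executors pick them up (paths/versions are hypothetical).
cp /path/to/hadoop-lzo-0.4.20.jar "$SPARK_HOME/jars/"
cp /path/to/hadoop-4mc-2.0.0.jar  "$SPARK_HOME/jars/"

# Option 2: pass the jars at launch time instead of copying them.
spark-shell --jars /path/to/hadoop-lzo-0.4.20.jar,/path/to/hadoop-4mc-2.0.0.jar
```

After restarting spark-shell, sc.textFile should be able to build its partitions again, since every codec named in io.compression.codecs can now be loaded.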



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]