Task failure to read input files

I'm running a Spark job on AWS EMR that reads many LZO files from an S3 bucket partitioned by date.
Sometimes I see errors in the logs similar to:

18/04/13 11:53:52 WARN TaskSetManager: Lost task 151177.0 in stage 43.0 (TID 1516123, ip-10-10-2-6.ec2.internal, executor 57): java.io.IOException: Corrupted uncompressed block
	at com.hadoop.compression.lzo.LzopInputStream.verifyChecksums(LzopInputStream.java:219)
	at com.hadoop.compression.lzo.LzopInputStream.getCompressedData(LzopInputStream.java:284)
	at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:261)

I don't see the jobs fail, so I assume the task succeeds when it is retried.
If the input file were actually corrupted, even the task retries should fail, and eventually the job would fail based on the "spark.task.maxFailures" config, right?
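For reference, that retry limit can also be raised explicitly at submit time so that transient S3 read errors get more chances to succeed (a sketch; the class and jar names are placeholders, and 4 is the default limit):

```shell
spark-submit \
  --conf spark.task.maxFailures=8 \
  --class com.example.MyJob \
  my-job.jar
```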

Is there a way to make Spark or the Hadoop LZO library print the full file name when such failures happen, so that I can then manually check whether the file is indeed corrupted?
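In the meantime, one way I can narrow down suspect objects is to copy a partition locally and verify each file by hand. A quick first-pass check (a sketch; the magic-byte constant comes from the lzop file format) rules out files that aren't lzop containers at all, before running a full `lzop -t` decompression test on the remaining candidates:

```python
# Every valid lzop file starts with this 9-byte magic sequence.
LZOP_MAGIC = b"\x89LZO\x00\r\n\x1a\n"

def looks_like_lzop(path):
    """Cheap sanity check: does the file start with the lzop magic bytes?

    This only catches truncated or garbage files, not corruption inside
    compressed blocks -- for that, a full `lzop -t` pass is still needed.
    """
    with open(path, "rb") as f:
        return f.read(len(LZOP_MAGIC)) == LZOP_MAGIC
```

For tagging records with their source file inside the job itself, Spark's `input_file_name()` SQL function may help, though it only reports the file for records that were read successfully, not for the block that failed to decompress.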