SocketTimeoutException with spark-r and using latest R version


Thijs Haarhuis

Hi all,


I am running into a problem where, once in a while, my job fails with the following exception(s):

java.net.SocketTimeoutException: Accept timed out
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at org.apache.spark.api.r.RRunner.compute(RRunner.scala:77)
    at org.apache.spark.sql.execution.FlatMapGroupsInRExec$$anonfun$13.apply(objects.scala:436)
    at org.apache.spark.sql.execution.FlatMapGroupsInRExec$$anonfun$13.apply(objects.scala:418)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

18/10/16 08:47:18:388 INFO CoarseGrainedExecutorBackend: Got assigned task 2059
18/10/16 08:47:18:388 INFO Executor: Running task 22.0 in stage 21.0 (TID 2059)
18/10/16 08:47:18:391 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 1 blocks
18/10/16 08:47:18:391 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
18/10/16 08:47:18:394 ERROR Executor: Exception in task 22.0 in stage 21.0 (TID 2059)
java.net.SocketException: Broken pipe (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:155)

It does not happen on every run, but roughly once every 15 runs or so.
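For reference, the FlatMapGroupsInRExec / RRunner frames in the trace come from a grouped apply (gapply), and as far as I can tell the "Accept timed out" is raised while the executor waits for its R worker process to connect back. The sketch below is a minimal, hypothetical reproduction of that code path; session settings and column names are illustrative, not my actual job:

library(SparkR)

# Plain local session; my real job is submitted through Livy with more configuration.
sparkR.session(appName = "gapply-repro")

df <- createDataFrame(mtcars)

# gapply starts an R worker per task on the executors; the accept timeout in my
# trace happens while the executor waits for such a worker.
schema <- structType(structField("cyl", "double"),
                     structField("avg_mpg", "double"))

result <- gapply(df, "cyl",
                 function(key, x) {
                   data.frame(cyl = key[[1]], avg_mpg = mean(x$mpg))
                 },
                 schema)

head(collect(result))

sparkR.session.stop()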

I am using Apache Livy to submit R scripts to the cluster, which runs CentOS 7.5 with Spark 2.2.1 and R 3.5.1.
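For completeness, the submission goes through Livy's batch API, roughly like the sketch below. The host, script path and conf values are placeholders rather than my real settings, and spark.r.command is included only as an example of pinning a specific Rscript binary, not something I have verified changes the behaviour:

library(httr)
library(jsonlite)

livy_url <- "http://livy-host:8998/batches"   # placeholder host/port

body <- list(
  file = "hdfs:///jobs/my_script.R",          # placeholder path to the R script
  name = "sparkr-job",
  conf = list(
    # Example: point driver and executors at a specific R installation.
    spark.r.command = "/usr/bin/Rscript"      # placeholder path
  )
)

resp <- POST(livy_url,
             body = toJSON(body, auto_unbox = TRUE),
             content_type_json())
print(content(resp))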


Another system, running an older CentOS 6.6 with R 3.2.2, executes the same R script very stably without any problems.
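Since the two clusters have different R installations, one quick way to see which R the executors actually launch is spark.lapply, which runs a function inside the executor-side R workers. A small sketch (the app name and input are arbitrary):

library(SparkR)
sparkR.session(appName = "r-version-check")

# The function runs in the R workers started on the executors (the same kind of
# process RRunner waits for), so this reports the executor-side R version.
versions <- spark.lapply(1:4, function(i) R.version.string)
print(unique(unlist(versions)))

sparkR.session.stop()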


Is it possible that this R version is somehow not compatible with Spark 2.2.1?


Thanks

Thijs