[Spark streaming-Mesos-cluster mode] java.lang.RuntimeException: Stream jar not found



I have a Spark Streaming job that uses HDFS and checkpointing. It runs well on a multi-node standalone Spark cluster, in both client and cluster deploy modes.
I would like to switch to the Mesos cluster manager and submit the job in cluster deploy mode.

The first launch of the app works well, whereas the second launch (after a kill), which involves checkpoint recovery, fails with:
java.lang.RuntimeException: Stream '/jars/application.jar' was not found.
at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:222)
This error occurs because the driver, which is in charge of exposing the application jar to the executors, tries to expose it from the jar path stored in the checkpoint. The jar is loaded from HDFS and stored in the Mesos work directory (the sandbox), and that stored path does not exist on the current node.
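To make the suspected failure mode concrete, here is a toy model in plain Python. It is not Spark or Mesos code: the file `checkpoint.json`, the `sandbox-1`/`sandbox-2` directories and the functions `first_launch`, `recover` and `demo` are all hypothetical names made up for illustration. It only assumes what is described above, namely that the checkpoint records an absolute jar path inside one run's sandbox and that recovery reuses that path as-is.

```python
import json
import os
import shutil
import tempfile


def first_launch(sandbox, checkpoint_file):
    """Simulate a first run: the jar sits in this run's sandbox and its path is checkpointed."""
    jar_path = os.path.join(sandbox, "application.jar")
    with open(jar_path, "w") as f:
        f.write("fake jar contents")
    # The checkpoint keeps the absolute, sandbox-specific path.
    with open(checkpoint_file, "w") as f:
        json.dump({"jars": [jar_path]}, f)
    return jar_path


def recover(checkpoint_file):
    """Simulate recovery: reuse the checkpointed jar paths and report which still exist."""
    with open(checkpoint_file) as f:
        conf = json.load(f)
    return {path: os.path.exists(path) for path in conf["jars"]}


def demo():
    root = tempfile.mkdtemp()
    try:
        # Shared location that survives restarts (stands in for HDFS).
        checkpoint_file = os.path.join(root, "checkpoint.json")
        sandbox_1 = os.path.join(root, "sandbox-1")
        os.makedirs(sandbox_1)
        first_launch(sandbox_1, checkpoint_file)
        ok_before = all(recover(checkpoint_file).values())

        # The driver is killed; the relaunched driver gets a different sandbox.
        shutil.rmtree(sandbox_1)
        os.makedirs(os.path.join(root, "sandbox-2"))
        ok_after = all(recover(checkpoint_file).values())
        return ok_before, ok_after
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

In this model the checkpoint itself is read back without trouble after the restart; only the jar path it carries is stale, which matches the "Stream ... was not found" symptom as I understand it.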

I'm confused by the dispatcher behaviour. It seems that there are functional gaps between checkpoint recovery in Spark Streaming and the sandbox machinery used by the Mesos cluster.

1. Why does Spark use an RPC interface to expose the application jar to the executors when the jar is on HDFS, instead of having the executors load it directly from the source?
2. How can this issue be fixed (if that is possible)?

Versions: Mesos 1.2.0, Spark 2.0.1, HDFS 2.7
For more information, see the Stack Overflow issue here.