disconnected from cluster; reconnecting gives java.net.BindException


Nick Chammas
So I was working in pyspark on a cluster in EC2 when I got booted due to a network issue. When I reconnect to the cluster and start up pyspark again, I get this warning:

14/03/05 17:54:56 WARN component.AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use

Is this Bad™? Do I need to do anything? sc appears to be available as usual.
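The warning means something (presumably the orphaned shell's web UI) is still listening on port 4040. A minimal sketch for checking that yourself, run on the node where you start the shell (4040 is the default `spark.ui.port`):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        return s.connect_ex((host, port)) == 0  # 0 means the connect succeeded

# While the orphaned shell's UI is still alive, this would print True,
# and a fresh shell would hit the BindException shown above.
print(port_in_use(4040))
```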

Nick


Re: disconnected from cluster; reconnecting gives java.net.BindException

Nick Chammas
Whoopdeedoo, after just waiting for about an hour (well, I was doing other stuff), the process holding that address seems to have died on its own, and now I can start up pyspark without any warnings.

Is there a faster way to get through this than just waiting around for the orphaned process to die?
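A faster route than waiting, sketched under the assumption that `lsof` is available on the node and that the orphaned shell is the process still bound to port 4040:

```python
import os
import signal
import subprocess

def kill_port_holder(port=4040):
    """Find the PID listening on `port` via lsof and send it SIGTERM.

    Returns the PID signalled, or None if nothing was found (or lsof is
    not installed). Escalate to SIGKILL only if SIGTERM is ignored.
    """
    try:
        # lsof -t prints bare PIDs; -i :PORT filters to that TCP/UDP port.
        out = subprocess.check_output(["lsof", "-ti", f":{port}"])
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None  # lsof missing, or no process on that port
    if not out.strip():
        return None
    pid = int(out.split()[0])
    os.kill(pid, signal.SIGTERM)
    return pid
```

The equivalent shell one-liner would be `kill $(lsof -ti :4040)`.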

Nick





Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: disconnected from cluster; reconnecting gives java.net.BindException

Nick Chammas
So this happened again today. As I noted before, the Spark shell starts up fine after I reconnect to the cluster, but this time I tried opening a file and doing some processing. I get this message over and over (and can't do anything):

14/03/06 15:43:09 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

So I know that this message is related to my getting disconnected from the cluster while in the Spark shell, and that after a while it should clear up automatically.

But how can I resolve this directly, without waiting? Looking at the cluster UI doesn't show me anything I know how to use to resolve this.
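That scheduler warning usually means an earlier, orphaned application is still registered with the standalone master and holding the cluster's cores, so the new job can't get any. One place to see that without waiting is the master's JSON status page (http://&lt;master&gt;:8080/json). A sketch of summarizing it; the field names here (`activeapps`, `coresused`, ...) are assumptions about that endpoint's schema and may differ across Spark versions:

```python
import json

def summarize_master(payload):
    """Given the JSON body from the standalone master's status endpoint,
    return (active app names, cores in use, total cores).

    NOTE: the key names below are assumptions about the endpoint's schema.
    """
    data = json.loads(payload)
    apps = [a["name"] for a in data.get("activeapps", [])]
    used = sum(w.get("coresused", 0) for w in data.get("workers", []))
    total = sum(w.get("cores", 0) for w in data.get("workers", []))
    return apps, used, total
```

If a stale application shows up holding all the cores, killing its orphaned driver process should free them immediately rather than waiting for the timeout.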

Nick


