Re: disconnected from cluster; reconnecting gives java.net.BindException
So this happened again today. As I noted before, the Spark shell starts up fine after I reconnect to the cluster, but this time around I tried opening a file and doing some processing. I get this message over and over (and can't do anything):
14/03/06 15:43:09 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
So I know that this message is related to my getting disconnected from the cluster while in the Spark shell, and after a while it should automatically clear up.
But how can I resolve this directly, without waiting? Looking at the cluster UI doesn't show me anything I know how to use to resolve this.
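(For the BindException side of this, rather than waiting for the orphaned process to die on its own, something like the following might work. This is just a sketch: the port numbers 4040 (Spark UI) and 7077 (standalone master) are assumed defaults, so substitute whatever port the BindException actually names in your logs.)

```shell
# Assumption: the BindException names a port such as 4040 or 7077;
# use whatever port your error actually reports.
PORT=4040

# lsof -t prints only the PID(s) of whatever is bound to the port;
# it exits nonzero when nothing is, so fall back to an empty string.
PID=$(lsof -t -i :"$PORT" 2>/dev/null || true)

if [ -n "$PID" ]; then
  kill $PID                          # polite SIGTERM first
  sleep 2
  kill -9 $PID 2>/dev/null || true   # force-kill if it lingers
else
  echo "nothing is holding port $PORT"
fi
```

Once the port is free, restarting pyspark should come up without the bind warnings.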
On Wed, Mar 5, 2014 at 3:12 PM, Nicholas Chammas <[hidden email]> wrote:
Whoopdeedoo, after just waiting for like an hour (well, I was doing other stuff) the process holding that address seems to have died automatically and now I can start up pyspark without any warnings.
Would there be a faster way to go through this than just wait around for the orphaned process to die?
On Wed, Mar 5, 2014 at 1:01 PM, Nicholas Chammas <[hidden email]> wrote:
So I was doing stuff in pyspark on a cluster in EC2. I got booted due to a network issue. I reconnect to the cluster and start up pyspark again. I get these warnings: