WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory


Aureliano Buendia
Hi,


My spark cluster is not able to run a job due to this warning:

WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

The workers have this status (state, cores, memory):

ALIVE    2 (0 Used)    6.3 GB (0.0 B Used)

So there must be plenty of memory available, despite the warning message. I'm using the default Spark config. Is there a config parameter that needs changing for this to work?

Aureliano Buendia
Could someone explain how SPARK_MEM, SPARK_WORKER_MEMORY and spark.executor.memory should relate to each other so that this unhelpful error doesn't occur?

Maybe there are more environment and Java config variables about memory that I'm missing.

By the way, the part of the error that asks you to check the web UI is just redundant. The UI is of no help.



Aureliano Buendia
The strange thing is that the Spark examples work fine, but when I include a Spark example in my own jar and deploy it, I get this error for the very same example:

WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

My jar is deployed to the master and then to the workers with spark-ec2/copy-dir. Why would including the example in my jar cause this error?




Matei Zaharia
Administrator
Have you looked at the cluster UI, and do you see any workers registered there, and your application under running applications? Maybe you typed in the wrong master URL or something like that.

Matei


Aureliano Buendia



On Thu, Jan 9, 2014 at 3:59 AM, Matei Zaharia <[hidden email]> wrote:
Have you looked at the cluster UI, and do you see any workers registered there, and your application under running applications? Maybe you typed in the wrong master URL or something like that.

No, it's automated: the URL comes from cat spark-ec2/cluster-url

I think the problem might be caused by the spark-class script. It seems to assign too much memory.

I had forgotten that run-example doesn't use spark-class.
 


Matei Zaharia
Administrator
Oh, you shouldn’t use spark-class for your own classes. Just build your job separately and submit it by running it with “java” and creating a SparkContext in it. spark-class is designed to run classes internal to the Spark project.

Matei


Aureliano Buendia



On Thu, Jan 9, 2014 at 4:11 AM, Matei Zaharia <[hidden email]> wrote:
Oh, you shouldn’t use spark-class for your own classes. Just build your job separately and submit it by running it with “java” and creating a SparkContext in it. spark-class is designed to run classes internal to the Spark project.

Really? Apparently Eugen runs his jobs with:

$SPARK_HOME/spark-class SPARK_CLASSPATH=PathToYour.jar com.myproject.MyJob

as he instructed me to do here.

I have to say that while the Spark documentation is not sparse, it does not cover enough, and as you can see, the community is confused.

Are Spark users supposed to create something like run-example for their own jobs?
 


Matei Zaharia
Administrator
Just follow the docs at http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala for how to run an application. Spark is designed so that you can simply run your application *without* any scripts whatsoever, and submit your JAR to the SparkContext constructor, which will distribute it. You can launch your application with “scala”, “java”, or whatever tool you’d prefer.

Matei


Aureliano Buendia



On Thu, Jan 9, 2014 at 5:01 AM, Matei Zaharia <[hidden email]> wrote:
Just follow the docs at http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala for how to run an application. Spark is designed so that you can simply run your application *without* any scripts whatsoever, and submit your JAR to the SparkContext constructor, which will distribute it. You can launch your application with “scala”, “java”, or whatever tool you’d prefer.

I'm afraid that 'simply run your application *without* any scripts whatsoever' does not apply to Spark at the moment; it simply does not work.

Try this simple Pi calculation on a standard spark-ec2 instance:

java -cp /root/spark/examples/target/spark-examples_2.9.3-0.8.1-incubating.jar:/root/spark/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.1-incubating-hadoop1.0.4.jar org.apache.spark.examples.SparkPi `cat spark-ec2/cluster-url`

You'll get the error:

WARN cluster.ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

while the script way works:

spark/run-example org.apache.spark.examples.SparkPi `cat spark-ec2/cluster-url`

What am I missing in the java command above?
 


Aureliano Buendia
The java command worked once I set the SPARK_HOME and SPARK_EXAMPLES_JAR values.

There are still several issues with the 'Initial job has not accepted any resources...' error, though:
  • When I put my assembly jar before spark-assembly_2.9.3-0.8.1-incubating-hadoop1.0.4.jar on the classpath, this error happens. If I move my jar after spark-assembly, it works fine.
    In my case, I need to put my jar before spark-assembly, as my jar uses protobuf 2.5 and spark-assembly comes with protobuf 2.4.
  • Sometimes, after this error happens, the whole cluster must be restarted; otherwise even the run-example script won't work. It took me a while to find this out, which made debugging very time consuming.
  • The error message has nothing to do with the actual cause.

My guess is that the problem lies somewhere in the part of SparkContext that delivers the jars.




Archit Thakur
How much memory are you setting for the executor JVM?
This problem occurs either when there is a communication problem between the master and the workers, or when there is not enough memory left. E.g., you specified 75G for your executor and your machine has only 70G of memory.
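
The memory half of that rule can be sketched in Java roughly as follows. This is only an illustration of the comparison described above, not Spark's scheduler code: the class ExecutorMemoryCheck, its method names, and the simplified size parser (whole numbers with a k, m or g suffix only) are hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Spark. */
final class ExecutorMemoryCheck {

    /** Parses sizes such as "512m", "70g" or "75G" into megabytes (simplified parser, assumption). */
    static long toMegabytes(String size) {
        String s = size.trim().toLowerCase(Locale.ROOT);
        if (s.length() < 2) {
            throw new IllegalArgumentException("Expected a number plus a suffix, got: " + size);
        }
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (s.charAt(s.length() - 1)) {
            case 'k': return value / 1024;   // kilobytes, rounded down to whole megabytes
            case 'm': return value;          // already megabytes
            case 'g': return value * 1024;   // gigabytes
            default:
                throw new IllegalArgumentException("Unsupported size suffix: " + size);
        }
    }

    /** True when the requested executor memory fits in the memory the worker still has free. */
    static boolean fits(String executorMemory, String workerFreeMemory) {
        return toMegabytes(executorMemory) <= toMegabytes(workerFreeMemory);
    }

    public static void main(String[] args) {
        System.out.println(fits("75g", "70g"));  // false: the 75G-on-70G case from above
        System.out.println(fits("512m", "6g"));  // true: a small executor on a 6 GB worker
    }
}
```

If a check like this comes out false for every worker, no executor can be started and the job keeps waiting, which matches the warning. If it comes out true, memory is probably not the cause.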



Aureliano Buendia



On Tue, Jan 14, 2014 at 5:07 PM, Archit Thakur <[hidden email]> wrote:
How much memory are you setting for the executor JVM?
This problem comes up when either there is a communication problem between Master and Worker, or you do not have any memory left. E.g., you specified 75G for your executor and your machine has 70G of memory.

This was not a memory problem. This could be considered a spark bug.

Here is what happened: My app was using protobuf 2.5, while spark has a protobuf 2.4 dependency, and the classpath was like this:

my_app.jar:spark_assembly.jar:..

This caused spark (or a dependency, probably hadoop) to use protobuf 2.5, giving that misleading 'ensure that workers are registered and have sufficient memory' error.

Reproducing this error is easy: just download protobuf 2.5 and put it at the beginning of your classpath for any app, and you should get that error.
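
For anyone hitting the same thing, one way to confirm which jar actually supplies a class is to ask the JVM where it loaded the class from. This is only a minimal sketch: ClasspathProbe is a made-up helper name (not part of Spark), and com.google.protobuf.Message is assumed here as a representative protobuf class to probe.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; it is not part of Spark or protobuf.
class ClasspathProbe {

    // Returns the jar or directory a class was loaded from, or a short
    // note when the class is missing or has no recorded code source.
    static String locate(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK itself report no code source.
            if (src == null || src.getLocation() == null) {
                return "(JDK runtime)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults to a core protobuf interface; pass other class names as arguments.
        String[] names = args.length > 0 ? args : new String[] {"com.google.protobuf.Message"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same -cp value as the failing app (for example my_app.jar:spark_assembly.jar plus the directory holding the compiled probe). The first classpath entry that contains the class wins, so swapping the order of the two jars should change the location that is printed.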
 


On Thu, Jan 9, 2014 at 11:27 PM, Aureliano Buendia <[hidden email]> wrote:
The java command worked when I set the SPARK_HOME and SPARK_EXAMPLES_JAR values.

There are many issues regarding the 'Initial job has not accepted any resources...' error though:
  • When I put my assembly jar before spark-assembly_2.9.3-0.8.1-incubating-hadoop1.0.4.jar, this error happens. When I move my jar after the spark-assembly, it works fine.
    In my case, I need to put my jar before spark-assembly, as my jar uses protobuf 2.5 and spark-assembly comes with protobuf 2.4.
  • Sometimes when this error happens the whole cluster must be restarted, or even the run-example script won't work. It took me a while to find this out, which made debugging very time consuming.
  • The error message is absolutely irrelevant.

I guess the problem is somewhere in the spark context jar delivery part.



On Thu, Jan 9, 2014 at 4:17 PM, Aureliano Buendia <[hidden email]> wrote:



On Thu, Jan 9, 2014 at 5:01 AM, Matei Zaharia <[hidden email]> wrote:
Just follow the docs at http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala for how to run an application. Spark is designed so that you can simply run your application *without* any scripts whatsoever, and submit your JAR to the SparkContext constructor, which will distribute it. You can launch your application with “scala”, “java”, or whatever tool you’d prefer.

I'm afraid what you said about 'simply run your application *without* any scripts whatsoever' does not apply to spark at the moment, and it simply does not work.

Try this simple Pi calculation on a standard spark-ec2 instance:

java -cp /root/spark/examples/target/spark-examples_2.9.3-0.8.1-incubating.jar:/root/spark/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.1-incubating-hadoop1.0.4.jar org.apache.spark.examples.SparkPi `cat spark-ec2/cluster-url`

And you'll get the error:

WARN cluster.ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

While the script way works:

spark/run-example org.apache.spark.examples.SparkPi `cat spark-ec2/cluster-url`

What am I missing in the above java command?
 

Matei

On Jan 8, 2014, at 8:26 PM, Aureliano Buendia <[hidden email]> wrote:




On Thu, Jan 9, 2014 at 4:11 AM, Matei Zaharia <[hidden email]> wrote:
Oh, you shouldn’t use spark-class for your own classes. Just build your job separately and submit it by running it with “java” and creating a SparkContext in it. spark-class is designed to run classes internal to the Spark project.

Really? Apparently Eugen runs his jobs by:

$SPARK_HOME/spark-class SPARK_CLASSPATH=PathToYour.jar com.myproject.MyJob

, as he instructed me to do here.

I have to say that while the spark documentation is not sparse, it does not cover enough, and as you can see the community is confused.

Are spark users supposed to create something like run-example for their own jobs?
 

Matei

On Jan 8, 2014, at 8:06 PM, Aureliano Buendia <[hidden email]> wrote:




On Thu, Jan 9, 2014 at 3:59 AM, Matei Zaharia <[hidden email]> wrote:
Have you looked at the cluster UI, and do you see any workers registered there, and your application under running applications? Maybe you typed in the wrong master URL or something like that.

No, it's automated: cat spark-ec2/cluster-url

I think the problem might be caused by the spark-class script. It seems to assign too much memory.

I forgot the fact that run-example doesn't use spark-class.
 

Matei

On Jan 8, 2014, at 7:07 PM, Aureliano Buendia <[hidden email]> wrote:

The strange thing is that spark examples work fine, but when I include a spark example in my jar and deploy it, I get this error for the very same example:

WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

My jar is deployed to master and then to workers by spark-ec2/copy-dir. Why would including the example in my jar cause this error?



On Thu, Jan 9, 2014 at 12:41 AM, Aureliano Buendia <[hidden email]> wrote:
Could someone explain how SPARK_MEM, SPARK_WORKER_MEMORY and spark.executor.memory should be related so that this unhelpful error doesn't occur?

Maybe there are more env and java config variables about memory that I'm missing.

By the way, the bit of the error asking to check the web UI is just redundant. The UI is of no help.


On Wed, Jan 8, 2014 at 4:31 PM, Aureliano Buendia <[hidden email]> wrote:
Hi,


My spark cluster is not able to run a job due to this warning:

WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

The workers have this status:

ALIVE   2 (0 Used)   6.3 GB (0.0 B Used)
So there must be plenty of memory available despite the warning message. I'm using the default spark config. Is there a config parameter that needs changing for this to work?












Re: WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

Christopher Nguyen

Aureliano, this sort of jar-hell is something we have to deal with, whether Spark or elsewhere. How would you propose we fix this with Spark? Do you mean that Spark's own scaffolding caused you to pull in both Protobuf 2.4 and 2.5? Or do you mean the error message should have been more helpful?

Sent while mobile. Pls excuse typos etc.


Re: WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

Aureliano Buendia



On Tue, Jan 14, 2014 at 5:52 PM, Christopher Nguyen <[hidden email]> wrote:

Aureliano, this sort of jar-hell is something we have to deal with, whether Spark or elsewhere. How would you propose we fix this with Spark?

Do you mean that Spark's own scaffolding caused you to pull in both Protobuf 2.4 and 2.5?

I simply used the newer protobuf for higher efficiency. I had no idea this could conflict with spark.

Or do you mean the error message should have been more helpful?

That error is actually a warning, and the warning doesn't even know what went wrong. It asks the user to check the web UI for two unrelated points: (1) that the workers are registered and (2) that there is enough memory:

https://github.com/apache/incubator-spark/blob/fdaabdc67387524ffb84354f87985f48bd31cf60/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L150-L156

In my case, spark has no idea that hadoop is failing. I think the above error checking is weak. If the workers are not registered, spark must report so. More importantly, if there is not enough memory, spark must be able to report exactly how much memory is needed and, knowing all about the allocated resources, it should even tell the user the size of the memory shortage.

Another major problem is the settings mess in spark.

You can set the spark.executor.memory property, or you can set the SPARK_MEM env variable.
After you set these, they are not tied to the java heap size, so you need to set that up too, as spark-class does. Then there is another parameter: SPARK_WORKER_MEMORY.

So the user has to fiddle around with many parameters to get rid of that warning, but even after doing that, it is not clear whether that set of parameters is the optimal way of using the resources. Spark could probably automate this as much as possible.
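
To make the "report the shortage amount" idea concrete, here is a rough sketch of the kind of check meant above, using the 75G executor on a 70G machine example from earlier in the thread. Everything in it is hypothetical: MemoryCheck, toMegabytes and shortfallMb are invented names, Spark does not ship such a class, and only sizes ending in m or g (as in "512m" or "2g") are assumed.

```java
import java.util.Locale;

// Illustrative only: Spark does not provide this class.
class MemoryCheck {

    // Parses sizes such as "512m" or "2g" into megabytes.
    static long toMegabytes(String size) {
        String s = size.trim().toLowerCase(Locale.ROOT);
        long number = Long.parseLong(s.substring(0, s.length() - 1));
        char unit = s.charAt(s.length() - 1);
        switch (unit) {
            case 'm': return number;
            case 'g': return number * 1024;
            default: throw new IllegalArgumentException("unsupported unit in: " + size);
        }
    }

    // Returns by how many megabytes the request exceeds the worker, or 0 if it fits.
    static long shortfallMb(String executorMemory, String workerMemory) {
        return Math.max(0L, toMegabytes(executorMemory) - toMegabytes(workerMemory));
    }

    public static void main(String[] args) {
        // The 75G request on a 70G machine mentioned earlier in the thread.
        System.out.println("shortfall: " + shortfallMb("75g", "70g") + " MB");
    }
}
```

A warning built on a number like this could say how far the request is over the limit, instead of only pointing at the web UI.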
