Submitting job to Yarn's ResourceManager

Submitting job to Yarn's ResourceManager

DB Tsai

Hi guys,

We were able to submit our Spark application to YARN's ResourceManager (PivotalHD 1.0.1), and the expected result is printed in the stdout of the container log. However, when we opened the Tracking UI, we got the HTTP 500 error shown at the end of this message. Traditional MapReduce jobs do not have this problem. Are there any suggestions for how to dig into it?

Secondly, we submit our job from our Java app, and since the Spark application actually runs on a remote machine, there seems to be no easy way for our main application, running locally, to interact with the Spark application running remotely. For example, to learn the progress of the Spark application, we have to write it to HDFS and then read it back in our main application. The same goes for retrieving the final result of the Spark job. Is there any way to interact through some kind of API without going through HDFS?

Finally, we sometimes would like to test some code quickly through spark-shell. It seems that every operation runs locally when we launch spark-shell in yarn-client mode (I know that running spark-shell in yarn-standalone mode is not supported). Is there any way to make spark-shell run in a distributed fashion with YARN?

Thanks.

HTTP ERROR 500

Problem accessing /proxy/application_1389853114516_1011/. Reason:

    Server Error

Caused by:

java.io.IOException: java.net.URISyntaxException: Expected authority at index 7: http://
	at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:318)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:652)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1320)
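
The message in the trace above is what `java.net.URI` reports when it is handed the bare string `http://` with no host after it, which suggests (as an assumption, not something confirmed in this thread) that the proxy received an empty tracking URL for the application. A minimal sketch that reproduces the message locally; the class name `TrackingUrlCheck` and the sample host are made up for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class TrackingUrlCheck {
    /** Returns null if the URL parses, otherwise the parser's error message. */
    static String parseError(String url) {
        try {
            new URI(url);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A scheme followed by an empty authority, as in the stack trace.
        System.out.println(parseError("http://"));
        // A hypothetical well-formed tracking URL for comparison.
        System.out.println(parseError("http://host.example:4040"));
    }
}
```

If the first call prints the same "Expected authority" text, the 500 comes from URL parsing inside the proxy rather than from the application's own web UI.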


Sincerely,

DB Tsai
Machine Learning Engineer
Alpine Data Labs
--------------------------------------
Re: Submitting job to Yarn's ResourceManager

Tom Graves
The tracking URL will return an error if the application has completed or if you are running in yarn-client mode (that hasn't been hooked up). In yarn-client mode you can get the URL for the UI from the messages that are printed on the console.

In yarn-client mode, are you setting the environment variables to ask for workers, etc.? Also make sure the Hadoop conf directory is in your classpath.

Tom
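
One quick way to check the classpath point above from the submitting JVM is to look for the Hadoop site files as classpath resources, since Hadoop's configuration classes load `core-site.xml` and `yarn-site.xml` that way. This is only a sketch: the class name `HadoopConfCheck` is hypothetical, and finding the files does not prove that their contents are correct.

```java
public class HadoopConfCheck {
    /** True if the named resource is visible to this class's class loader. */
    static boolean onClasspath(String resourceName) {
        return HadoopConfCheck.class.getClassLoader().getResource(resourceName) != null;
    }

    public static void main(String[] args) {
        // File names Hadoop looks up on the classpath by default.
        for (String name : new String[] {"core-site.xml", "yarn-site.xml"}) {
            String status = onClasspath(name) ? "found" : "NOT found";
            System.out.println(name + ": " + status + " on classpath");
        }
    }
}
```

Run it with the same classpath used to launch the application or spark-shell. If either file is reported as not found, the client would typically fall back to default settings instead of the cluster's.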


--------------------------------------