Unable to submit an application to standalone cluster which on hdfs.

Unable to submit an application to standalone cluster which on hdfs.

samuel281
I'm trying to launch an application inside the cluster (standalone mode).

According to the docs, the jar URL can be in either file:// or hdfs:// format (https://spark.incubator.apache.org/docs/latest/spark-standalone.html).

But when I run spark-class, it seems unable to parse the hdfs:// format.

<command>
spark-class org.apache.spark.deploy.Client launch \ 
    cds-test05:7077 \ 
    hdfs:///namenode:8020/user/datalab/filename.jar \ 
    my.package.Runner \ 
    -i /user/myself/input -o /user/myself/output -m spark://sparkmaster:7077 

<output>
Jar url 'hdfs:///namenode:8020/user/datalab/filename.jar' is not a valid URL
Jar must be in URL format (e.g. hdfs://XX, file://XX) 

I've found that the ClientArguments class uses the java.net.URL class to parse the jar URL, and it doesn't support the hdfs protocol.
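The failure is easy to reproduce outside Spark. A minimal Java sketch (the hostname and paths are placeholders) showing that a stock JDK cannot even construct a java.net.URL with an hdfs scheme, while file:// parses because the JDK ships a handler for it:

```java
import java.net.MalformedURLException;
import java.net.URL;

public class HdfsUrlCheck {
    // Returns the exception message, or null if the URL parsed successfully.
    static String tryParse(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Stock JDKs ship no hdfs:// protocol handler, so this fails
        // with something like "unknown protocol: hdfs".
        System.out.println(tryParse("hdfs://namenode:8020/user/datalab/filename.jar"));
        // file:// has a built-in handler, so this parses fine and prints null.
        System.out.println(tryParse("file:///user/datalab/filename.jar"));
    }
}
```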
Reply | Threaded
Open this post in threaded view
|

Re: Unable to submit an application to standalone cluster which on hdfs.

Akhil Das
It says "not a valid URL"

hdfs:///  - Invalid
hdfs://   - Valid

Hope that helps!


Thanks
Best Regards.
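As an aside, the slash count does change how the string parses. A small Java sketch using java.net.URI (which, unlike java.net.URL, accepts arbitrary schemes; the hostname is a placeholder) shows that with hdfs:/// the authority is empty and namenode:8020 is swallowed into the path:

```java
import java.net.URI;

public class SlashCount {
    public static void main(String[] args) throws Exception {
        // Two slashes: "namenode:8020" is parsed as the authority.
        URI good = new URI("hdfs://namenode:8020/user/datalab/filename.jar");
        System.out.println(good.getHost() + " " + good.getPort() + " " + good.getPath());
        // -> namenode 8020 /user/datalab/filename.jar

        // Three slashes: the authority is empty, so the namenode
        // ends up at the start of the path and the host is null.
        URI bad = new URI("hdfs:///namenode:8020/user/datalab/filename.jar");
        System.out.println(bad.getHost() + " " + bad.getPath());
        // -> null /namenode:8020/user/datalab/filename.jar
    }
}
```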



Re: Unable to submit an application to standalone cluster which on hdfs.

samuel281
Actually, I tried both (hdfs:/// and hdfs://).
I even tried creating a java.net.URL instance directly in test code:

URL test = new URL("hdfs://namenode:8020/path/to/jar");

It throws java.net.MalformedURLException, and the message says the hdfs protocol is not supported.

In the source code, ClientArguments simply instantiates a URL object and nothing more; no URLStreamHandler is registered either. (https://github.com/apache/incubator-spark/blob/v0.9.0-incubating/core/src/main/scala/org/apache/spark/deploy/ClientArguments.scala)

Has anybody come across the same issue?
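For anyone hitting the same wall: one way the check could be written without java.net.URL is to parse with java.net.URI, which validates the generic URI syntax but doesn't require a registered protocol handler. This is only a sketch of a possible fix, not Spark's actual code; isValidJarUrl is a hypothetical helper:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class JarUrlValidator {
    // Hypothetical replacement for a URL-based check: java.net.URI
    // accepts any scheme syntactically (so hdfs:// passes), while still
    // rejecting strings that are not well-formed URIs at all.
    static boolean isValidJarUrl(String spec) {
        try {
            URI uri = new URI(spec);
            return uri.getScheme() != null
                && uri.getPath() != null
                && uri.getPath().endsWith(".jar");
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidJarUrl("hdfs://namenode:8020/user/datalab/filename.jar")); // true
        System.out.println(isValidJarUrl("not a url")); // false
    }
}
```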


Re: Unable to submit an application to standalone cluster which on hdfs.

Patrick Wendell
Thanks for reporting this - this is a bug with the way it validates
the URL. I'm filing this as a blocker for 0.9.1. If you are able to
compile Spark, try just removing the validation block.
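If recompiling isn't an option, another purely illustrative workaround is to register a URLStreamHandlerFactory for the hdfs scheme before the URL is constructed, so java.net.URL can at least parse it. This is a hypothetical sketch, not something Spark does; note that setURLStreamHandlerFactory may be called at most once per JVM, hence the guard:

```java
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

public class HdfsHandlerDemo {
    private static boolean registered = false;

    // Teach java.net.URL to parse the hdfs scheme. Parsing only; this
    // handler cannot actually open connections.
    static synchronized void registerHdfsHandler() {
        if (registered) return;
        registered = true;
        URL.setURLStreamHandlerFactory(protocol ->
            "hdfs".equals(protocol)
                ? new URLStreamHandler() {
                      @Override
                      protected URLConnection openConnection(URL u) {
                          throw new UnsupportedOperationException("parse-only hdfs handler");
                      }
                  }
                : null); // null falls back to the JDK's built-in handlers
    }

    public static void main(String[] args) throws Exception {
        registerHdfsHandler();
        URL jar = new URL("hdfs://namenode:8020/user/datalab/filename.jar");
        System.out.println(jar.getHost() + ":" + jar.getPort());
        // -> namenode:8020
    }
}
```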


Re: Unable to submit an application to standalone cluster which on hdfs.

haikal.pribadi
How do you remove the validation block before compiling?

Thank you