Programmatically get status of job (WAITING/RUNNING)

Programmatically get status of job (WAITING/RUNNING)

bsikander
Hi,

I have a Spark cluster running in client mode. I programmatically submit jobs to the cluster; under the hood, I am using spark-submit.

If my cluster is overloaded and I start a context, the driver JVM keeps waiting for executors. The executors are in WAITING state because the cluster does not have enough resources. Here are the messages from the driver logs:

2017-10-27 13:20:15,260 WARN Timer-0 org.apache.spark.scheduler.TaskSchedulerImpl []: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2017-10-27 13:20:30,259 WARN Timer-0 org.apache.spark.scheduler.TaskSchedulerImpl []: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Is it possible to programmatically check the status of the application (e.g. RUNNING/WAITING)? I know that we can take the application id and query the history server, but I would like a solution that does not involve REST calls to the history server.

The SparkContext should know about this state, shouldn't it? How can I get this information from sc?


Regards,

Behroz


Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Anyone?



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Programmatically get status of job (WAITING/RUNNING)

Davide.Mandrini
Hello Behroz,

you can use a SparkAppHandle.Listener to get updates from the underlying launcher process (cf.
https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/launcher/SparkAppHandle.Listener.html
)

You first need to create your own SparkAppListener class that implements it:
-------------------------------------
private static class SparkAppListener implements SparkAppHandle.Listener, Runnable {

    SparkAppListener() {}

    @Override
    public void stateChanged(SparkAppHandle handle) {
        String sparkAppId = handle.getAppId();
        SparkAppHandle.State appState = handle.getState();
        log.info("Spark job with app id: " + sparkAppId + ", state changed to: " + appState);
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {}

    @Override
    public void run() {}
}
-----------------------------------------



Then you can run it on a thread via Executors.newCachedThreadPool() (or simply with new Thread(<your listener>)):
-----------------------------------------
private final static ExecutorService listenerService =
        Executors.newCachedThreadPool();

SparkAppListener appListener = new SparkAppListener();
listenerService.execute(appListener);

SparkLauncher launcher = new SparkLauncher()
        .setAppName(appName)
        .setSparkHome(sparkHome)
        .setAppResource(appResource)
        .setMainClass(mainClass)
        .setMaster(master); // ... further configuration elided

SparkAppHandle appHandle = launcher.startApplication(appListener);
-----------------------------------------

At this point, the SparkAppListener.stateChanged method will be executed every time the state changes.

Hope it helps,
Davide




Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Thank you for the reply.

I am currently not using SparkLauncher to launch my driver. Rather, I am using old-fashioned spark-submit, and moving to SparkLauncher is not an option right now.
Do I have any options there?




Re: Programmatically get status of job (WAITING/RUNNING)

Davide.Mandrini
In this case, the only way to check the status is via REST calls to the Spark JSON API, accessible at http://<master>:8080/json/ (8080 is the default port of the standalone Master web UI).
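As a rough sketch of consuming that endpoint: the snippet below extracts an application's state from the Master's JSON response. The sample response body and application id are made up for illustration, and a real JSON library would be more robust than the regex used here; the point is only that the per-application state is directly readable from that payload.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MasterJsonStatus {
    // Hypothetical, trimmed-down sample of the Master's /json response;
    // the id and the subset of fields here are made up for illustration.
    static final String SAMPLE =
        "{\"activeapps\":[{\"id\":\"app-20171207120600-0001\","
        + "\"name\":\"myApp\",\"state\":\"WAITING\",\"cores\":0}]}";

    // Pull the "state" field out of the application object with the given id.
    // Returns null if the id is not present in the response.
    static String stateOf(String json, String appId) {
        Pattern app = Pattern.compile(
            "\\{[^{}]*\"id\":\"" + Pattern.quote(appId) + "\"[^{}]*\\}");
        Matcher m = app.matcher(json);
        if (!m.find()) return null;
        Matcher s = Pattern.compile("\"state\":\"([A-Z]+)\"").matcher(m.group());
        return s.find() ? s.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(stateOf(SAMPLE, "app-20171207120600-0001")); // WAITING
    }
}
```

In practice you would fetch the body from http://<master>:8080/json/ (e.g. with HttpURLConnection) and feed it to a parser in place of the hardcoded sample.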




Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
So, I tried to use SparkAppHandle.Listener with SparkLauncher as you suggested. The behavior of the Launcher is not what I expected.

1- If I start the job (using SparkLauncher) and my Spark cluster has enough cores available, I receive events in my class extending SparkAppHandle.Listener and I see the status changing from UNKNOWN -> CONNECTED -> SUBMITTED -> RUNNING. All good here.

2- If my Spark cluster has cores only for my driver process (running in cluster mode) but no cores for my executors, then I still receive the RUNNING event. I was expecting something else: since my executors have no cores and the Master UI shows WAITING state for them, the listener should report SUBMITTED instead of RUNNING.

3- If my Spark cluster has no cores even for the driver process, then SparkLauncher invokes no events at all and the state stays UNKNOWN. I would have expected it to be at least SUBMITTED.

*Is there any way to reliably get the WAITING state of a job?*
- Driver=RUNNING, executor=RUNNING: overall state should be RUNNING
- Driver=RUNNING, executor=WAITING: overall state should be SUBMITTED/WAITING
- Driver=WAITING, executor=WAITING: overall state should be CONNECTED/SUBMITTED/WAITING


Re: Programmatically get status of job (WAITING/RUNNING)

Marcelo Vanzin
SparkLauncher operates at a different layer than Spark applications.
It doesn't know about executors or driver or anything, just whether
the Spark application was started or not. So it doesn't work for your
case.

The best option for your case is to install a SparkListener and
monitor events. But that will not tell you when things do not happen,
just when they do happen, so maybe even that is not enough for you.


On Mon, Dec 4, 2017 at 1:06 AM, bsikander <[hidden email]> wrote:




--
Marcelo


Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Thank you for the reply.

I am not a Spark expert, but I was reading through the code and I thought that the state was changed from SUBMITTED to RUNNING only after the executors (CoarseGrainedExecutorBackend) were registered.
https://github.com/apache/spark/commit/015f7ef503d5544f79512b6333326749a1f0c48b#diff-a755f3d892ff2506a7aa7db52022d77cR95

Since, as you mentioned, the Launcher has no idea about executors, my understanding is probably not correct.



SparkListener is an option, but it has its own pitfalls.
1) If I use spark.extraListeners, I get all the events but I cannot customize the listener, since I have to pass the class as a string to spark-submit/Launcher.
2) If I use context.addSparkListener, I can customize the listener but then I miss the onApplicationStart event. Also, I don't know Spark's logic for changing the state of an application from WAITING -> RUNNING.

Maybe you can answer this: if I have a Spark job which needs 3 executors and the cluster can only provide 1 executor, will the application be WAITING or RUNNING?
If I know Spark's logic, then I can program something with the SparkListener.onExecutorAdded event to correctly figure out the state.

One other alternative can be to use the Spark Master JSON endpoint (http://<master>:8080/json), but the problem with this is that it returns everything and I was not able to find any way to filter it.




Re: Programmatically get status of job (WAITING/RUNNING)

Marcelo Vanzin
On Tue, Dec 5, 2017 at 12:43 PM, bsikander <[hidden email]> wrote:
> 2) If I use context.addSparkListener, I can customize the listener but then
> I miss the onApplicationStart event. Also, I don't know the Spark's logic to
> changing the state of application from WAITING -> RUNNING.

I'm not sure I follow you here. This is something that you are
defining, not Spark.

"SparkLauncher" has its own view of that those mean, and it doesn't match yours.

"SparkListener" has no notion of whether an app is running or not.

It's up to you to define what waiting and running mean in your code,
and map the events Spark provides you to those concepts.

e.g., a job is running after your listener gets an "onJobStart" event.
But the application might have been running already before that job
started.
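As a concrete sketch of that mapping: the class below mirrors the relevant SparkListener callbacks (onJobStart, onJobEnd, onExecutorAdded) as plain methods rather than extending the real listener, so it runs standalone; the definition of "running" it encodes is just one possible choice, not Spark's.

```java
// Sketch of mapping listener events onto a user-defined job state.
// The method names mirror SparkListener's callbacks, but this plain class
// does not depend on Spark; wire the real callbacks through to it.
class JobStateMapper {
    private int activeJobs = 0;
    private int executors = 0;

    synchronized void onJobStart()      { activeJobs++; }
    synchronized void onJobEnd()        { activeJobs--; }
    synchronized void onExecutorAdded() { executors++; }

    // One possible definition: "running" means at least one active job
    // and at least one registered executor; otherwise we call it waiting.
    synchronized String state() {
        return (activeJobs > 0 && executors > 0) ? "RUNNING" : "WAITING";
    }
}

class JobStateMapperDemo {
    public static void main(String[] args) {
        JobStateMapper m = new JobStateMapper();
        m.onJobStart();                // a job started, but no executors yet
        System.out.println(m.state()); // WAITING
        m.onExecutorAdded();           // first executor registered
        System.out.println(m.state()); // RUNNING
    }
}
```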

--
Marcelo


Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Marcelo Vanzin wrote
> I'm not sure I follow you here. This is something that you are
> defining, not Spark.

Yes, you are right. In my code,
1) my notion of RUNNING is that both the driver and the executors are in RUNNING state;
2) my notion of WAITING is that either the driver or an executor is in WAITING state.

So,
- SparkLauncher provides me the details about the "driver": RUNNING/SUBMITTED/WAITING.
- SparkListener provides me the details about the "executors" via onExecutorAdded/onExecutorDeleted.

I want to combine both SparkLauncher and SparkListener to achieve my view of RUNNING/WAITING.

The only thing confusing me here is that I don't know how Spark internally moves an application from WAITING to RUNNING state. For example, if an application wanted 4 executors (spark.executor.instances=4) but the cluster can only provide 1 executor, I will only receive 1 onExecutorAdded event. Will the application state change to RUNNING (even though only 1 executor was allocated)?

Once I am clear on this logic, I can implement my feature.




Re: Programmatically get status of job (WAITING/RUNNING)

Marcelo Vanzin
On Thu, Dec 7, 2017 at 11:40 AM, bsikander <[hidden email]> wrote:
> For example, if an application wanted 4 executors
> (spark.executor.instances=4) but the spark cluster can only provide 1
> executor. This means that I will only receive 1 onExecutorAdded event. Will
> the application state change to RUNNING (even if 1 executor was allocated)?

What application state are you talking about? That's the thing that
you seem to be confused about here.

As you've already learned, SparkLauncher only cares about the driver.
So RUNNING means the driver is running.

And there's no concept of running anywhere else I know of that is
exposed to Spark applications. So I don't know which code you're
referring to when you say "the application state change to RUNNING".

--
Marcelo


Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/Screen_Shot_2017-12-07_at_20.png>

See the image. I am referring to this state when I say "Application State".




Re: Programmatically get status of job (WAITING/RUNNING)

Marcelo Vanzin
That's the Spark Master's view of the application. I don't know exactly what it means in the different run modes; I'm more familiar with YARN. But I wouldn't be surprised if, as with the others, it mostly tracks the driver's state.

On Thu, Dec 7, 2017 at 12:06 PM, bsikander <[hidden email]> wrote:




--
Marcelo


Re: Programmatically get status of job (WAITING/RUNNING)

Qiao, Richard
For #2, do you mean "RUNNING" showing in the "Driver" table? If yes, that is not a problem, because the driver does run while no executor is available; that combination (driver running, no executors) can itself be a status for you to catch.
Comparing #1 and #3, my understanding of "submitted" is "the jar is submitted to the executors". With this concept, you may define your own status.

Best Regards
Richard


On 12/4/17, 4:06 AM, "bsikander" <[hidden email]> wrote:



Re: Programmatically get status of job (WAITING/RUNNING)

Qiao, Richard
For your question in the example, the answer is yes:

> For example, if an application wanted 4 executors (spark.executor.instances=4) but the spark cluster can only provide 1 executor. This means that I will only receive 1 onExecutorAdded event. Will the application state change to RUNNING (even if 1 executor was allocated)?


Best Regards
Richard


On 12/7/17, 2:40 PM, "bsikander" <[hidden email]> wrote:



Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Qiao, Richard wrote
> For your question of example, the answer is yes.

Perfect. I am assuming that this is true for Spark standalone/YARN/Mesos.





Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Qiao, Richard wrote
> Comparing #1 and #3, my understanding of “submitted” is “the jar is
> submitted to executors”. With this concept, you may define your own
> status.

In SparkLauncher, SUBMITTED means that the driver was able to acquire cores from the Spark cluster and the Launcher is waiting for the driver to connect back. Once it connects back, the state of the driver is changed to CONNECTED.
As Marcelo mentioned, the Launcher can only tell me about the driver state; it is not possible to infer the state of the "application (executors)" from it. For the state of the executors we can use a SparkListener.

With the combination of Launcher + Listener, I have a solution. As you mentioned, even if only 1 executor is allocated to the "application", the state will change to RUNNING. So in my application, I change the status of my job to RUNNING only if I receive RUNNING from the Launcher and an onExecutorAdded event from the SparkListener.
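A minimal sketch of that combination rule, under the assumption that the two setters are wired to SparkAppHandle.Listener.stateChanged and SparkListener.onExecutorAdded respectively. The enum here copies the relevant SparkAppHandle.State names instead of importing the launcher class, so the snippet is self-contained; the real enum lives in org.apache.spark.launcher and has additional final states.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CombinedJobState {
    // Local copy of the relevant SparkAppHandle.State names (illustrative only).
    enum LauncherState { UNKNOWN, CONNECTED, SUBMITTED, RUNNING }

    private volatile LauncherState driverState = LauncherState.UNKNOWN;
    private final AtomicBoolean executorSeen = new AtomicBoolean(false);

    // Call from SparkAppHandle.Listener.stateChanged(handle).
    void driverStateChanged(LauncherState s) { driverState = s; }

    // Call from SparkListener.onExecutorAdded(event).
    void executorAdded() { executorSeen.set(true); }

    // The rule described above: report RUNNING only once the Launcher reports
    // RUNNING *and* at least one executor has registered; otherwise WAITING.
    String overall() {
        return (driverState == LauncherState.RUNNING && executorSeen.get())
                ? "RUNNING" : "WAITING";
    }
}
```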


