Programmatically get status of job (WAITING/RUNNING)


Programmatically get status of job (WAITING/RUNNING)

bsikander
Hi,

I have a Spark cluster running in client mode, and I submit jobs to it programmatically; under the hood I am using spark-submit.

If the cluster is overloaded and I start a context, the driver JVM keeps waiting for executors. The executors stay in a waiting state because the cluster does not have enough resources. Here are the messages from the driver logs:

2017-10-27 13:20:15,260 WARN Timer-0 org.apache.spark.scheduler.TaskSchedulerImpl []: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2017-10-27 13:20:30,259 WARN Timer-0 org.apache.spark.scheduler.TaskSchedulerImpl []: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Is it possible to programmatically check the status of the application (e.g. RUNNING/WAITING)? I know that we can take the application id and query the history server, but I would like a solution that does not involve REST calls to the history server.

Shouldn't the SparkContext know about this state? How can I get this information from sc?


Regards,

Behroz


Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Anyone?



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Programmatically get status of job (WAITING/RUNNING)

Davide.Mandrini
Hello Behroz,

You can use a listener to get updates from the underlying process (cf.
https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/scheduler/SparkListener.html
).

First, you need to create your own SparkAppListener class that implements SparkAppHandle.Listener:
-------------------------------------
// `log` is assumed to be a logger (e.g. slf4j) available in the enclosing class.
private static class SparkAppListener implements SparkAppHandle.Listener, Runnable {

    SparkAppListener() {}

    @Override
    public void stateChanged(SparkAppHandle handle) {
        String sparkAppId = handle.getAppId();
        SparkAppHandle.State appState = handle.getState();
        log.info("Spark job with app id: " + sparkAppId + ", state changed to: " + appState);
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {}

    @Override
    public void run() {}
}
-------------------------------------



Then you can run it on a thread via Executors.newCachedThreadPool (or simply with a new Thread(...)):
-----------------------------------------
private static final ExecutorService listenerService = Executors.newCachedThreadPool();

SparkAppListener appListener = new SparkAppListener();
listenerService.execute(appListener);

SparkLauncher launcher = new SparkLauncher()
        .setAppName(appName)
        .setSparkHome(sparkHome)
        .setAppResource(appResource)
        .setMainClass(mainClass)
        .setMaster(master); // ... plus any further configuration

SparkAppHandle appHandle = launcher.startApplication(appListener);
-----------------------------------------

At this point, every time the state changes, the SparkAppListener.stateChanged method will be invoked.

Hope it helps,
Davide





Re: Programmatically get status of job (WAITING/RUNNING)

bsikander
Thank you for the reply.

I am currently not using SparkLauncher to launch my driver. Rather, I am using the old-fashioned spark-submit, and moving to SparkLauncher is not an option right now.
Do I have any options there?





Re: Programmatically get status of job (WAITING/RUNNING)

Davide.Mandrini
In this case, the only way to check the status is via REST calls to the Spark JSON API, accessible at http://<master>:8888/json/


