Integration about submitting and monitoring spark tasks

jianl miao
I need to integrate Spark into our own platform, which is built with Spring, so that it can submit and monitor tasks. The Spark tasks run on YARN in cluster mode, and our service may submit tasks to several different YARN clusters.
Spark currently provides two classes for this, SparkLauncher and InProcessLauncher. The former calls spark-submit in a child process, and the latter invokes the SparkSubmit class in the current process. Both go through the launcher, and each submission keeps monitoring its own task, so YARN application status queries become very frequent as the number of tasks grows. Ideally, the submit program would exit once the submission has completed and the applicationId of the task is known, and we would then monitor the status of all our tasks in one place. But at the moment I can't stop the submit program, even if I set waitAppCompletion to false, because launcherBackend.isConnected is true. This is very resource intensive.
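
To make the "monitor all tasks in one place" idea concrete, here is a minimal sketch in plain Java. It is not Spark or YARN API: the class name CentralAppMonitor, its methods, and the statusLookup function are all hypothetical names made up for illustration. The assumption is that the applicationId is already known after submission, and that statusLookup can return the YARN state of one application on one cluster as a string (for example by wrapping YarnClient.getApplicationReport or the ResourceManager REST API, which is left out here).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical helper: one poller for every submitted application,
// instead of one monitoring loop per submission.
class CentralAppMonitor {
    // Terminal YARN application states; these are not polled again.
    private static final Set<String> TERMINAL =
            new HashSet<>(Arrays.asList("FINISHED", "FAILED", "KILLED"));

    // Key is "clusterId|applicationId", value is the last state seen.
    private final Map<String, String> states = new ConcurrentHashMap<>();

    // Assumption: (clusterId, applicationId) -> current state name.
    private final BiFunction<String, String, String> statusLookup;

    CentralAppMonitor(BiFunction<String, String, String> statusLookup) {
        this.statusLookup = statusLookup;
    }

    // Call once per submission, as soon as the applicationId is known.
    void register(String clusterId, String applicationId) {
        states.put(clusterId + "|" + applicationId, "SUBMITTED");
    }

    // One pass over all tracked applications; meant to be driven by a
    // single scheduled job rather than by each submission.
    void pollOnce() {
        for (Map.Entry<String, String> e : states.entrySet()) {
            if (TERMINAL.contains(e.getValue())) {
                continue; // already finished, no further queries
            }
            String[] parts = e.getKey().split("\\|", 2);
            String state = statusLookup.apply(parts[0], parts[1]);
            if (state != null) {
                states.put(e.getKey(), state);
            }
        }
    }

    // Last state seen for one application, or null if it is not tracked.
    String stateOf(String clusterId, String applicationId) {
        return states.get(clusterId + "|" + applicationId);
    }

    // Number of applications that still need polling.
    long activeCount() {
        return states.values().stream()
                .filter(s -> !TERMINAL.contains(s))
                .count();
    }
}
```

With something like this, the query rate against each ResourceManager depends on the polling interval of one job (in a Spring service, for instance, a single scheduled method calling pollOnce) rather than on how many launcher instances are alive. It does not by itself solve the problem of making the submit program exit after submission.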