Spark Kubernetes Architecture: Deployments vs Pods that create Pods


Spark Kubernetes Architecture: Deployments vs Pods that create Pods

WILSON Frank

Hi,

 

I’ve been playing around with Spark Kubernetes deployments over the past week and I’m curious to know why Spark deploys as a driver pod that creates more worker pods.

 

I’ve read that it’s normal to use Kubernetes Deployments to create a distributed service, so I am wondering why Spark just creates Pods. I suppose the driver program is ‘the odd one out’, so it doesn’t belong in a Deployment or ReplicaSet, but maybe the workers could be a Deployment? Is this something to do with data locality?
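To show what I mean: when I list the Pods for a running job, the driver and executors show up as bare Pods rather than as children of a ReplicaSet. A small sketch of how I grouped them (assuming the `spark-role` label that Spark’s Kubernetes scheduler attaches; the pod metadata dicts here are hypothetical stand-ins for what a Kubernetes client would return):

```python
# Sketch: group pod metadata by the 'spark-role' label that Spark's
# Kubernetes scheduler puts on the Pods it creates. The pod dicts are
# hypothetical; a real client library would return richer objects.

def group_by_spark_role(pods):
    """Partition pod metadata dicts by their 'spark-role' label."""
    roles = {"driver": [], "executor": []}
    for pod in pods:
        role = pod.get("labels", {}).get("spark-role")
        if role in roles:
            roles[role].append(pod["name"])
    return roles

pods = [
    {"name": "myapp-driver", "labels": {"spark-role": "driver"}},
    {"name": "myapp-exec-1", "labels": {"spark-role": "executor"}},
    {"name": "myapp-exec-2", "labels": {"spark-role": "executor"}},
]
print(group_by_spark_role(pods))
```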

 

I haven’t tried streaming pipelines on Kubernetes yet. Are these also Pods that create Pods rather than Deployments? It seems more important for a streaming pipeline to be ‘durable’[1], as the Kubernetes documentation might say.

 

I ask this question partly because the Kubernetes deployment of Spark is still experimental and I am wondering whether this aspect of the deployment might change.

 

I had a look at the Flink[2] documentation and it does seem to use Deployments; however, these run a lightweight job/task manager that accepts Flink jobs. It actually sounds like running a lightweight version of YARN inside containers on Kubernetes.

 

 

Thanks,

 

 

Frank

 

[1] https://kubernetes.io/docs/concepts/workloads/pods/pod/#durability-of-pods-or-lack-thereof

[2] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html


Re: Spark Kubernetes Architecture: Deployments vs Pods that create Pods

Yinan Li
Hi Wilson,

The behavior of a Deployment doesn't fit the way Spark executor pods are run and managed. Executor pods are created and deleted dynamically at the driver's request, and they normally run to completion. A Deployment assumes uniformity and statelessness of the set of Pods it manages, which is not the case for Spark executors: for example, each executor Pod carries a unique executor ID. Dynamic resource allocation also doesn't play well with a Deployment, because growing or shrinking the number of executor Pods would require a rolling update, which means restarting all the executor Pods. In the Kubernetes mode, the driver is effectively a custom controller of executor Pods: it creates and deletes Pods as needed and watches their status.
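To illustrate the controller pattern, here is a rough sketch of the reconciliation step such a controller performs: compare the requested executor count against the live executor Pods and compute which Pods to create or delete. This is illustrative pseudologic in plain Python, not Spark's actual implementation; the pod names and ID scheme are made up.

```python
# Sketch of a reconcile step for an executor-pod controller.
# Not Spark source code: names and the ID scheme are hypothetical.

def reconcile(desired_count, live_executors):
    """Return (pods_to_create, pods_to_delete) to reach desired_count.

    live_executors maps executor ID (int) -> pod name.
    """
    current = len(live_executors)
    to_create, to_delete = [], []
    if current < desired_count:
        # New executors get fresh, unique IDs; each Pod is distinct,
        # which is why a uniform ReplicaSet template doesn't fit.
        next_id = max(live_executors, default=0) + 1
        for exec_id in range(next_id, next_id + (desired_count - current)):
            to_create.append(f"myapp-exec-{exec_id}")
    elif current > desired_count:
        # Delete only the surplus Pods; the survivors keep running,
        # with no rolling restart of the whole set.
        surplus = sorted(live_executors)[desired_count - current:]
        to_delete = [live_executors[i] for i in surplus]
    return to_create, to_delete

# Scale from 2 executors up to 4: only two new Pods are requested.
print(reconcile(4, {1: "myapp-exec-1", 2: "myapp-exec-2"}))
```

Contrast this with a Deployment, where changing the replica count (or anything in the Pod template) is handled by the Deployment controller's own rollout machinery rather than by the application.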

The way Flink on Kubernetes works, as you said, is basically running the Flink job/task managers using Deployments. An equivalent would be running a standalone Spark cluster on top of Kubernetes. If you want auto-restart for Spark streaming jobs, I would suggest taking a look at the K8s Spark Operator.

On Tue, Jan 29, 2019 at 5:53 AM WILSON Frank <[hidden email]> wrote:



Re: Spark Kubernetes Architecture: Deployments vs Pods that create Pods

Li Gao
Hi Wilson,

As Yinan said, batch jobs with dynamic scaling requirements and direct communication between the driver and executors do not fit the service-oriented Deployment paradigm of Kubernetes. Hence the need to capture these Spark-specific differences in a Kubernetes CRD and a CRD controller that manages the lifecycle of Spark batch jobs on Kubernetes: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator. The CRD makes a Spark job more Kubernetes-native and repeatable.
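As a sketch of what the operator's custom resource looks like, here it is built as a plain Python dict rather than YAML. The field names follow my recollection of the operator's v1beta2 SparkApplication CRD; the image, main class, and jar path are made-up placeholders, so treat the exact values as illustrative:

```python
# Illustrative SparkApplication custom resource for the Spark Operator.
# Field names follow my recollection of the v1beta2 CRD; the image,
# class, and jar path below are hypothetical.

def spark_application(name, image, main_class, jar, executors=2):
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": "default"},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            "image": image,
            "mainClass": main_class,
            "mainApplicationFile": jar,
            # For streaming jobs: restart the driver whenever it exits,
            # giving the 'durability' a bare Pod lacks.
            "restartPolicy": {"type": "Always"},
            "driver": {"cores": 1, "memory": "1g"},
            "executor": {"cores": 1, "memory": "1g", "instances": executors},
        },
    }

app = spark_application(
    "my-streaming-app",
    "example.com/spark:latest",        # hypothetical image
    "com.example.StreamingJob",        # hypothetical main class
    "local:///opt/app/streaming.jar",  # hypothetical jar path
)
print(app["spec"]["restartPolicy"])
```

The operator's controller watches objects of this kind and runs the driver/executor Pods on your behalf, which is what makes the job declarative and repeatable.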

As you discovered, a Deployment is typically used for job-server-style services.

-Li


On Tue, Jan 29, 2019 at 1:49 PM Yinan Li <[hidden email]> wrote: