Spark Deployment Strategy

codingkapoor
I want to understand how best to deploy Spark close to a data source or sink.

Let's say I have a Vertica cluster that I need to run Spark jobs against. In
that case, how should the Spark cluster be set up?

1. Should we run a Spark worker node on each Vertica cluster node?
2. How does that setup hold up once shuffling comes into play?
3. What would the deployment look like in a managed cluster environment
such as Kubernetes?


