Spark production scenario



R Nair (रविशंकर नायर)
Hi all,

We are going to move to production with an 8-node Spark cluster and would appreciate some help with the question below.

We are running on the YARN cluster manager, which means YARN is installed with SSH between the nodes. When we run a standalone Spark program with spark-submit, YARN starts an application master for each application. It is allocated randomly, with an arbitrary port. So, would we have to open all ports between the nodes in a production implementation?

Best,
Passion


Re: Spark production scenario

yncxcw
Hi Passion,

I don't know an exact solution, but yes, the port each executor chooses to
communicate with the driver is random. I am wondering whether you could have
one node with two Ethernet cards, configuring one card for an intranet used by
Spark and the other for the WAN, and then connect the rest of the nodes through
the intranet.

Also, I think you probably should not use the WAN for Spark data transfer,
since the amount of data moved during a shuffle is huge. You need a high-speed
switch for your cluster.
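
On the random ports mentioned above: as far as I know, Spark lets you pin some of them with the configuration properties spark.driver.port, spark.driver.blockManager.port and spark.blockManager.port. When a fixed port is already taken, Spark retries on the next port, up to spark.port.maxRetries times. A rough sketch of how the resulting ranges could be worked out for firewall rules is below. The base port numbers and the helper names are my own assumptions for illustration, not recommended values, and the sketch does not cover the ports used by the YARN daemons themselves.

```python
# Sketch only: pin Spark's normally random ports so firewall rules can target them.
# The base port numbers are arbitrary assumptions chosen for illustration.

PINNED_PORTS = {
    "spark.driver.port": 40000,               # executors connect back to the driver here
    "spark.driver.blockManager.port": 40100,  # block manager running on the driver
    "spark.blockManager.port": 40200,         # block managers running on the executors
}

# spark.port.maxRetries: on a bind conflict Spark tries port+1, port+2, ...
MAX_RETRIES = 16


def submit_conf_args(ports=PINNED_PORTS, max_retries=MAX_RETRIES):
    """Return the --conf arguments to pass to spark-submit (hypothetical helper)."""
    args = []
    for key, port in ports.items():
        args += ["--conf", f"{key}={port}"]
    args += ["--conf", f"spark.port.maxRetries={max_retries}"]
    return args


def firewall_ranges(ports=PINNED_PORTS, max_retries=MAX_RETRIES):
    """Return the (first, last) port range to open between nodes for each setting."""
    return {key: (port, port + max_retries) for key, port in ports.items()}


if __name__ == "__main__":
    # your_app.py is a placeholder for the actual application.
    print("spark-submit " + " ".join(submit_conf_args()) + " your_app.py")
    for key, (first, last) in firewall_ranges().items():
        print(f"{key}: open {first}-{last}")
```

With something like this, only a few small port ranges would need to be open between the nodes rather than every port. Please check it against the Spark version you actually deploy.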

Hope this answer helps!


Wei


