Local vs Cluster


Local vs Cluster

Aakash Basu-2
Hi,

What is the Spark cluster equivalent of standalone's local[N]? I mean, for the value N that we set as the parameter of local, which parameter takes it in cluster mode?

Thanks,
Aakash.

Re: Local vs Cluster

Mich Talebzadeh
Local mode uses only one JVM, which runs on the host from which you submitted the job:

${SPARK_HOME}/bin/spark-submit \
    --master local[N] \

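For instance, a complete local-mode submission might look like the following sketch (the class name and jar path are placeholders, not taken from this thread):

# A minimal sketch: local[4] runs 4 worker threads inside one JVM on this host;
# local[*] would use as many threads as the host has cores.
${SPARK_HOME}/bin/spark-submit \
    --master local[4] \
    --class com.example.MyApp \
    /path/to/my-app.jar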
Standalone means using Spark's own scheduler:

${SPARK_HOME}/bin/spark-submit \
    --master spark://<IP_ADDRESS> \


where IP_ADDRESS is the host on which your Spark master was started.

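To answer the original question directly: in standalone cluster mode the closest analogue of the N in local[N] is the total number of cores the application is allowed to use, typically set with --total-executor-cores. A sketch with example values (the host, port and numbers are placeholders):

# Hypothetical standalone submission; 7077 is the default master port.
# --total-executor-cores caps the cores used across the whole cluster,
# roughly playing the role of N in local[N].
${SPARK_HOME}/bin/spark-submit \
    --master spark://192.168.1.10:7077 \
    --total-executor-cores 8 \
    --executor-memory 2g \
    --class com.example.MyApp \
    /path/to/my-app.jar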
In standalone mode you can have a master and multiple workers running on the master host as well as on other hosts in the cluster.

You start the master from $SPARK_HOME/sbin:

start-master.sh

and the workers from:

start-slaves.sh
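
Assuming passwordless ssh to the worker hosts is configured, bringing the cluster up might look like this sketch (start-all.sh is the convenience script that does both steps):

# Start the master on this host; it logs the spark://... URL to pass to --master
${SPARK_HOME}/sbin/start-master.sh

# Start one worker on each host listed in conf/slaves
${SPARK_HOME}/sbin/start-slaves.sh

# Or start master and workers together
${SPARK_HOME}/sbin/start-all.sh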


For example, in $SPARK_HOME/conf you have a slaves file containing the following:

# A Spark Worker will be started on each of the machines listed below.
rhes75
rhes75
rhes75
rhes75
rhes564

This is where start-slaves.sh will pick up the list of workers to start.

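How many cores and how much memory each of those workers offers is usually configured in conf/spark-env.sh. A sketch, assuming the standard standalone environment variables (the values are examples only):

# $SPARK_HOME/conf/spark-env.sh  (example values, adjust per host)
export SPARK_WORKER_CORES=4       # cores each worker offers to applications
export SPARK_WORKER_MEMORY=8g     # memory each worker offers to applications
export SPARK_WORKER_INSTANCES=2   # number of worker processes per host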
HTH

Dr Mich Talebzadeh

 

LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

 



On Fri, 14 Sep 2018 at 09:21, Aakash Basu <[hidden email]> wrote:
Hi,

What is the Spark cluster equivalent of standalone's local[N]? I mean, for the value N that we set as the parameter of local, which parameter takes it in cluster mode?

Thanks,
Aakash.

Re: Local vs Cluster

Apostolos N. Papadopoulos
Hi Aakash,

In the cluster you need to consider the total number of executors you are using. Please take a look at the following link for an introduction:


https://spoddutur.github.io/spark-notes/distribution_of_executors_cores_and_memory_for_spark_application.html
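
For example, on YARN the per-application parallelism is usually set with flags like the ones below; the numbers are placeholders, and the total number of parallel tasks is roughly executors × cores per executor (on a standalone master, use --total-executor-cores instead of --num-executors):

# Rough sketch with example numbers only
${SPARK_HOME}/bin/spark-submit \
    --master yarn \
    --num-executors 5 \
    --executor-cores 4 \
    --executor-memory 4g \
    --class com.example.MyApp \
    /path/to/my-app.jar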


regards,

Apostolos




On 14/09/2018 11:21 a.m., Aakash Basu wrote:
> Hi,
>
> What is the Spark cluster equivalent of standalone's local[N]? I mean,
> for the value N that we set as the parameter of local, which parameter
> takes it in cluster mode?
>
> Thanks,
> Aakash.

--
Apostolos N. Papadopoulos, Associate Professor
Department of Informatics
Aristotle University of Thessaloniki
Thessaloniki, GREECE
tel: ++0030312310991918
email: [hidden email]
twitter: @papadopoulos_ap
web: http://datalab.csd.auth.gr/~apostol


---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]