Running Spark in Local Mode

Running Spark in Local Mode

FreePeter
Hi,

I am trying to use Spark for my own applications. I am currently profiling their performance in local mode, and I have a couple of questions:

1. When I set spark.master to local[N], it means Spark will use up to N worker *threads* on the single machine. Is this equivalent to saying there are N worker *nodes*, as described in the cluster overview (http://spark.apache.org/docs/latest/cluster-overview.html)?
(So is each worker node/thread viewed separately, and can it have its own executor for each application? A small sketch of the setup I mean follows question 2.)

2. Is there any way to set the maximum memory used by each worker thread/node? I can only find a setting for the memory of each executor (spark.executor.memory).
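
For reference, here is a minimal sketch of the kind of setup I mean (a Scala app using SparkConf; the app name and all values are just illustrative):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch of a local-mode setup: "local[4]" asks Spark to use up to
// 4 worker threads inside this single JVM (the N from question 1).
val conf = new SparkConf()
  .setAppName("LocalProfiling")            // hypothetical app name
  .setMaster("local[4]")                   // N = 4 threads
  .set("spark.executor.memory", "2g")      // the executor setting from question 2

val sc = new SparkContext(conf)

// A toy job, just enough to keep the threads busy while profiling.
val counts = sc.parallelize(1 to 1000000, 4).map(_ % 10).countByValue()
println(counts)

sc.stop()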

Thank you!

Re: Running Spark in Local Mode

mrm
Hi,

Did you resolve this? I have the same questions.

Re: Running Spark in Local Mode

oubrik
In reply to this post by FreePeter
Hi
For more information about nodes, executors, cores, and threads in Spark, see this link:
http://0x0fff.com/spark-architecture/


Regards

Oubrik brahim

Re: Running Spark in Local Mode

rusty
In reply to this post by FreePeter
Hi FreePeter,
   In Spark, all the sample programs use local[N]; it means N threads run to complete the job (a job is divided into smaller tasks that run on executors). Worker nodes are the machines that can run application code in a cluster, and tasks run as threads on the worker nodes. In local[N] mode there is no cluster at all, so everything runs inside a single JVM with N threads acting as the task slots.

(N threads != N worker nodes)
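
A quick way to see this from spark-shell (a rough sketch; it assumes the shell was started with --master local[4], and the exact thread names vary by version):

// sc is predefined in spark-shell.
sc.master                // "local[4]"
sc.defaultParallelism    // 4, one task slot per thread
// Each task runs as a thread inside this single JVM, not on a separate host:
sc.parallelize(1 to 4, 4).map(_ => Thread.currentThread.getName).collect().foreach(println)
// prints something like "Executor task launch worker-0" ... "worker-3"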

Your second question is not entirely clear, but in local mode spark.executor.memory won't have any effect, because the worker lives within the driver JVM process that you start when you launch spark-shell, and the default memory used for that is 512M. You can increase that by setting spark.driver.memory to something higher, for example 2g.
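
For example (just a sketch), you can pass it when starting the shell, e.g. spark-shell --driver-memory 2g --master local[4]. Note that spark.driver.memory has to be given before the driver JVM starts (on the command line or in spark-defaults.conf); setting it from inside an already running shell won't resize the heap. You can then check it from the REPL:

// sc is predefined in spark-shell.
sc.getConf.getOption("spark.driver.memory")      // Some("2g") if it was passed at launch
Runtime.getRuntime.maxMemory / (1024L * 1024L)   // approximate driver heap size in MB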