[SPARK on MESOS] Avoid re-fetching Spark binary

[SPARK on MESOS] Avoid re-fetching Spark binary

Tien Dat
Dear all,

We are running Spark with Mesos as the master for resource management.
In our cluster, there are jobs that require a very short response time (near
real-time applications), usually around 3-5 seconds.

In order for Spark to execute with Mesos, one has to specify the
SPARK_EXECUTOR_URI configuration, which defines the location from which Mesos
fetches the Spark binary every time it launches a new job.
We noticed that the fetching and extraction of the Spark binary repeats
every time we run a job, even though the binary is basically the same. More
importantly, fetching and extracting this file can add 2-3 seconds of
latency, which is fatal for our near-real-time application. Besides, after
running many Spark jobs, the Spark binary tarballs accumulate and occupy a
large amount of disk space.
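
For reference, our submission looks roughly like this (a PySpark sketch; the
master URL and tarball path are illustrative):

    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setAppName("near-real-time-job")
        .setMaster("mesos://zk://mesos-master:2181/mesos")
        # Equivalent to SPARK_EXECUTOR_URI: Mesos fetches and extracts
        # this tarball every time it launches an executor for a new job.
        .set("spark.executor.uri",
             "file:///opt/spark/spark-2.3.1-bin-hadoop2.7.tgz")
    )
    sc = SparkContext(conf=conf)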

As a result, we wonder if there is a workaround to avoid this fetching and
extracting process, given that the Spark binary is available locally on each
of the Mesos agents.

Please don't hesitate to ask if you need any further information.
Thank you in advance.

Best regards



Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Timothy Chen
If it's available locally on each host, then don't specify a remote URL but a local file URI instead.

We added a fetcher cache to Mesos a while ago, and I believe there is integration in the Spark framework as well if you look at the documentation. With the fetcher cache enabled, the Mesos agent will cache the same remote binary across launches.
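
For example, something along these lines (a sketch; the path is illustrative,
and spark.mesos.fetcherCache.enable assumes Spark 2.1 or later):

    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster("mesos://zk://mesos-master:2181/mesos")
        # A file:// URI makes the agent copy from local disk instead of
        # downloading over the network.
        .set("spark.executor.uri",
             "file:///opt/spark/spark-2.3.1-bin-hadoop2.7.tgz")
        # Let the Mesos agent reuse the fetched archive across launches.
        .set("spark.mesos.fetcherCache.enable", "true")
    )
    sc = SparkContext(conf=conf)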

Tim

Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Tien Dat
Thank you for your answer.

The thing is, I actually pointed to a local binary file. Mesos then locally
copied the binary file to a specific folder in /var/lib/mesos/... and
extracted it every time it launched a Spark executor. With the fetcher
cache, the copy time is reduced, but the reduction is not much, since the
file is stored locally anyway.
The step that takes the most time is the extraction.
Finally, since Mesos makes a new folder for extracting the Spark binary each
time a new Spark job runs, the disk usage increases gradually.

Therefore, our expectation is to have Spark running on Mesos without this
binary extraction, as well as without storing the same binary every time a
new Spark job runs.

Does that make sense to you? And do you have any idea how to deal with this?

Best






Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Timothy Chen
Got it. Then you can have an extracted Spark directory on each host at the same location, and don't specify SPARK_EXECUTOR_URI. Instead, set spark.mesos.executor.home to that directory.

This should effectively do what you want: it avoids the fetching and extracting and just executes the command.
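
For example (a sketch; /opt/spark is illustrative and must hold an
already-extracted Spark distribution at the same path on every agent):

    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster("mesos://zk://mesos-master:2181/mesos")
        # No spark.executor.uri / SPARK_EXECUTOR_URI, so nothing is fetched
        # or extracted; executors run straight out of this pre-installed
        # directory on each agent.
        .set("spark.mesos.executor.home", "/opt/spark")
    )
    sc = SparkContext(conf=conf)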

Tim

Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Tien Dat
Dear Timothy,

It works like a charm now.

BTW (don't judge me if I am asking too much :-)), the latency to start a Spark job
is around 2-4 seconds, unless I am not aware of some awesome optimization in
Spark. Do you know if the Spark community is working on reducing this latency?

Best




Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Timothy Chen
I know there are some community efforts shown at Spark Summits before, mostly around reusing the same Spark context for multiple “jobs”.

I don’t think reducing Spark job startup time is a community priority afaik.

Tim

Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Mark Hamstra
In reply to this post by Tien Dat
The latency to start a Spark Job is nowhere close to 2-4 seconds under typical conditions. You appear to be creating a new Spark Application every time instead of running multiple Jobs in one Application.
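
To illustrate the distinction, a minimal sketch of the long-running pattern
(timings indicative, not measured):

    from pyspark import SparkContext

    # One long-lived Application: the SparkContext (and its executors) is
    # created once, paying the multi-second startup cost a single time.
    sc = SparkContext(appName="long-running-service")

    def handle_request(n):
        # Each action submits a new Job inside the same Application;
        # per-Job scheduling overhead is typically milliseconds.
        return sc.parallelize(range(n)).map(lambda x: x * x).sum()

    print(handle_request(1000))
    print(handle_request(5000))  # no new JVM, no new executor startup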


Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Mark Hamstra
In reply to this post by Timothy Chen
Essentially correct. The latency to start a Spark Job is nowhere close to 2-4 seconds under typical conditions. Creating a new Spark Application every time instead of running multiple Jobs in one Application is not going to lead to acceptable interactive or real-time performance, nor is that an execution model that Spark is ever likely to support in trying to meet low-latency requirements. As such, reducing Application startup time (not Job startup time) is not a priority. 


Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Mark Hamstra
In reply to this post by Tien Dat
It's been done many times before by many organizations. Use Spark Job Server or Livy or create your own implementation of a similar long-running Spark Application. Creating a new Application for every Job is not the way to achieve low-latency performance.
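
With Livy, for instance, the pattern looks roughly like this (a sketch; the
endpoint is illustrative, and session polling/error handling are omitted):

    import requests  # third-party HTTP client; assumes a running Livy server

    LIVY = "http://livy-host:8998"  # illustrative endpoint

    # One long-running interactive session, i.e. one Spark Application.
    session = requests.post(f"{LIVY}/sessions", json={"kind": "pyspark"}).json()

    # Each statement then runs as a Job on the session's existing
    # SparkContext, skipping Application startup entirely. (Real code
    # should poll the statement URL until the result is ready.)
    requests.post(
        f"{LIVY}/sessions/{session['id']}/statements",
        json={"code": "sc.parallelize(range(1000)).sum()"},
    )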

On Tue, Jul 10, 2018 at 4:18 AM <[hidden email]> wrote:
Dear,

Our jobs are triggered by users on demand, and each new job is submitted to the Spark server via a REST API. The 2-4 seconds of latency is mainly due to the initialization of the SparkContext every time a new job is submitted, as you have mentioned.

If you are aware of a way to avoid this initialization, could you please share it? That would be perfect for our case.

Best
Tien Dat


Re: [SPARK on MESOS] Avoid re-fetching Spark binary

Tien Dat
Thanks for your suggestion.

I have been checking Spark-jobserver. Just an off-topic question about this
project: does the Apache Spark project have any support for, or connection to,
the Spark-jobserver project? I noticed that they do not have a release for the
newest version of Spark (e.g., 2.3.1).

As you mentioned, many organizations and individuals have been using it,
so wouldn't it be better to have it developed within the Spark community?

Best
Tien Dat


