[Spark-Core] Long scheduling delays (1+ hour)

[Spark-Core] Long scheduling delays (1+ hour)

bsikander
We are facing an issue with very long scheduling delays in Spark (up to 1+
hours).
We are using Spark-standalone. The data is being pulled from Kafka.

Any help would be much appreciated.

I have attached the screenshots.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/1-stats.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/4.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/3.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/2.png>







--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]

Re: [Spark-Core] Long scheduling delays (1+ hour)

Biplob Biswas
Hi,

This has to do with your batch duration and processing time. As a rule, the processing time of your data should be lower than the batch duration. As I can see from your screenshots, your batch duration is 10 seconds but your processing time is mostly more than a minute; the backlog adds up, and you end up with a lot of scheduling delay.
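To illustrate how that gap accumulates, here is a minimal sketch in plain Python (not Spark code). It assumes batches are processed strictly one at a time and that every batch takes the same time to process; the function name `scheduling_delays` and the 60-second figure are illustrative choices based on the numbers mentioned above, not measurements.

```python
def scheduling_delays(batch_interval_s, processing_times_s):
    """Return how long (in seconds) each batch waits before it starts.

    Batch i is created at i * batch_interval_s. Batches run one at a time,
    so a batch starts only once it exists and the previous one has finished.
    """
    delays = []
    previous_finish = 0
    for i, processing_time in enumerate(processing_times_s):
        created_at = i * batch_interval_s
        start = max(created_at, previous_finish)
        delays.append(start - created_at)
        previous_finish = start + processing_time
    return delays


# 10 s batch interval, about 60 s of processing per batch (assumed constant).
slow = scheduling_delays(10, [60] * 73)
print(slow[:4])   # [0, 50, 100, 150]: each batch waits 50 s longer than the last
print(slow[-1])   # 3600: the 73rd batch already waits a full hour

# Same interval, but processing finishes inside the batch window.
healthy = scheduling_delays(10, [8] * 73)
print(max(healthy))  # 0: no backlog builds up
```

Under these assumptions the delay grows by 50 seconds per batch, so a one-hour scheduling delay is reached after only about 12 minutes of sustained 60-second processing.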

Check why it takes 1 minute to process 100 records and fix that logic. I also see batches with a higher number of events that sometimes take less processing time. Fix the code logic and the delay should go away.
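One minute for 100 records is roughly 0.6 seconds per record, so one way to narrow this down is to time each step of the per-record logic on a small sample outside Spark. The sketch below is plain Python under assumed names: `StageTimer`, `parse` and `enrich` are hypothetical stand-ins, not part of the job discussed here.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulate wall-clock time per named stage of a record handler."""

    def __init__(self):
        self.totals = defaultdict(float)

    def run(self, stage, func, *args):
        # Time a single call and add it to the running total for this stage.
        start = time.perf_counter()
        result = func(*args)
        self.totals[stage] += time.perf_counter() - start
        return result

    def slowest(self):
        # Name of the stage with the largest accumulated time.
        return max(self.totals, key=self.totals.get)


# Hypothetical stand-ins for the real per-record steps.
def parse(record):
    return record.strip().split(",")


def enrich(fields):
    time.sleep(0.002)  # pretend this is a slow external lookup
    return fields + ["enriched"]


timer = StageTimer()
records = ["a,1", "b,2", "c,3"] * 10  # stand-in for one small sample batch
for record in records:
    fields = timer.run("parse", parse, record)
    timer.run("enrich", enrich, fields)

# Report stages from slowest to fastest, with the average cost per record.
for stage, seconds in sorted(timer.totals.items(), key=lambda kv: -kv[1]):
    per_record_ms = seconds / len(records) * 1000
    print(f"{stage}: {seconds:.4f}s total, {per_record_ms:.2f} ms/record")
```

Running this locally on sample records only shows which step dominates; timings on the cluster may differ, for example when a step waits on a remote service.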

Thanks & Regards
Biplob Biswas


On Wed, Nov 7, 2018 at 11:08 AM bsikander <[hidden email]> wrote:
We are facing an issue with very long scheduling delays in Spark (up to 1+
hours).
We are using Spark-standalone. The data is being pulled from Kafka.

Any help would be much appreciated.

I have attached the screenshots.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/1-stats.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/4.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/3.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/2.png>








Re: [Spark-Core] Long scheduling delays (1+ hour)

bsikander
Actually, our job runs fine for 17-18 hours and this behavior just suddenly
starts happening after that.

We found the following ticket, which describes exactly what is happening in our
Kafka cluster as well:
WARN Failed to send SSL Close message
(org.apache.kafka.common.network.SslTransportLayer)

You also replied to this ticket with a problem very similar to ours.

What fix did you apply to avoid these SSL Close exceptions and the long delays in
the Spark job?




Re: [Spark-Core] Long scheduling delays (1+ hour)

bsikander
In reply to this post by Biplob Biswas
Could you please give some feedback?




Re: [Spark-Core] Long scheduling delays (1+ hour)

bsikander
In reply to this post by bsikander
Forgot to add the link to the ticket:
https://jira.apache.org/jira/browse/KAFKA-5649


