Back pressure not working on streaming

4 messages
Back pressure not working on streaming

JF Chen
I have set spark.streaming.backpressure.enabled to true and spark.streaming.backpressure.initialRate to 10.
Once my application started, it received 32 million messages in the first batch.
My application runs every 300 seconds, with 32 Kafka partitions. So what is the max rate if I set the initial rate to 10?

Thanks!
  

Regards,
Junfeng Chen
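For reference, a back-of-the-envelope sketch of what the first batch should contain if the initial rate were being honored (an assumption here: initialRate is treated as records/second across the whole stream, as the receiver-based rate limiter applies it; direct-stream semantics can differ):

```python
# Sketch of the expected first-batch size if initialRate is honored.
# Assumes initialRate is records/second across the whole stream (an
# assumption; receiver-based and direct streams interpret rates differently).

initial_rate = 10        # spark.streaming.backpressure.initialRate (records/sec)
batch_interval_s = 300   # batch duration in seconds
partitions = 32          # Kafka partitions, for the per-partition view

expected_first_batch = initial_rate * batch_interval_s
per_partition = expected_first_batch / partitions

print(f"expected first batch: {expected_first_batch} records")
print(f"per partition: {per_partition:.1f} records")
```

Under that reading the first batch should hold roughly 3,000 records, so a 32-million-record first batch suggests the cap was not applied at all.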
Re: Back pressure not working on streaming

Harsh
There is a separate property for the max rate; by default it is not set, so if you want to limit the max rate you should give that property a value.

Initial rate = 10 means it will pick only 10 records per receiver in the batch interval when you start the process.

Depending on the consumption rate, it will then increase the number of records consumed for processing in each batch.

However, I feel 10 is far too low a number for a 32-partition Kafka topic.



Regards
Harsh 
Happy New Year 
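The max-rate property mentioned above can be set alongside the backpressure flags. A minimal spark-submit sketch (the jar name and the rate value 1000 are placeholders, not values from this thread):

```shell
# Hypothetical spark-submit invocation; jar name and rate values are placeholders.
spark-submit \
  --conf spark.streaming.backpressure.enabled=true \
  --conf spark.streaming.backpressure.initialRate=10 \
  --conf spark.streaming.kafka.maxRatePerPartition=1000 \
  my-streaming-app.jar
```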

Re: Back pressure not working on streaming

Dillon Bostwick
In reply to this post by JF Chen
Unsubscribe

--

Dillon Bostwick
Solutions Engineer
Databricks
678-770-5344

Re: Back pressure not working on streaming

JF Chen
In reply to this post by Harsh
Yes, 10 is a very low value for testing the initial rate.
And from this article https://www.linkedin.com/pulse/enable-back-pressure-make-your-spark-streaming-production-lan-jiang/, it seems Spark back pressure is not available for DStreams?
So, is max rate per partition the only available back pressure solution for a Kafka DStream input?

Regards,
Junfeng Chen
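If maxRatePerPartition is indeed the operative cap for a direct Kafka DStream, the per-batch ceiling is simple arithmetic (the rate value 1000 below is an illustrative placeholder, not a recommendation):

```python
# Sketch: per-batch ceiling under spark.streaming.kafka.maxRatePerPartition,
# the per-partition records/second cap for a direct Kafka DStream.
# The rate of 1000 is an illustrative placeholder.

max_rate_per_partition = 1000  # records/sec per partition
partitions = 32
batch_interval_s = 300

batch_cap = max_rate_per_partition * partitions * batch_interval_s
print(f"max records per batch: {batch_cap:,}")
```

So with 32 partitions and a 300-second batch, a per-partition rate of 1000 would cap each batch at 9.6 million records.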

