Measure throughput streaming

Measure throughput streaming

aecc
Hi,

In Spark Streaming:

I'm trying to find a way to measure how many entries can be processed per unit of time. Is there something that can help me with that? And how can I find out the number of entries received in each RDD?

Regards
Re: Measure throughput streaming

Tathagata Das
Hello, 

For the number of records received per second, you could use something like the following to count the records in each batch, and then divide that count by your batch interval (in seconds).

// batchInterval: your batch duration in seconds (e.g. val batchInterval = 5,
// matching the Duration you passed to the StreamingContext).
yourDStream.foreachRDD { rdd =>
  val count = rdd.count()
  // Divide as Double so rates below one record/second are not truncated to zero.
  println("Current rate = " + (count.toDouble / batchInterval) + " records / second")
}
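For the second question (the number of entries in each RDD), the DStream API also offers `count()`, which transforms the stream into a DStream of single-element RDDs holding each batch's record count. A minimal sketch of both measurements together, assuming a local socket source on port 9999 and a 5-second batch interval (both hypothetical, for illustration only):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ThroughputSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("ThroughputSketch")
    val batchSeconds = 5  // hypothetical batch interval, in seconds
    val ssc = new StreamingContext(conf, Seconds(batchSeconds))

    // Hypothetical input source; substitute your own DStream here.
    val lines = ssc.socketTextStream("localhost", 9999)

    // count() emits one single-element RDD per batch containing the batch's size.
    lines.count().foreachRDD { rdd =>
      val n = rdd.first()  // the per-batch record count (0 for an empty batch)
      println(s"Batch size = $n records, rate = ${n.toDouble / batchSeconds} records/second")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that `count()` computes the size per batch, not per partition; if the stream uses multiple receivers, this is the total across all of them for that batch.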



On Tue, Feb 11, 2014 at 9:52 AM, aecc <[hidden email]> wrote:
Hi,

In Spark Streaming:

I'm trying to find a way to measure how many entries can be processed per unit of time. Is there something that can help me with that? And how can I find out the number of entries received in each RDD?

Regards



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Measure-throughput-streaming-tp1400.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.