Spark Streaming: Aggregating values across batches

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Spark Streaming: Aggregating values across batches

Something Something
We've a Spark Streaming job that calculates some values in each batch. What we need to do now is aggregate values across ALL batches. What is the best strategy to do this in Spark Streaming. Should we use 'Spark Accumulators' for this?
Reply | Threaded
Open this post in threaded view
|

Re: Spark Streaming: Aggregating values across batches

Tathagata Das
Use Structured Streaming. Its aggregation, by definition, is across batches.

On Thu, Feb 27, 2020 at 3:17 PM Something Something <[hidden email]> wrote:
We've a Spark Streaming job that calculates some values in each batch. What we need to do now is aggregate values across ALL batches. What is the best strategy to do this in Spark Streaming. Should we use 'Spark Accumulators' for this?