Spark structured streaming aggregation within microbatch


Koert Kuipers
I have a streaming DataFrame where I insert a UUID into every row, then join it with a static DataFrame (after which the uuid column is no longer unique), then group by uuid and do a simple aggregation.

So I know all rows with the same uuid are guaranteed to be in the same micro-batch, correct? How do I express that in Structured Streaming? I don't need an aggregation across batches.
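For context, here is a minimal sketch of the pipeline I mean, using `foreachBatch` as one possible way to scope the aggregation to a single micro-batch (the rate source, the `key`/`weight` columns, and the sum are illustrative assumptions, not my real job). Inside `foreachBatch` the micro-batch arrives as a plain DataFrame, so the `groupBy` naturally aggregates within the batch and keeps no state across batches:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object PerBatchAgg {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("per-batch-agg")
      .master("local[*]")
      .getOrCreate()

    // Illustrative streaming source: Spark's built-in rate source.
    val streamDf = spark.readStream.format("rate").load()
      .withColumn("uuid", expr("uuid()"))      // one uuid per input row
      .withColumn("key", col("value") % 10)    // hypothetical join key

    // Illustrative static lookup table; joining on key fans out
    // each uuid into multiple rows.
    val staticDf = spark.range(0, 100)
      .withColumn("key", col("id") % 10)
      .withColumn("weight", col("id") * 2)

    val query = streamDf.join(staticDf, "key")
      .writeStream
      .foreachBatch { (batch: DataFrame, batchId: Long) =>
        // Within foreachBatch the micro-batch is a plain (non-streaming)
        // DataFrame, so this groupBy is scoped to this batch only.
        batch.groupBy("uuid")
          .agg(sum("weight").as("total"))
          .show()
      }
      .start()

    query.awaitTermination()
  }
}
```

The alternative, a streaming `groupBy` on the original query, would maintain aggregation state across micro-batches, which is exactly what I'm trying to avoid here.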

Thanks!