When can we expect multiple aggregations to be supported in Spark Structured Streaming?

kant kodali
Hi All,

When can we expect multiple aggregations to be supported in Spark Structured Streaming?

For example,

id | amount | my_timestamp
---|--------|--------------------------
 1 |      5 | 2018-04-01T01:00:00.000Z
 1 |     10 | 2018-04-01T01:10:00.000Z
 2 |     20 | 2018-04-01T01:20:00.000Z
 2 |     30 | 2018-04-01T01:25:00.000Z
 2 |     40 | 2018-04-01T01:30:00.000Z


I want to run a query like the one below, entirely in streaming fashion:

select sum(latest.amount)
from (
  -- max over a struct picks the row with the greatest my_timestamp
  -- and carries its amount along (the usual Spark SQL argmax trick)
  select max(struct(my_timestamp, amount)) as latest
  from events
  group by id, window(my_timestamp, '1 hour')
) latest_per_id

I just want the output to be the latest amount per id, summed up (10 + 40):

sum(amount)
------------------
 50
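
For reference, here is a minimal sketch of the same thing in the DataFrame API (the source path and schema are placeholders of my own; any streaming source with this schema would do). The second aggregation below is exactly what the analyzer rejects with "Multiple streaming aggregations are not supported":

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

val spark = SparkSession.builder.appName("multi-agg-sketch").getOrCreate()

val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("amount", LongType),
  StructField("my_timestamp", TimestampType)))

// placeholder source; in my case this is a real stream
val events = spark.readStream.schema(schema).json("/path/to/events")

// first aggregation: latest amount per id per 1-hour window
// (max over a struct orders by its first field, the timestamp)
val latestPerId = events
  .groupBy(col("id"), window(col("my_timestamp"), "1 hour"))
  .agg(max(struct(col("my_timestamp"), col("amount"))).as("latest"))

// second aggregation: the step that is rejected on a streaming DataFrame
val total = latestPerId.agg(sum("latest.amount"))

total.writeStream.outputMode("complete").format("console").start()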

I am trying to find a solution that does not use flatMapGroupsWithState or ORDER BY. I am using Spark 2.3.1 (custom built from master), and I have already tried a self-join, but that also fails with "multiple aggregations are not supported".
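
For context, the flatMapGroupsWithState route I am trying to avoid would look roughly like this (a sketch only, continuing from the events stream above; the case classes and the one-hour timeout horizon are my own assumptions):

import java.sql.Timestamp
import org.apache.spark.sql.functions.sum
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}
import spark.implicits._

case class Event(id: Long, amount: Long, my_timestamp: Timestamp)
case class Latest(id: Long, amount: Long)

// keep only the newest event per id; emit it once the event-time timeout fires
def latestPerGroup(id: Long, events: Iterator[Event],
                   state: GroupState[Event]): Iterator[Latest] = {
  if (state.hasTimedOut) {
    val last = state.get
    state.remove()
    Iterator(Latest(last.id, last.amount))
  } else {
    val newest = (state.getOption.iterator ++ events).maxBy(_.my_timestamp.getTime)
    state.update(newest)
    state.setTimeoutTimestamp(newest.my_timestamp.getTime, "1 hour") // assumed horizon
    Iterator.empty
  }
}

val latest = events.as[Event]
  .withWatermark("my_timestamp", "1 hour")
  .groupByKey(_.id)
  .flatMapGroupsWithState(OutputMode.Append, GroupStateTimeout.EventTimeTimeout)(latestPerGroup)

// with the stateful step emitting in append mode, this sum is the only
// remaining streaming aggregation, which (as I understand it) is allowed
val total = latest.agg(sum($"amount"))

The point of the timeout is that the stateful step emits each id at most once per horizon, so only one true aggregation remains downstream. But this feels much more fragile than the plain SQL above, hence my question.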

Thanks!