No. of active states?


Something Something
Is there a way to get the total no. of active states in memory at any given point in a Stateful Spark Structured Streaming job? We are thinking of using this metric for 'Auto Scaling' our Spark cluster.

Re: No. of active states?

Jungtaek Lim-2
If you're referring to the total number of "entries" across all state in a Structured Streaming job, that is provided via StreamingQueryListener.
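
As a rough illustration of reading that figure, the sketch below sums `numRowsTotal` over the `stateOperators` array of one progress report. It assumes the report is already available as a dict or JSON string (for example from `query.lastProgress` in PySpark, or from the progress JSON a listener receives). The helper name `total_state_rows` and the sample values are hypothetical.

```python
import json


def total_state_rows(progress):
    """Sum numRowsTotal over every stateful operator in one progress report."""
    # Accept either a parsed dict or the raw JSON string of the report.
    if isinstance(progress, str):
        progress = json.loads(progress)
    # Stateless queries report an empty (or missing) stateOperators array.
    operators = progress.get("stateOperators") or []
    return sum(int(op.get("numRowsTotal", 0)) for op in operators)


# Hypothetical, trimmed-down progress report, for illustration only.
sample_progress = {
    "batchId": 42,
    "numInputRows": 1000,
    "stateOperators": [
        {"numRowsTotal": 1200, "numRowsUpdated": 30},
        {"numRowsTotal": 800, "numRowsUpdated": 5},
    ],
}

print(total_state_rows(sample_progress))  # 2000
```

The value is refreshed once per micro-batch, so it is a per-batch snapshot rather than a continuously updated gauge.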


Hope this helps.

On Fri, May 8, 2020 at 3:26 AM Something Something <[hidden email]> wrote:
Is there a way to get the total no. of active states in memory at any given point in a Stateful Spark Structured Streaming job? We are thinking of using this metric for 'Auto Scaling' our Spark cluster.

Re: No. of active states?

Something Something
No. We are already capturing these metrics (e.g. numInputRows, inputRowsPerSecond).

I am talking about the "No. of States" held in memory at any given time.

On Thu, May 7, 2020 at 4:31 PM Jungtaek Lim <[hidden email]> wrote:
If you're referring to the total number of "entries" across all state in a Structured Streaming job, that is provided via StreamingQueryListener.


Hope this helps.


Re: No. of active states?

Jungtaek Lim-2
Have you looked through the metrics for state operators?

They have long reported the "total rows" of state, and starting from Spark 2.4 they also report additional metrics specific to HDFSBackedStateStoreProvider, including the overall estimated memory usage.
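
A minimal sketch of pulling those state-operator metrics out of a progress report follows. It assumes the report is a dict shaped like the progress JSON, with `numRowsTotal`, `memoryUsedBytes`, and a `customMetrics` map that may contain `stateOnCurrentVersionSizeBytes` when HDFSBackedStateStoreProvider is in use. The function name `state_memory_summary` and the sample numbers are hypothetical.

```python
def state_memory_summary(progress):
    """Aggregate row counts and memory estimates across all state operators."""
    summary = {"rows": 0, "memory_used_bytes": 0, "current_version_bytes": 0}
    for op in progress.get("stateOperators") or []:
        # customMetrics may be absent, depending on Spark version and provider.
        custom = op.get("customMetrics") or {}
        summary["rows"] += int(op.get("numRowsTotal", 0))
        summary["memory_used_bytes"] += int(op.get("memoryUsedBytes", 0))
        summary["current_version_bytes"] += int(
            custom.get("stateOnCurrentVersionSizeBytes", 0)
        )
    return summary


# Hypothetical report with two stateful operators, for illustration only.
sample_report = {
    "stateOperators": [
        {
            "numRowsTotal": 1200,
            "memoryUsedBytes": 500000,
            "customMetrics": {"stateOnCurrentVersionSizeBytes": 200000},
        },
        {"numRowsTotal": 800, "memoryUsedBytes": 300000},
    ]
}

print(state_memory_summary(sample_report))
```

For an auto-scaling signal, the row count and the memory estimate could be tracked together, since the memory figure is only an estimate.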



On Fri, May 8, 2020 at 11:30 AM Something Something <[hidden email]> wrote:
No. We are already capturing these metrics (e.g. numInputRows, inputRowsPerSecond).

I am talking about the "No. of States" held in memory at any given time.


Re: No. of active states?

Edgardo Szrajber
On Friday, May 8, 2020, 05:30:56 AM GMT+3, Something Something <[hidden email]> wrote:


No. We are already capturing these metrics (e.g. numInputRows, inputRowsPerSecond).

I am talking about the "No. of States" held in memory at any given time.
