Custom Metrics Source -> Sink routing

Dávid Szakállas
Is there a way to customize which metrics sources are routed to which sinks?

If I understood the docs correctly, there are some global switches for enabling sources, e.g. spark.metrics.staticSources.enabled, spark.metrics.executorMetricsSource.enabled.
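For reference, those switches are plain Spark configuration (both default to true in recent Spark versions, as far as I can tell), e.g. in spark-defaults.conf:

```properties
spark.metrics.staticSources.enabled          true
spark.metrics.executorMetricsSource.enabled  true
```

But they only enable or disable sources globally; they say nothing about which sink a source's metrics end up in.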

We would like to specify Source -> Sink routing on a per-namespace basis. The use case is the following: we want Prometheus monitoring for our Spark jobs. The large majority of metrics are exposed through the experimental Prometheus endpoint for direct scraping. However, we would like to expose a select set of metrics through a push gateway, because we want to guarantee that these metrics are collected: for example, a counter for the number of rows written to each inserted table. These are reported mostly at the end of a batch ingestion job, so a push model is a better fit. We created a dedicated DropWizard MetricRegistry for these custom metrics and use https://github.com/banzaicloud/spark-metrics to push them to the Pushgateway. However, pushing all the metrics to the gateway overloads it, and there is no need to duplicate them there.
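Our metrics.properties currently looks roughly like this (the sink class and property names are taken from the banzaicloud/spark-metrics README, and the gateway address is a placeholder, so treat the exact keys as assumptions):

```properties
# Most metrics: exposed via the built-in experimental Prometheus servlet
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus

# Custom batch metrics: pushed via the banzaicloud sink. We found no way
# here to restrict which sources feed this sink, so every metric gets
# pushed and the gateway is overloaded.
*.sink.prometheus.class=com.banzaicloud.spark.metrics.sink.PrometheusSink
*.sink.prometheus.pushgateway-address=pushgateway.example.com:9091
```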

Ideally, there would be a way to route the batch-like metrics to this sink and expose the rest of the gauges through the normal Prometheus sink.

Is this something that can currently be solved with configuration, or does it require custom code on the plugin side?
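For illustration, the plugin-side variant would boil down to something like the following prefix-based filter before pushing. This is a dependency-free sketch of the routing idea only; the class, prefixes, and method names are all hypothetical, not Spark or DropWizard API:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PrefixRouter {
    // Metrics whose names start with one of these (hypothetical) prefixes
    // go to the push gateway; everything else stays on the scrape endpoint.
    static final List<String> PUSH_PREFIXES = List.of("batch.", "ingest.");

    // Decide the destination sink for a single metric name.
    static String route(String metricName) {
        return PUSH_PREFIXES.stream().anyMatch(metricName::startsWith)
                ? "pushgateway" : "prometheus";
    }

    // Partition a flat name -> value metric map by destination sink,
    // so each sink only reports its own subset.
    static Map<String, Map<String, Object>> partition(Map<String, Object> metrics) {
        return metrics.entrySet().stream()
                .collect(Collectors.groupingBy(
                        e -> route(e.getKey()),
                        Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
    }
}
```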

Thanks,
David Szakallas