Exporting all executor metrics in Prometheus format in a K8s cluster
I’ve been trying to set up monitoring for our Spark 3.0.1 cluster running in K8s. We use Prometheus as our monitoring system, and we need both executor and driver metrics. My initial approach was to use the following configuration to expose both sets of metrics through the Spark UI:
I am not sure if these are available on the driver at all, so I’ve been thinking of scraping the executors directly instead. PrometheusServlet seems to be meant for this purpose, but the executors don’t run a web server. I also can’t find a configuration setting that opens a port on the executor container so that it can be scraped. So what I have in mind right now is writing a custom sink that exports the metrics in Prometheus format to a local file, and running a sidecar container with an nginx that serves that static file. Prometheus could then scrape the nginx endpoint. Am I overcomplicating this? Is there a simpler approach?
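To make the file-based idea concrete, here is a minimal sketch of what the "write metrics to a local file" half might look like. It is plain Python rather than a real Spark sink (an actual sink would be JVM code plugged into Spark's metrics system), and every name in it (`render_prometheus`, `write_atomically`, the `executor_id` label, the `metrics.prom` file name) is hypothetical. It only shows two details such a sink would likely need: sanitizing Spark's dotted metric names into valid Prometheus names, and replacing the file atomically so nginx never serves a half-written scrape.

```python
import os
import tempfile


def _escape_label(value):
    # Escape backslash, double quote and newline, as the Prometheus text format requires.
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_prometheus(metrics, labels):
    """Render a {name: value} dict as gauge samples in Prometheus text format.

    `metrics` and `labels` are hypothetical inputs for this sketch; a real sink
    would read values from the metrics registry instead.
    """
    label_str = ",".join(
        '{}="{}"'.format(key, _escape_label(val)) for key, val in sorted(labels.items())
    )
    label_part = "{" + label_str + "}" if label_str else ""
    lines = []
    for name, value in sorted(metrics.items()):
        # Replace characters such as '.' and '-' that Prometheus metric names do not allow.
        safe = "".join(
            c if (c.isascii() and c.isalnum()) or c in "_:" else "_" for c in name
        )
        lines.append("# TYPE {} gauge".format(safe))
        lines.append("{}{} {}".format(safe, label_part, value))
    return "\n".join(lines) + "\n"


def write_atomically(path, text):
    """Write `text` to a temp file in the same directory, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # mkstemp creates the file as 0600; loosen it so a sidecar running as
        # another user can read the file.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Calling `write_atomically("metrics.prom", render_prometheus({"executor.jvmGCTime": 12}, {"executor_id": "1"}))` on a timer would keep the file fresh. The file would have to live on a volume shared between the executor container and the nginx sidecar, which is an assumption about the pod layout rather than something Spark sets up for you.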