Spark on kubernetes memory spike and spark.kubernetes.memoryOverheadFactor not working
We are running Spark on Kubernetes through spark-on-k8s-operator. Our application performs multiple updateStateByKey operations. On investigation we found that the Spark application consumes more memory than expected. Since spark-on-k8s-operator does not provide an option to segregate Spark JVM memory from pod memory, pod memory usage reaches 100% and the container's OS kernel kills the executor with an OOM error.
So we are trying to limit the memory overhead with spark.kubernetes.memoryOverheadFactor, which the Spark documentation describes as follows:
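For context, here is a minimal sketch of how the setting can be passed through the operator's sparkConf block (the app name, sizes, and the 0.2 value below are placeholders for illustration, not our actual manifest):

```yaml
# Simplified SparkApplication manifest for spark-on-k8s-operator.
# All names and values are illustrative placeholders.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-streaming-app
spec:
  sparkConf:
    # Cap non-JVM (overhead) memory at 20% of the executor memory
    "spark.kubernetes.memoryOverheadFactor": "0.2"
  executor:
    instances: 2
    memory: "4g"
```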
"This sets the Memory Overhead Factor that will allocate memory to non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, and various system processes. For JVM-based jobs this value will default to 0.10 and 0.40 for non-JVM jobs. This is done as non-JVM tasks need more non-JVM heap space and such tasks commonly fail with 'Memory Overhead Exceeded' errors. This preempts this error with a higher default."
But to no avail. Is there something we are missing? Is there a workaround for this problem?