I am currently working on visualizing the behavior of tasks running on Spark, and I have run into two problems that I have not been able to solve.
1. I set SPARK_WORKER_CORES to 2 in my SPARK_HOME/conf/spark-env.sh file. But when I start the cluster, both the web UI and my application's processing information still show more than two cores in use on some nodes. I am running on my lab cluster, not EC2. Have you encountered this problem before?
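One common cause of this (a guess on my part, not something confirmed by your description): in standalone mode, spark-env.sh is read by each worker daemon locally at start-up, so the setting only takes effect if the file is present on every worker node and the cluster is restarted afterwards. Also note that SPARK_WORKER_CORES caps the cores one worker *offers*; if SPARK_WORKER_INSTANCES launches multiple workers per machine, the machine-level total is the product of the two. A minimal spark-env.sh fragment:

```shell
# $SPARK_HOME/conf/spark-env.sh -- must exist on EVERY worker node,
# and the standalone cluster must be restarted after editing it.
export SPARK_WORKER_CORES=2

# If more than one worker daemon runs per machine, each one offers
# SPARK_WORKER_CORES cores, so the machine total is 2 x this value.
# export SPARK_WORKER_INSTANCES=1
```

If the UI still reports extra cores after propagating the file and restarting, it may be worth checking whether stale worker daemons from a previous start are still registered with the master.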
2. I set the number of workload "slices" manually in the example application SparkPi. When a node has more than two cores, I would like to find out which slice (task) is running on which core. Do you know whether Spark provides a way to do that?
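As far as I know, Spark does not pin tasks to physical cores: the scheduler assigns each task to an executor thread, and the OS decides which core runs that thread. So the closest you can get from inside Spark is the slice-to-host-and-thread mapping. A sketch of that, assuming a Spark version that provides `TaskContext.get()` (the object name `SliceMapping` and the slice count are illustrative, mirroring SparkPi's `slices` argument):

```scala
import java.net.InetAddress

import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object SliceMapping {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SliceMapping"))
    val slices = 8  // plays the same role as the "slices" argument in SparkPi

    // Run one trivial task per slice and record, from inside the task,
    // where it executed. Thread name identifies the executor thread,
    // not a physical core -- Spark has no core-level placement info.
    val placement = sc.parallelize(1 to slices, slices).map { _ =>
      val ctx = TaskContext.get()
      (ctx.partitionId(),                     // slice / task index
       InetAddress.getLocalHost.getHostName,  // worker node that ran it
       Thread.currentThread().getName)        // executor thread
    }.collect()

    placement.foreach { case (slice, host, thread) =>
      println(s"slice $slice ran on $host in thread $thread")
    }
    sc.stop()
  }
}
```

If you truly need per-core attribution, you would have to correlate these thread names with OS-level tools (e.g. per-thread CPU accounting) outside of Spark.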