Fwd: Some Questions & Doubts regarding Spark process
I am attaching a screenshot of the Spark web UI for my job; please have a look at it.
1) For a single map operation, why does it show multiple completed stages, all with the same information?
2) As you can see, the number of completed workers is higher than the maximum number of workers (2931/2339). Can you please tell me why it is shown like that?
3) How is a stage defined in Spark? In my code, after the first map with groupByKey and filter, I run one more map, then a filter, then a count. Spark combined these three steps into a single stage and named it "count" (you can see this in the attached screenshot). Can you please explain how it combines steps into stages, and what the logic or idea behind this is?
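
To make question 3 easier to follow, here is a rough sketch of the order of operations described above. It is only an illustration: it uses plain Python collections as a stand-in for the Spark calls, and the input records and both filter conditions are made-up placeholders, not the actual job.

```python
from collections import defaultdict

# Hypothetical input records; the real data set is not shown in this post.
records = ["a 1", "b 2", "a 3", "c 4", "b 5"]

# First map: turn each record into a (key, value) pair.
pairs = []
for record in records:
    key, value = record.split()
    pairs.append((key, int(value)))

# groupByKey: collect all values that share a key.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

# First filter (placeholder condition): keep keys with more than one value.
filtered = {key: values for key, values in grouped.items() if len(values) > 1}

# Second map: reduce each group to a single number (here, the sum).
mapped = [(key, sum(values)) for key, values in filtered.items()]

# Second filter (placeholder condition): keep totals above 5.
kept = [(key, total) for key, total in mapped if total > 5]

# Count: the final action.
result = len(kept)
print(result)
```

In the screenshot, the second map, the second filter and the count are the steps that appear together under the name "count".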