[PySpark] Understanding the times reported by PythonRunner

Valerie Hayot
Hi,

I'm investigating the performance of my application and am trying to
gain a better understanding of what the boot, init and finish times
reported by the PythonRunner signify.

This is an example of output I have obtained:

19/10/26 12:14:24 INFO PythonRunner: Times: total = 35412, boot = 443, init = 20858, finish = 14111
19/10/26 12:14:24 INFO PythonRunner: Times: total = 35476, boot = 409, init = 17866, finish = 17201
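
One thing that can be checked from the log alone: in both lines boot + init + finish equals total, so the three values appear to be consecutive phases of one measured interval. Here is a minimal sketch that parses the lines and verifies this. The regex and the helper name parse_times are my own for illustration and are not part of any Spark API.

```python
import re

# The two log lines quoted above, each joined back onto a single line.
LOG_LINES = [
    "19/10/26 12:14:24 INFO PythonRunner: Times: total = 35412, "
    "boot = 443, init = 20858, finish = 14111",
    "19/10/26 12:14:24 INFO PythonRunner: Times: total = 35476, "
    "boot = 409, init = 17866, finish = 17201",
]

# Hypothetical pattern written against the format seen in this log only.
TIMES_RE = re.compile(
    r"PythonRunner: Times: total = (-?\d+), boot = (-?\d+), "
    r"init = (-?\d+), finish = (-?\d+)"
)


def parse_times(line):
    """Return a dict of the four reported values, or None if no match."""
    match = TIMES_RE.search(line)
    if match is None:
        return None
    total, boot, init, finish = (int(g) for g in match.groups())
    return {"total": total, "boot": boot, "init": init, "finish": finish}


parsed = [parse_times(line) for line in LOG_LINES]

for times in parsed:
    parts = times["boot"] + times["init"] + times["finish"]
    # Report whether the three phases account for the whole total.
    print(times, "sum of parts == total:", parts == times["total"])
```

This only confirms the arithmetic. It does not say what work happens inside each phase, which is what I am trying to find out.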

I tried to compare these to the times reported in the UI (e.g. Scheduler
delay, GC time, Task deserialization time, etc.), but it remains unclear
which operations occur within boot, init and finish.

Thank you,

Valerie


