Remote Data Read Time

swastik mittal
I am working with a custom Spark listener library, and I cannot figure out
how to break a task down into its details. I only have a listener that runs
on task start, but I want to calculate the time my executor took to read
the input data from a remote data source for that task. Because Spark
evaluates lazily, I can't simply take timestamps before and after the read
instruction.

Spark monitoring gives me the total computation time and, within it, the
CPU time and the shuffle data fetch time. Subtracting those two from the
total leaves a remainder that combines the time to read from and write to
the remote data source, the time to read the RDD from main memory, and the
time to read and write the data in main memory.

So how do I extract only the read time from the remote data source?
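
For reference, the subtraction described above can be sketched as follows.
This is only an illustration, assuming the per-task metrics arrive as a
dictionary shaped like the "taskMetrics" object of the Spark monitoring REST
API, with executorRunTime and fetchWaitTime in milliseconds and
executorCpuTime in nanoseconds. The function name and the sample values are
hypothetical. The result is still the mixed remainder, not the remote read
time on its own.

```python
def residual_io_time_ms(task_metrics):
    """Return run time minus CPU time minus shuffle fetch wait, in ms.

    Assumed units: executorRunTime and fetchWaitTime in milliseconds,
    executorCpuTime in nanoseconds.
    """
    run_ms = task_metrics["executorRunTime"]
    # Convert CPU time from nanoseconds to milliseconds before subtracting.
    cpu_ms = task_metrics["executorCpuTime"] / 1_000_000
    # Tasks without a shuffle read may lack this entry; treat it as zero.
    fetch_ms = task_metrics.get("shuffleReadMetrics", {}).get("fetchWaitTime", 0)
    return run_ms - cpu_ms - fetch_ms


# Hypothetical sample values, not taken from a real job.
sample = {
    "executorRunTime": 1200,
    "executorCpuTime": 700_000_000,
    "shuffleReadMetrics": {"fetchWaitTime": 150},
}
remainder_ms = residual_io_time_ms(sample)
```

The remainder still lumps together remote reads and writes and in-memory
reads and writes, which is exactly the part I cannot separate.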



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]