Show function name in Logs for PythonUDFRunner

3 messages

Show function name in Logs for PythonUDFRunner

AbdealiJK
When I run Python UDFs with pyspark, I get multiple logs where it says:

18/11/22 01:51:59 INFO python.PythonUDFRunner: Times: total = 44, boot = -25, init = 67, finish = 2

I am wondering whether I can easily identify from these logs which of my Python UDFs the timing information is for. I have about a hundred, so it's quite difficult for me to tell them apart.


Re: Show function name in Logs for PythonUDFRunner

Eike von Seggern
Hi,

Abdeali Kothari <[hidden email]> wrote on Thu, 22 Nov 2018 at 10:04:
When I run Python UDFs with pyspark, I get multiple logs where it says:

18/11/22 01:51:59 INFO python.PythonUDFRunner: Times: total = 44, boot = -25, init = 67, finish = 2

I am wondering whether I can easily identify from these logs which of my Python UDFs the timing information is for. I have about a hundred, so it's quite difficult for me to tell them apart.

If the log is created using Python's logging module, it should be possible: log records carry a `funcName` attribute that can be included in the log format (https://docs.python.org/3.6/library/logging.html#logrecord-attributes). But I do not know how to configure the log format for PySpark.

 HTH

Eike

Re: Show function name in Logs for PythonUDFRunner

AbdealiJK
My understanding is that this log line is printed by PythonRunner.scala in the Spark code base, not by Python's logging module. I may be mistaken.
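If the log does come from the Scala side, one possible workaround (an assumption on my part, not something Spark provides) is to measure timing from within the Python UDF itself, so each UDF logs under its own name. The `timed_udf` decorator below is hypothetical and adds per-call overhead, so it is a sketch rather than a drop-in solution:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("udf-timing")

def timed_udf(func):
    """Hypothetical workaround: wrap a UDF body so every call logs the
    wrapped function's name and duration from the Python side,
    independently of Spark's PythonUDFRunner log line."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        log.info("UDF %s took %.3f ms", func.__name__, elapsed_ms)
        return result
    return wrapper

@timed_udf
def normalize(s):
    # Example UDF body; the wrapper logs "UDF normalize took ... ms".
    return s.strip().lower()
```

The wrapped function can then be registered as a UDF in the usual way; `functools.wraps` preserves the original function name so the log stays attributable.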

On Thu, Nov 22, 2018, 17:54 Eike von Seggern <[hidden email]> wrote:
Hi,

Abdeali Kothari <[hidden email]> wrote on Thu, 22 Nov 2018 at 10:04:
When I run Python UDFs with pyspark, I get multiple logs where it says:

18/11/22 01:51:59 INFO python.PythonUDFRunner: Times: total = 44, boot = -25, init = 67, finish = 2

I am wondering whether I can easily identify from these logs which of my Python UDFs the timing information is for. I have about a hundred, so it's quite difficult for me to tell them apart.

If the log is created using Python's logging module, it should be possible: log records carry a `funcName` attribute that can be included in the log format (https://docs.python.org/3.6/library/logging.html#logrecord-attributes). But I do not know how to configure the log format for PySpark.

 HTH

Eike