The simplest syntax for Spark/Scala collect.foreach(println) in PySpark


Mich Talebzadeh
Hi

In Spark/Scala one can do:

scala> println ("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").collect.foreach(println)
Started at
[12/10/2020 22:29:19.19]

As far as I understand, foreach(println) is not special syntax here: collect returns an Array[Row], and foreach simply applies println to each element.

I can also get a more verbose result with show():

scala> println ("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").show()
Started at
+-----------------------------------------------------------------------------------------------+
|from_unixtime(unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss), dd/MM/yyyy HH:mm:ss.ss)|
+-----------------------------------------------------------------------------------------------+
|                                                                           12/10/2020 22:25:...|
+-----------------------------------------------------------------------------------------------+

In PySpark I can do the same with show():

print("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").show()

Started at
+-----------------------------------------------------------------------------------------------+
|from_unixtime(unix_timestamp(current_timestamp(), yyyy-MM-dd HH:mm:ss), dd/MM/yyyy HH:mm:ss.ss)|
+-----------------------------------------------------------------------------------------------+
|                                                                           12/10/2020 22:29:...|
+-----------------------------------------------------------------------------------------------+
If I want to make the output less verbose, I can do:

>>> print("\nStarted at")
Started at
>>> for x in spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").collect():
...   print(x[0])
...
12/10/2020 22:54:44.44
>>>

But to be honest, this looks ridiculous compared with the Scala one-liner :(

Any suggestions for improving it are appreciated!
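
One direction that might be tidier, as a rough sketch only: join the first field of every collected row and print the result in one go. To keep the snippet runnable without a Spark session, a plain list of tuples stands in for what collect() returns (PySpark Row objects are tuple subclasses, so positional indexing behaves the same). The names rows and format_first_column are made up for illustration, and the sample value is just copied from the output above.

```python
# Stand-in for spark.sql(...).collect(): a list of Row-like tuples.
# (Assumption: sample value copied from the session output in this post.)
rows = [("12/10/2020 22:54:44.44",)]


def format_first_column(rows):
    """Return the first field of every row, one value per line."""
    return "\n".join(str(row[0]) for row in rows)


print("\nStarted at")
print(format_first_column(rows))
```

With a live session, rows would be replaced by the result of spark.sql(...).collect(); something like print(*(r[0] for r in spark.sql(query).collect()), sep="\n") should then give one line per row without the explicit loop, where query is a hypothetical variable holding the SQL string.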

Thanks
