Writing a UDF that works with an Interval in PySpark

Daniel Haviv
Hi,
I'm trying to write a variant of date_add that accepts an interval as its second parameter, so that I can use the following syntax in Spark SQL:
select date_add(cast('1970-01-01' as date), interval 1 day)

but I'm getting the following error:
ValueError: (ValueError(u'Could not parse datatype: calendarinterval',), <function _parse_datatype_json_string at 0x7f823ed68f50>, (u'{"type":"struct","fields":[{"name":"","type":"date","nullable":true,"metadata":{}},{"name":"","type":"calendarinterval","nullable":true,"metadata":{}}]}',))
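For context, here is roughly what I'm attempting, as a minimal sketch assuming the Spark 2.x registration API (the name date_add_interval is just a placeholder):

from pyspark.sql import SparkSession
from pyspark.sql.types import DateType

spark = SparkSession.builder.getOrCreate()

# The body is essentially a placeholder: execution never reaches it,
# because the failure happens while PySpark parses the UDF's input
# schema and hits the calendarinterval type, which has no Python
# counterpart in pyspark.sql.types.
def date_add_interval(start, interval):
    return start + interval

spark.udf.register("date_add_interval", date_add_interval, DateType())

# Reproduces the ValueError above: the input schema
# {date, calendarinterval} cannot be parsed on the Python side.
spark.sql(
    "select date_add_interval(cast('1970-01-01' as date), interval 1 day)"
).show()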

Any ideas on how I can achieve this (or better yet, has someone already done this)?

Thank you.
Daniel