The simplest option is to create UDFs for these different functions and then use a CASE statement (or similar) in SQL to choose between them. But this is low tech. If you have conditions based on record values that are even more granular, why not use a single UDF and let the conditions inside it handle the dispatch?
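As a rough illustration of the single-UDF idea, here is a minimal PySpark sketch. Everything named in it is an assumption made for the example: the handler functions, the `HANDLERS` table, the `Result` column, and the premise that `Points` is numeric are not from the original question. The Spark imports sit inside `add_result_column` so the dispatch logic can be exercised without a Spark installation.

```python
# Hypothetical per-group functions; the thread does not say what they compute.
def score_group_1(points):
    return points * 2.0


def score_group_2(points):
    return points + 10.0


# Hypothetical dispatch table keyed by Group_Id.
HANDLERS = {1: score_group_1, 2: score_group_2}


def dispatch(group_id, points):
    """Choose the function from the record's own values.

    Returns None when no handler is registered or the input is null.
    """
    handler = HANDLERS.get(group_id)
    if handler is None or points is None:
        return None
    return handler(points)


def add_result_column(df):
    """Wrap dispatch in one UDF and attach its output as a new column."""
    # Imported lazily so dispatch() can be tested without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    dispatch_udf = F.udf(dispatch, DoubleType())
    return df.withColumn(
        "Result", dispatch_udf(F.col("Group_Id"), F.col("Points"))
    )
```

Calling `add_result_column(df)` on a DataFrame with `Group_Id` and `Points` columns would then return the same rows with the extra column. Adding a new group means registering one more entry in the table rather than editing the SQL.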
That said, I don't think UDFs perform all that well unless you write them in Scala.
It would be interesting to hear from the group whether there are other scalable options that are not RDD based.
On Sun, Sep 30, 2018 at 7:31 PM dimitris plakas <[hidden email]> wrote:
I am trying to split a DataFrame into partitions, and I want to apply a custom function to every partition. More precisely, I have a DataFrame like the one below:
Group_Id | Id  | Points
1        | id1 | Point1
2        | id2 | Point2
I want to have one partition for every Group_Id and apply a function that I define to every partition.
I have tried partitionBy('Group_Id').mapPartitions(), but I receive an error.
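For context, a PySpark DataFrame has no `partitionBy` method (it exists on `DataFrameWriter`, `Window`, and pair RDDs), which may explain the error. Below is a minimal, RDD-based sketch of one way the attempt might be made to work, assuming `repartition` on the key column followed by `rdd.mapPartitions`. `my_function` is a hypothetical stand-in for the user-defined function, and `process_partition` and `apply_per_group` are names invented for this example.

```python
from itertools import groupby
from operator import itemgetter


def my_function(group_id, rows):
    """Hypothetical stand-in for the user's function: counts rows per group."""
    return (group_id, len(rows))


def process_partition(rows):
    """Apply my_function once per Group_Id found in one partition's iterator."""
    # Hash partitioning can place several groups in the same partition,
    # so sort and group locally before applying the function.
    by_group = itemgetter("Group_Id")
    ordered = sorted(rows, key=by_group)
    for group_id, group_rows in groupby(ordered, key=by_group):
        yield my_function(group_id, list(group_rows))


def apply_per_group(df):
    """Co-locate each Group_Id in one partition, then process each partition."""
    # repartition("Group_Id") sends all rows sharing a key to the same partition.
    return df.repartition("Group_Id").rdd.mapPartitions(process_partition)
```

`apply_per_group(df).collect()` would then return one result per group. Note that this guarantees every group lives in a single partition, not that every partition holds a single group, which is why the local grouping step is there.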