A code example of Catalyst optimization

Hi there,

I am looking for an example of a Catalyst optimization that can be demonstrated in code. Typically, you load some data into a DataFrame, apply an operation, apply the opposite operation, and when you collect, it’s super fast because nothing really happened to the data. Hopefully my request is clear enough. I’d like to use it in teaching when explaining Spark’s laziness.
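
To make the request concrete, here is a toy sketch in plain Python of the behaviour I have in mind. It is not Spark, and it is not how Catalyst works internally: the `LazyFrame` class, its `add`/`subtract` methods, and the cancellation rule are all made up for illustration. What I am after is the real Spark equivalent.

```python
class LazyFrame:
    """Toy stand-in for a lazy DataFrame (hypothetical, not a Spark API)."""

    def __init__(self, data, plan=()):
        self.data = data          # source rows, never modified
        self.plan = tuple(plan)   # recorded steps: ("add", n) or ("sub", n)

    def add(self, n):
        # Transformation: only records the step, no row is touched.
        return LazyFrame(self.data, self.plan + (("add", n),))

    def subtract(self, n):
        # Transformation: only records the step, no row is touched.
        return LazyFrame(self.data, self.plan + (("sub", n),))

    def optimized_plan(self):
        # Made-up optimizer rule: an "add n" next to a "sub n" cancels out.
        steps = []
        for op, n in self.plan:
            if steps and steps[-1][1] == n and steps[-1][0] != op:
                steps.pop()            # opposite operation: drop both steps
            else:
                steps.append((op, n))
        return steps

    def collect(self):
        # Action: only now is the (optimized) plan applied to the rows.
        rows = list(self.data)
        for op, n in self.optimized_plan():
            rows = [r + n if op == "add" else r - n for r in rows]
        return rows


df = LazyFrame(range(5)).add(1).subtract(1)
print(df.plan)              # (('add', 1), ('sub', 1))
print(df.optimized_plan())  # []
print(df.collect())         # [0, 1, 2, 3, 4]
```

The point for students would be the middle line: the plan that actually runs is empty, so the collect has nothing to compute.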

Does anyone have something like that in their labs?



