RE: Difference between Typed and untyped transformation in dataset API
From what I understand, if the transformation is untyped it will return a DataFrame; otherwise it will return a Dataset. In the source code you will see that the return type of an untyped transformation is DataFrame instead of Dataset, and these methods should also be annotated with @group untypedrel. Thus, you can check the signature of the method to determine whether it is untyped or not.
In general, anything that changes the type of a column or adds a new column in a Dataset will be untyped. The idea of a Dataset is that its schema stays constant. The moment you try to modify the schema, we need to fall back to a DataFrame.
For example, withColumn is untyped because it transforms the typed Dataset into an untyped structure (a DataFrame).
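
A minimal sketch of how this shows up in the method signatures, assuming a local SparkSession and two hypothetical case classes (Person and PersonWithFlag) that exist only for this illustration. The explicit type annotations are checked by the compiler, so they act as a quick way to see which side of the line a transformation falls on:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical case classes, used only for this illustration.
case class Person(name: String, age: Int)
case class PersonWithFlag(name: String, age: Int, isAdult: Boolean)

object TypedVsUntyped {
  // Typed: filter keeps the element type, so the result is still Dataset[Person].
  def adults(people: Dataset[Person]): Dataset[Person] =
    people.filter((p: Person) => p.age >= 18)

  // Untyped: withColumn changes the schema, so the result is a DataFrame (Dataset[Row]).
  def withAdultFlag(people: Dataset[Person]): DataFrame =
    people.withColumn("isAdult", col("age") >= 18)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("typed-vs-untyped")
      .getOrCreate()
    import spark.implicits._

    val people: Dataset[Person] = Seq(Person("Ann", 34), Person("Bob", 12)).toDS()

    adults(people).show()

    // as[T] maps the rows onto a case class that matches the new schema,
    // which gives back a typed Dataset after the untyped step.
    val typedAgain: Dataset[PersonWithFlag] = withAdultFlag(people).as[PersonWithFlag]
    typedAgain.show()

    spark.stop()
  }
}
```

Declaring the result of withAdultFlag as Dataset[Person] would not compile, which is the practical consequence of the transformation being untyped.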
From: Akhilanand <[hidden email]>
Sent: Thursday, February 21, 2019 7:35 PM
To: user <[hidden email]>
Subject: Difference between Typed and untyped transformation in dataset API
What is the key difference between Typed and untyped transformation in dataset API?
How do I determine if it's typed or untyped?
Are there any gotchas about when to use which, apart from whether it does the job for me?