[Spark SQL]: Java Spark Classes With Attributes of Type Set In Datasets


[Spark SQL]: Java Spark Classes With Attributes of Type Set In Datasets

ddukek
I'm trying to use a data model that has an instance variable that is a Set. If
I leave the type as the Set interface, an error is thrown because Set is an
interface and cannot be instantiated. If I instead make the variable a
concrete implementation of Set, I get an analysis exception:

"org.apache.spark.sql.AnalysisException: cannot resolve 'named_struct()' due
to data type mismatch: input to function named_struct requires at least one
argument".

If I then change the type to a List, the program works just fine. I'm using
Dataset operations, with the Encoders.bean method to cast the rows to the
proper type.

Is there a way to get around this without being forced to use a List in my
model?
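One possible workaround, sketched below under the constraints described above, is to let the bean that Spark sees store the field as a java.util.List (which Encoders.bean handles), and convert to and from a Set only at the edges of your own code. The class and member names here are hypothetical, not from the original model:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical bean: Encoders.bean reflects on getTags/setTags, which use a
// List (a supported collection type), while callers can still work with Sets
// through the two convenience methods at the bottom.
public class TaggedRecord {
    private String id;
    private List<String> tags = new ArrayList<>();

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    // Bean property seen by the encoder: a concrete, instantiable List.
    public List<String> getTags() { return tags; }
    public void setTags(List<String> tags) { this.tags = tags; }

    // Convenience conversions for code that thinks in Sets; these are not
    // getter/setter pairs, so bean introspection ignores them.
    public Set<String> tagSet() { return new HashSet<>(tags); }
    public void tagsFromSet(Set<String> set) { this.tags = new ArrayList<>(set); }
}
```

The Dataset would then be created with Encoders.bean(TaggedRecord.class) exactly as before; the trade-off is that the Set's uniqueness guarantee has to be maintained manually at the conversion points, since Spark only ever sees the List.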



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: [Spark SQL]: Java Spark Classes With Attributes of Type Set In Datasets

Dillon Dukek
Actually, walking through it in a debug terminal, it appears that the deserializer can properly transform the data on read to an ArrayType, but the serializer doesn't know what to do when trying to go back out from Spark's internal representation:

tags, if (isnull(lambdavariable(MapObjects_loopValue0, MapObjects_loopIsNull0, ObjectType(class <class>), true).getTags)) null else named_struct()
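That would put the gap on the type-mapping side rather than in bean discovery: standard JavaBeans introspection does report a Set-typed property, so the field itself is visible; per the debug output above, it is the serializer that falls back to an empty named_struct() for a type it can't map. A small stdlib check of the discovery half (bean name hypothetical, no Spark dependency):

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;

// Hypothetical bean mirroring the failing model: a single Set-typed property.
public class SetBean {
    private Set<String> tags;

    public Set<String> getTags() { return tags; }
    public void setTags(Set<String> tags) { this.tags = tags; }

    // Looks up the declared type of the "tags" bean property via standard
    // JavaBeans introspection -- the same style of property discovery a bean
    // encoder relies on. Returns null if the property is not found.
    public static Class<?> tagsPropertyType() {
        try {
            for (PropertyDescriptor pd : Introspector
                    .getBeanInfo(SetBean.class, Object.class)
                    .getPropertyDescriptors()) {
                if (pd.getName().equals("tags")) {
                    return pd.getPropertyType();
                }
            }
            return null;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The property comes back typed as java.util.Set, which is consistent with the field being discovered but the encoder having nothing to serialize it to.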


On Tue, Sep 25, 2018 at 2:27 PM ddukek <[hidden email]> wrote: