Typed dataset from Avro generated classes?

Joaquin Tarraga
Hi all,
I have an Avro generated class (e.g., AvroGeneratedClass) and I am using Encoders.bean to get a typed dataset (e.g., Dataset<AvroGeneratedClass>):
Encoder<AvroGeneratedClass> encoder = Encoders.bean(AvroGeneratedClass.class);
Dataset<AvroGeneratedClass> ds = sparkSession.read().parquet(filename).as(encoder);
I am getting an exception from the Encoders.bean call:
"java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class org.apache.avro.Schema"

How can I get a typed dataset from Avro generated classes?

Thanks.
--
Joaquín

Re: Typed dataset from Avro generated classes?

Elkhan Dadashov
Hi Spark users,

Did anyone resolve this issue?

Encoder<AvroGeneratedClass> encoder = Encoders.bean(AvroGeneratedClass.class);
Dataset<AvroGeneratedClass> ds = sparkSession.read().parquet(filename).as(encoder);

I'm also facing the same problem: "Cannot have circular references in bean class, but got the circular reference of class class org.apache.avro.Schema" 

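This happens because of the getSchema() method in the generated Avro Java class: it returns an org.apache.avro.Schema, and Encoders.bean treats every getter as a bean property.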
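
To make the cause concrete, here is a minimal sketch of the kind of recursive getter walk that runs into getSchema(). It uses only java.beans, not Spark or Avro, so it only approximates what JavaTypeInference does. FakeSchema, FakeAvroRecord, PlainRecord and findCycle are hypothetical names made up for this illustration.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.HashSet;
import java.util.Set;

public class BeanCycleSketch {

    // Hypothetical stand-in for org.apache.avro.Schema. The real class has
    // getters such as getElementType() that return Schema again.
    public static class FakeSchema {
        public String getName() { return "record"; }
        public FakeSchema getElementType() { return this; }
    }

    // Hypothetical stand-in for an Avro generated class: data getters plus getSchema().
    public static class FakeAvroRecord {
        public String getId() { return "id"; }
        public int getCount() { return 0; }
        public FakeSchema getSchema() { return new FakeSchema(); }
    }

    // Hypothetical plain bean with the same data fields but no getSchema().
    public static class PlainRecord {
        public String getId() { return "id"; }
        public int getCount() { return 0; }
    }

    // Returns the name of the first class that shows up twice on one getter
    // path, or null if the walk finishes without finding a cycle.
    public static String findCycle(Class<?> beanClass) throws IntrospectionException {
        return findCycle(beanClass, new HashSet<>());
    }

    private static String findCycle(Class<?> cls, Set<Class<?>> onPath)
            throws IntrospectionException {
        // Treat primitives and String as leaves (an assumption that keeps the sketch short).
        if (cls.isPrimitive() || cls == String.class) {
            return null;
        }
        // Seeing the same class again on the current path means a circular reference.
        if (!onPath.add(cls)) {
            return cls.getName();
        }
        // Stop at Object so the built-in "class" property is ignored.
        BeanInfo info = Introspector.getBeanInfo(cls, Object.class);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            Class<?> type = pd.getPropertyType();
            if (pd.getReadMethod() == null || type == null) {
                continue;
            }
            String hit = findCycle(type, onPath);
            if (hit != null) {
                return hit;
            }
        }
        onPath.remove(cls);
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(findCycle(FakeAvroRecord.class));
        System.out.println(findCycle(PlainRecord.class));
    }
}
```

In this sketch the walk over FakeAvroRecord reports FakeSchema as the repeated class, while PlainRecord passes. That suggests one possible direction: a plain bean that mirrors the data fields without getSchema(). I have not verified that approach against Spark itself.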

How can I get a typed dataset from Avro generated classes? 

Thanks.

On Wed, Sep 27, 2017 at 3:23 AM Joaquin Tarraga <[hidden email]> wrote:
Hi all,
I have an Avro generated class (e.g., AvroGeneratedClass) and I am using Encoders.bean to get a typed dataset (e.g., Dataset<AvroGeneratedClass>):
Encoder<AvroGeneratedClass> encoder = Encoders.bean(AvroGeneratedClass.class);
Dataset<AvroGeneratedClass> ds = sparkSession.read().parquet(filename).as(encoder);
I am getting an exception from the Encoders.bean call:
"java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class org.apache.avro.Schema"

How can I get a typed dataset from Avro generated classes?

Thanks.
--
Joaquín



--

Best regards,
Elkhan Dadashov
Re: Typed dataset from Avro generated classes?

Nads
Same problem here. A Google search shows a few related JIRA tickets in the
"Resolved" state, but I am still getting the same error in Spark 3.0.1. I'm
pasting my `spark-shell` output below:

scala> import org.apache.spark.sql.Encoders
import org.apache.spark.sql.Encoders

scala> val linkageBean = Encoders.bean(classOf[MyAvroGeneratedClass])
java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class org.apache.avro.Schema
  at org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(JavaTypeInference.scala:142)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$inferDataType$1(JavaTypeInference.scala:150)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
  at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
  at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
  at scala.collection.TraversableLike.map(TraversableLike.scala:238)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(JavaTypeInference.scala:148)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$inferDataType$1(JavaTypeInference.scala:150)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
  at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
  at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
  at scala.collection.TraversableLike.map(TraversableLike.scala:238)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(JavaTypeInference.scala:148)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(JavaTypeInference.scala:126)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$inferDataType$1(JavaTypeInference.scala:150)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
  at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
  at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
  at scala.collection.TraversableLike.map(TraversableLike.scala:238)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(JavaTypeInference.scala:148)
  at org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(JavaTypeInference.scala:67)
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:68)
  at org.apache.spark.sql.Encoders$.bean(Encoders.scala:154)
  ... 49 elided