[Spark SQL] Catalyst ScalaReflection/ExpressionEncoder fail with relocated (shaded) classes



I'm trying to compile Google's timestamp.proto protobuf to a Scala case
class and use it as a field in another proto-derived case class, as part of a
larger dataset schema.
(The SQL date type might be preferable in a schema, but I ran into
this problem when I tried to use Timestamp for compatibility with some
existing code.)

To avoid the usual problem of Spark/Hadoop providing protobuf packages that
conflict with user code, I relocated the com.google.protobuf.timestamp
package in my uberjar with the gradle-shadow plugin.

Unfortunately, this leads to a somewhat cryptic error message at runtime
that references the original package name:

( https://gist.github.com/johkelly/0c99c7bf717adc610fc906296be02850 )

Relocating the Timestamp class' package appears to have broken the encoder
generation code. I don't understand the libraries involved well enough to
know where to file a bug report, however.
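
One way to narrow this down might be to compare the class name the JVM
reports with the name Scala runtime reflection resolves, since the encoder
generation relies on Scala reflection. The sketch below is only a diagnostic
idea, not a confirmed explanation: Timestamp and Event are hypothetical
stand-ins for the proto-derived case classes, and NameCheck is a made-up
helper. It assumes scala-reflect is on the classpath. In an unshaded build
the two names should agree; if they disagree inside the shaded uberjar, that
would suggest the relocation did not reach the metadata Scala reflection
reads.

```scala
import scala.reflect.ClassTag
import scala.reflect.runtime.universe._

// Hypothetical stand-ins for the proto-derived case classes in the post.
final case class Timestamp(seconds: Long, nanos: Int)
final case class Event(id: String, createdAt: Timestamp)

object NameCheck {
  // Name of the class as the JVM loaded it.
  def jvmName[T](implicit ct: ClassTag[T]): String =
    ct.runtimeClass.getName

  // Name of the class as Scala runtime reflection resolves it.
  def scalaName[T: TypeTag]: String =
    typeOf[T].typeSymbol.fullName

  // Types of the case class fields as Scala reflection sees them,
  // in declaration order.
  def fieldTypeNames[T: TypeTag]: Seq[String] =
    typeOf[T].members.sorted.collect {
      case m: MethodSymbol if m.isCaseAccessor =>
        m.returnType.typeSymbol.fullName
    }

  def main(args: Array[String]): Unit = {
    println(s"JVM name:     ${jvmName[Timestamp]}")
    println(s"Scala name:   ${scalaName[Timestamp]}")
    println(s"Event fields: ${fieldTypeNames[Event].mkString(", ")}")
  }
}
```

Running this once from the plain build and once from the shaded jar, with the
real classes substituted, would show whether the package name seen by
reflection still matches the relocated one.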

I put together a small gradle project that seems to demonstrate the problem
locally (albeit with a different error message):


I would appreciate an explanation of which component (Scala, Spark, Shadow,
or something else) is at fault here, so that I know where to direct a bug
report (and can possibly create a workaround).

Jack Kelly
