Odd NoClassDefFoundError exception

Lavelle, Shawn

Hello Spark Community,

   I have a Spark SQL problem where I am receiving a NoClassDefFoundError for org.apache.spark.sql.catalyst.util.RebaseDateTime$.  This happens for any query with a filter on a Timestamp column when the query is first run programmatically, but not when the query is first run via Beeline/HiveThriftServerCLI. That is, if I submit the query via Beeline and HiveThriftServer first, the query succeeds and I can then also successfully call SparkSession.sqlContext().sql().  If I run it from the program first, however, it throws the aforementioned class loader error, and Beeline then fails with the same error.


   I think there’s something the HiveThriftServer does to initialize the job / session, but I can’t sort out what it is.  I have tried both the SparkSession (and the sqlContext) that is passed in to create the HiveThriftServer and a new session created with the SparkSession builder pattern.

     Can you help?  Thanks in advance and let me know if there’s more information I can provide.


~ Shawn
PS Spark 3.0.0


The Exception
Exception occurred in target VM: Could not initialize class org.apache.spark.sql.catalyst.util.RebaseDateTime$
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.util.RebaseDateTime$
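For context, "Could not initialize class" is the JVM's message when a class's static initializer already failed on an earlier access. The first access normally throws an ExceptionInInitializerError carrying the real cause, and only later accesses report NoClassDefFoundError. The following is a minimal sketch of that behaviour in plain Java, not Spark code: FailingInit and InitFailureDemo are hypothetical names, and the IllegalStateException merely stands in for whatever actually goes wrong while RebaseDateTime$ is initialized.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a class whose static initializer fails.
final class FailingInit {
    // Not a compile-time constant, so reading it triggers class initialization.
    static final int VALUE = compute();

    private static int compute() {
        throw new IllegalStateException("simulated root cause");
    }
}

final class InitFailureDemo {
    // Computed once, so the first and second access are recorded in order.
    static final List<String> OBSERVED = observe();

    private static List<String> observe() {
        List<String> seen = new ArrayList<>();
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                seen.add("value=" + FailingInit.VALUE);
            } catch (Throwable t) {
                // Record only the type of the failure for each attempt.
                seen.add(t.getClass().getName());
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Prints java.lang.ExceptionInInitializerError, then java.lang.NoClassDefFoundError
        OBSERVED.forEach(System.out::println);
    }
}
```

If the same pattern applies here, the earliest failure in the driver log (rather than the NoClassDefFoundError shown above) would be the one that names the underlying cause.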


Code Snippet:
String sql = < passed in >
SparkSession ss = < passed in instance >
Dataset<Row> ds;

try {
    // No exception is thrown here (see the note below)
    ds = ss.sql(sql);
} catch (Exception ex) {
    return -1d;
}

// The exception shows up at this call, not at ss.sql()
MyRDD<Row> myRdd = (MyRDD<Row>) ds.rdd();

for (Partition p : myRdd.getPartitions()) {
    // … Do the things
}

Note, ss.sql() doesn’t throw an exception, but the ds.rdd() contains one.  When the provided SQL is run via beeline the code above works as expected.
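One detail that may matter when trying to capture the failure: NoClassDefFoundError is a java.lang.Error, not an Exception, so a catch (Exception ex) block like the one in the snippet would not intercept it even if the ds.rdd() call were moved inside the try. The sketch below illustrates this in plain Java under that assumption. CatchScopeDemo and failLikeRdd are hypothetical names, and failLikeRdd only simulates the reported error; it does not call Spark.

```java
final class CatchScopeDemo {
    // Hypothetical stand-in for the failing call; it only simulates the error type.
    static void failLikeRdd() {
        throw new NoClassDefFoundError("Could not initialize class (simulated)");
    }

    static String run() {
        try {
            try {
                failLikeRdd();
                return "no failure";
            } catch (Exception ex) {
                // Not reached: an Error is not a subclass of Exception.
                return "caught as Exception";
            }
        } catch (Throwable t) {
            // Catching Throwable (or LinkageError) is what sees the failure.
            return "escaped catch (Exception): " + t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Catching Throwable around the first programmatic run and logging the full cause chain with getCause() is one way to look for the original initialization failure.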


