Exception handling in Spark throws recursive value for DF needs type error


Exception handling in Spark throws recursive value for DF needs type error

Mich Talebzadeh

Hi,


Spark version 2.3.3 on Google Dataproc


I am trying to use the JDBC data source (https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html) to read from an on-premises Hive table using Spark in the cloud.


This works OK without a Try wrapper.


import spark.implicits._
import scala.util.{Try, Success, Failure}

val HiveDF = Try(spark.read.
     format("jdbc").
     option("url", jdbcUrl).
     option("dbtable", HiveSchema+"."+HiveTable).
     option("user", HybridServerUserName).
     option("password", HybridServerPassword).
     load()) match {
                   case Success(HiveDF) => HiveDF
                   case Failure(e) => throw new Exception("Error Encountered reading Hive table")
     }


However, with the Try wrapper I am getting the following error:


<console>:66: error: recursive value HiveDF needs type
                          case Success(HiveDF) => HiveDF


Wondering what is causing this. I have used this pattern before (say, reading from an XML file) and it worked then.

Thanks





Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

 


Re: Exception handling in Spark throws recursive value for DF needs type error

srowen
You are reusing HiveDF for two vars and it ends up ambiguous. Just rename one. 
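
A minimal sketch of what is going on (illustrative only, not code from the thread): in Scala, an identifier starting with an uppercase letter inside a pattern is read as a reference to an existing stable value rather than as a fresh binding, so case Success(HiveDF) points back at the very HiveDF being defined, hence the "recursive value" error. A lowercase name creates a new binding:

import scala.util.{Try, Success, Failure}

// Fails to compile with "recursive value HiveDF needs type":
// the capitalized pattern identifier refers to the val being defined.
// val HiveDF = Try(1) match {
//   case Success(HiveDF) => HiveDF
//   case Failure(e) => throw e
// }

// A lowercase pattern variable is a fresh binding, so this compiles:
val result = Try(1) match {
  case Success(v) => v
  case Failure(e) => throw e
}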



Re: Exception handling in Spark throws recursive value for DF needs type error

Mich Talebzadeh

Many thanks Sean.


Maybe I misunderstood your point?


var DF = Try(spark.read.
     format("jdbc").
     option("url", jdbcUrl).
     option("dbtable", HiveSchema+"."+HiveTable).
     option("user", HybridServerUserName).
     option("password", HybridServerPassword).
     load()) match {
                   case Success(DF) => HiveDF
                   case Failure(e) => throw new Exception("Error Encountered reading Hive table")
     }

Still getting the error


<console>:74: error: recursive method DF needs type
                          case Success(DF) => HiveDF


Do I need to declare DF as a DataFrame beforehand, since at that point the compiler does not know what type DF is?

Thanks again




Re: Exception handling in Spark throws recursive value for DF needs type error

Russell Spitzer
You can't use the same name for the result of the match and for the pattern variable in the Success case. You also want the Success case to return the variable it actually binds; your snippet binds DF but returns HiveDF.

So 

val df = Try(spark.read.
     format("jdbc").
     option("url", jdbcUrl).
     option("dbtable", HiveSchema+"."+HiveTable).
     option("user", HybridServerUserName).
     option("password", HybridServerPassword).
     load()) match {
                   case Success(validDf) => validDf
                   case Failure(e) => throw new Exception("Error Encountered reading Hive table")
     }
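
For reference, an equivalent and slightly more compact formulation (a sketch only, assuming the same session variables such as jdbcUrl are in scope) lets Try supply the fallback directly:

// Sketch: getOrElse evaluates its by-name argument only on Failure,
// so the throw fires only when the read fails.
val df = Try(spark.read.
     format("jdbc").
     option("url", jdbcUrl).
     option("dbtable", HiveSchema+"."+HiveTable).
     option("user", HybridServerUserName).
     option("password", HybridServerPassword).
     load()).getOrElse(throw new Exception("Error encountered reading Hive table"))

Like the match version, this still discards the underlying exception.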




Re: Exception handling in Spark throws recursive value for DF needs type error

Mich Talebzadeh
Many thanks Russell. That worked:

val HiveDF = Try(spark.read.
     format("jdbc").
     option("url", jdbcUrl).
     option("dbtable", HiveSchema+"."+HiveTable).
     option("user", HybridServerUserName).
     option("password", HybridServerPassword).
     load()) match {
                   case Success(df) => df
                   case Failure(e) => throw new Exception("Error Encountered reading Hive table")
     }

HiveDF: org.apache.spark.sql.DataFrame = [id: int, clustered: int ... 5 more fields]

Appreciated your help, Sean and Russell.


Mich




Re: Exception handling in Spark throws recursive value for DF needs type error

Mich Talebzadeh
As a side question, consider the following JDBC read:


val lowerBound = 1L
val upperBound = 1000000L
val numPartitions = 10
val partitionColumn = "id"

val HiveDF = Try(spark.read.
    format("jdbc").
    option("url", jdbcUrl).
    option("driver", HybridServerDriverName).
    option("dbtable", HiveSchema+"."+HiveTable).
    option("user", HybridServerUserName).
    option("password", HybridServerPassword).
    option("partitionColumn", partitionColumn).
    option("lowerBound", lowerBound).
    option("upperBound", upperBound).
    option("numPartitions", numPartitions).
    load()) match {
                   case Success(df) => df
                   case Failure(e) => throw new Exception("Error Encountered reading Hive table")
     }
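
(Side note on these four options, for anyone reading along: Spark uses them to issue numPartitions parallel queries, splitting the partitionColumn range between lowerBound and upperBound into equal strides. Per the JDBC source docs, the bounds decide the stride only, not a filter, so rows outside the range still come back via the first and last partitions.)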


Are there any performance implications of wrapping the DataFrame creation in a Try/Success/Failure block?

Thanks

 Mich







Re: Exception handling in Spark throws recursive value for DF needs type error

srowen
It would be quite trivial. None of that affects any of the Spark execution.
It doesn't seem like it helps though - you are just swallowing the cause. Just let it fly?
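
To illustrate Sean's suggestion (a sketch, not code from the thread): either drop the wrapper and let load() raise its own, already descriptive exception, or, if a wrapper is wanted, chain the caught exception as the cause so the original stack trace survives:

// Sketch only; jdbcUrl and the other connection settings are assumed
// to be defined as earlier in the thread.
val HiveDF = Try(spark.read.
     format("jdbc").
     option("url", jdbcUrl).
     option("dbtable", HiveSchema+"."+HiveTable).
     load()) match {
                   case Success(df) => df
                   // Pass e along as the cause instead of discarding it.
                   case Failure(e) => throw new Exception("Error encountered reading Hive table", e)
     }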


Re: Exception handling in Spark throws recursive value for DF needs type error

Mich Talebzadeh
Thanks Sean. I guess I was being pedantic. In any case, if the source table does not exist, the spark.read call is going to fall over one way or another!



