Type Casting Error in Spark Data Frame


Arnav kumar
Hello Experts,

I need your advice on resolving the issue below, which occurs when I try to retrieve data from a DataFrame.

Can you please let me know where I am going wrong?

Code:


// Create the DataFrame by parsing the JSON.
// MessageHelper describes the JSON struct.
// dataOut is the JSON string received from the streaming engine.

val dataDF = sparkSession.createDataFrame(dataOut, MessageHelper.sqlMapping)
dataDF.printSchema()
/* -- output of dataDF.printSchema

root
 |-- messageID: string (nullable = true)
 |-- messageType: string (nullable = true)
 |-- meta: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- messageParsedTimestamp: string (nullable = true)
 |    |    |-- ipaddress: string (nullable = true)
 |-- messageData: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- packetID: string (nullable = true)
 |    |    |-- messageID: string (nullable = true)
 |    |    |-- unixTime: string (nullable = true)
 


*/


dataDF.createOrReplaceTempView("message")
val routeEventDF = sparkSession.sql("select messageId, messageData.unixTime, messageData.packetID, messageData.messageID from message")
routeEventDF.show


Error on routeEventDF.show:
Caused by: java.lang.RuntimeException: org.apache.spark.sql.catalyst.expressions.GenericRow is not a valid external type for schema of array<struct<messageParsedTimestamp:string,ipaddress:string,port:string,message:string>>>>
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.evalIfFalseExpr14$(Unknown Source)


Appreciate your help

Best Regards
Arnav Kumar.


Re: Type Casting Error in Spark Data Frame

Patrick McCarthy
You can't select from an array like that. Try using 'lateral view explode' in the query for that element instead, or apply (py)spark.sql.functions.explode before the SQL stage.
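To make the 'lateral view explode' suggestion concrete, here is a minimal sketch in Scala. The case classes `MessageDatum` and `Message`, the object name, and the sample values are hypothetical stand-ins for the poster's data; only the column names come from the printed schema. It assumes a local Spark 2.x or later session.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical stand-ins for one messageData element and one message row.
case class MessageDatum(packetID: String, messageID: String, unixTime: String)
case class Message(messageID: String, messageData: Seq[MessageDatum])

object LateralViewExplodeSketch {
  // Registers the input as the "message" view and emits one output row
  // per element of the messageData array.
  def flatten(spark: SparkSession, messages: DataFrame): DataFrame = {
    messages.createOrReplaceTempView("message")
    spark.sql(
      """SELECT m.messageID, md.unixTime, md.packetID, md.messageID AS dataMessageID
        |FROM message m
        |LATERAL VIEW explode(m.messageData) exploded AS md""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lateral-view-explode-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: one message carrying two packets.
    val df = Seq(
      Message("m1", Seq(
        MessageDatum("p1", "m1", "1517243160"),
        MessageDatum("p2", "m1", "1517243161")))
    ).toDF()

    flatten(spark, df).show(false)
    spark.stop()
  }
}
```

The inner messageID is aliased to `dataMessageID` here only to avoid two output columns with the same name.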

On Mon, Jan 29, 2018 at 4:26 PM, Arnav kumar <[hidden email]> wrote:
> [...]




Re: Type Casting Error in Spark Data Frame

jgp
In reply to this post by Arnav kumar
You can try creating new columns from the nested values.
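One possible reading of this suggestion, sketched in Scala: explode the array into a temporary column, then copy the nested fields into top-level columns with `withColumn`. The case classes `PacketRecord` and `MessageRecord`, the temporary column name `packet`, and the sample values are assumptions for illustration and do not appear in the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical stand-ins for the poster's data, following the printed schema.
case class PacketRecord(packetID: String, messageID: String, unixTime: String)
case class MessageRecord(messageID: String, messageData: Seq[PacketRecord])

object NestedColumnsSketch {
  // Explodes messageData into one row per element, then promotes
  // two nested fields to top-level columns and drops the helpers.
  def withNestedColumns(messages: DataFrame): DataFrame =
    messages
      .withColumn("packet", explode(col("messageData")))
      .withColumn("unixTime", col("packet.unixTime"))
      .withColumn("packetID", col("packet.packetID"))
      .drop("packet", "messageData")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("nested-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: one message carrying two packets.
    val df = Seq(
      MessageRecord("m1", Seq(
        PacketRecord("p1", "m1", "1517243160"),
        PacketRecord("p2", "m1", "1517243161")))
    ).toDF()

    withNestedColumns(df).show(false)
    spark.stop()
  }
}
```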



---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: Type Casting Error in Spark Data Frame

vijay.bvp
Assuming the MessageHelper.sqlMapping schema is correctly mapped to the input JSON (it would help if the schema and a sample JSON were shared), here is the explode function with DataFrames. Similar functionality is available in SQL.

import sparkSession.implicits._
import org.apache.spark.sql.functions._

val routeEventDF = dataDF
  .select($"messageId", explode($"messageData").alias("MessageData"))
  .select($"messageId", $"MessageData.unixTime", $"MessageData.packetID", $"MessageData.messageID")
routeEventDF.show

thanks
Vijay
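Since neither MessageHelper.sqlMapping nor dataOut is shown in the thread, the following is only a sketch of what "correctly mapped" could look like. The schema below is a hypothetical, reduced stand-in for MessageHelper.sqlMapping, and the rows are made up. The point it illustrates is that a field declared as an array of structs expects a `Seq[Row]` as its value; supplying a bare `Row` there is one plausible (but unconfirmed) way to get an "is not a valid external type" error like the one quoted.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

object SchemaMappingSketch {
  // Hypothetical, reduced stand-in for MessageHelper.sqlMapping.
  val packetStruct: StructType = StructType(Seq(
    StructField("packetID", StringType),
    StructField("messageID", StringType),
    StructField("unixTime", StringType)))

  val schema: StructType = StructType(Seq(
    StructField("messageID", StringType),
    StructField("messageData", ArrayType(packetStruct))))

  // Builds a DataFrame whose rows match the schema: the array-of-struct
  // field is given as a Seq of Rows, not as a single Row.
  def build(spark: SparkSession): DataFrame = {
    val rows = Seq(
      Row("m1", Seq(
        Row("p1", "m1", "1517243160"),
        Row("p2", "m1", "1517243161"))))
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
  }

  // Same explode-then-select pattern as in the reply above.
  def flatten(df: DataFrame): DataFrame =
    df.select(col("messageID"), explode(col("messageData")).alias("MessageData"))
      .select(col("messageID"), col("MessageData.unixTime"), col("MessageData.packetID"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("schema-mapping-sketch")
      .getOrCreate()
    flatten(build(spark)).show(false)
    spark.stop()
  }
}
```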

Sent from the Apache Spark User List mailing list archive at Nabble.com.