JSON Parsing.

JSON Parsing.

satyajit vegesna
Does Spark support automatic detection of the schema from a JSON string in a DataFrame?

I am trying to parse a JSON string and do some transformations on it (I would like to append new columns to the DataFrame) from the data I stream from Kafka.

But I am not very sure how I can parse the JSON in Structured Streaming, and I would rather not define a schema up front, as the data from Kafka is going to carry objects with different schemas in the value column.

Any advice or help would be appreciated.

Regards,
Satyajit.
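
A minimal sketch of one common approach, in Scala. Nothing below is from the thread itself: the topic name, broker address and column names are assumptions. Structured Streaming does not infer a JSON schema at runtime, so a frequent workaround is to infer the schema once from a batch sample of the same topic and then apply it with from_json in the streaming query.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, from_json}

    val spark = SparkSession.builder.appName("json-parsing").getOrCreate()
    import spark.implicits._

    // Batch read of a sample of the topic (hypothetical broker/topic names),
    // used only to infer a schema.
    val sample = spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(value AS STRING) AS json")
      .as[String]

    val inferredSchema = spark.read.json(sample).schema

    // Streaming read that applies the inferred schema and flattens the
    // top-level keys into columns, which can then be extended with withColumn.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(value AS STRING) AS json")
      .select(from_json(col("json"), inferredSchema).as("data"))
      .select("data.*")

    stream.writeStream.format("console").start().awaitTermination()

This only works if the sampled batch is representative; fields missing from the inferred schema come back as null in the streaming output.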

Re: JSON Parsing.

ayan guha
You can use get_json_object.

--
Best Regards,
Ayan Guha

Re: JSON Parsing.

ayan guha

You can use the get_json_object function.
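
A minimal sketch of how that is typically used, assuming a DataFrame fieldsDataframe whose RecordString column holds the raw JSON text (those two names are taken from the follow-up message below; the rest is assumed):

    import org.apache.spark.sql.functions.get_json_object

    // Assumes a SparkSession `spark` is already in scope.
    import spark.implicits._

    // get_json_object extracts one value per call, addressed by a JSONPath
    // expression, and returns it as a string (null if the path is missing).
    val withId = fieldsDataframe.withColumn(
      "id",
      get_json_object($"RecordString", "$.id"))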

--
Best Regards,
Ayan Guha

Re: JSON Parsing.

satyajit vegesna
Thank you for the info. Is there a way to get all the keys of the JSON, so that I can create a DataFrame with a column per key? For example,

  fieldsDataframe.withColumn("data", functions.get_json_object($"RecordString", "$.id"))

appends a single column to the DataFrame from the id key. I would like to automate this for every key in the JSON, as the incoming JSON schema is generated dynamically (see the sketch below).
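
A minimal sketch of one way to do that without hand-writing the schema, assuming fieldsDataframe is a static (batch) DataFrame; for a streaming source the schema would have to be inferred from a batch sample first, as in the earlier sketch:

    import org.apache.spark.sql.functions.{col, from_json}

    // Assumes a SparkSession `spark` is already in scope.
    import spark.implicits._

    // Infer a schema from the JSON strings already present in the column.
    val jsonStrings = fieldsDataframe.select($"RecordString").as[String]
    val inferred    = spark.read.json(jsonStrings).schema

    // Parse once with the inferred schema, then promote every top-level
    // key to its own column.
    val expanded = fieldsDataframe
      .withColumn("data", from_json(col("RecordString"), inferred))
      .select($"RecordString", $"data.*")

Keys that only appear in some records show up as null for the others, since the inferred schema is the union of everything spark.read.json sees.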
