from_json function

dbolshak
Hello community,

I cannot get the from_json function to work with the "columnNameOfCorruptRecord"
option.
```
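    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.types.StructType
    // Assumes a SparkSession named `spark` is in scope (as in spark-shell).
    import spark.implicits._

    // One well-formed record and one malformed record.
    val data = Seq(
      "{'number': 1}",
      "{'number': }"
    )

    // Expected field plus a string column meant to hold the raw corrupt record.
    val schema = new StructType()
      .add($"number".int)
      .add($"_corrupt_record".string)

    val sourceDf = data.toDF("column")

    val jsonedDf = sourceDf
      .select(from_json(
        $"column",
        schema,
        Map("mode" -> "PERMISSIVE", "columnNameOfCorruptRecord" -> "_corrupt_record")
      ) as "data")
      .selectExpr("data.number", "data._corrupt_record")

    jsonedDf.show()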
```
Can anybody help me get a non-empty `_corrupt_record`?

Thanks in advance.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: from_json function

Maxim Gekk
Hello Denis,

The from_json function supports only the fail-fast mode, see:

Your "mode" -> "PERMISSIVE" setting will be overwritten.
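One possible workaround, which is not proposed in this thread, is to parse the strings with the JSON data source instead of from_json, since the reader accepts the "mode" and "columnNameOfCorruptRecord" options. The sketch below assumes Spark 2.2 or later, where `DataFrameReader.json` accepts a `Dataset[String]`; the object name `CorruptRecordSketch` and the helper `parseWithCorruptColumn` are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

object CorruptRecordSketch {

  // Expected field plus a string column that receives the raw malformed text.
  val schema: StructType = new StructType()
    .add("number", IntegerType)
    .add("_corrupt_record", StringType)

  // Hypothetical helper: parses JSON strings with the JSON data source
  // in PERMISSIVE mode rather than with from_json.
  def parseWithCorruptColumn(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._
    spark.read
      .schema(schema)
      .option("mode", "PERMISSIVE")
      .option("columnNameOfCorruptRecord", "_corrupt_record")
      .json(lines.toDS())
  }

  def main(args: Array[String]): Unit = {
    // Local session only for the sake of a self-contained example.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("corrupt-record-sketch")
      .getOrCreate()

    // Same two sample records as in the question.
    val parsed = parseWithCorruptColumn(spark, Seq("{'number': 1}", "{'number': }"))
    parsed.show(false)

    spark.stop()
  }
}
```

With this approach the malformed input string should end up in `_corrupt_record` while `number` is null for that row. It only applies when the JSON strings can be handed to the reader as a whole dataset, not when from_json is needed on one column among many.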

On Wed, Aug 15, 2018 at 4:52 PM dbolshak <[hidden email]> wrote:
Hello community,

I cannot get the from_json function to work with the "columnNameOfCorruptRecord"
option.


--
Maxim Gekk
Technical Solutions Lead
Databricks Inc.
[hidden email]
databricks.com


Re: from_json function

dbolshak
Maxim, thanks for your reply.

I've left a comment on the following JIRA issue:
https://issues.apache.org/jira/browse/SPARK-23194?focusedCommentId=16582025&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16582025


