SparkSQL does not support CharType

163
Hi,
     when I use a DataFrame with an explicit table schema, it goes wrong:

import org.apache.spark.sql.types._

val test_schema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", CharType(1), false),
  StructField("time", DateType, false)))

val df = spark.read.format("com.databricks.spark.csv")
  .schema(test_schema)
  .option("header", "false")
  .option("inferSchema", "false")
  .option("delimiter", ",")
  .load("file:///Users/name/b")

The resulting stack trace is:
Exception in thread "main" scala.MatchError: CharType(1) (of class org.apache.spark.sql.types.CharType)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.org$apache$spark$sql$catalyst$encoders$RowEncoder$$serializerFor(RowEncoder.scala:73)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:158)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:157)

Why? Is this a bug?

But I found that Spark translates the char type to string when using the CREATE TABLE command:

      create table test(flag char(1));
      desc test;    -- flag: string

    


Regards
Wendy He
Re: SparkSQL does not support CharType

Jörn Franke
Or ByteType, depending on the use case.

On 23. Nov 2017, at 10:18, Herman van Hövell tot Westerflier <[hidden email]> wrote:

You need to use a StringType. The CharType and VarCharType are there to ensure compatibility with Hive and ORC; they should not be used anywhere else.
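A minimal sketch of the corrected schema, with StringType substituted for CharType (assumes an existing SparkSession named `spark`; the CSV options and file path are taken from the original post, and the comments reflect Herman's explanation):

```scala
import org.apache.spark.sql.types._

// Same schema as in the original post, but with StringType:
// CharType(1) belongs to the Hive/ORC compatibility layer and is
// not handled by RowEncoder, which is why the MatchError is thrown.
val test_schema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", StringType, false),
  StructField("time", DateType, false)))

val df = spark.read.format("csv")  // built-in CSV source since Spark 2.0
  .schema(test_schema)
  .option("header", "false")
  .option("delimiter", ",")
  .load("file:///Users/name/b")
```

With a StringType column, any fixed-length constraint (e.g. exactly one character) would need to be enforced in application code or via a check at load time, since StringType itself carries no length restriction.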

On Thu, Nov 23, 2017 at 4:09 AM, 163 <[hidden email]> wrote:



--

Herman van Hövell

Software Engineer

Databricks Inc.

[hidden email]

+31 6 420 590 27

databricks.com