Type change support in spark parquet read-write


Swapnil Chougule
Hi Folks,

I ran into a problem while reading Parquet data through Spark.
A Parquet file was written with field 'a' of type 'Integer'. Reading that file afterwards with a schema that declares 'a' as 'Long' throws an exception.
I thought this kind of compatible (widening) type change was supported, but it does not work.
Here is a code snippet that reproduces it:

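import org.apache.spark.sql.types._

// Write: field 'a' is stored as an Integer.
val oldSchema =
  StructType(
    StructField("a", IntegerType, true) :: Nil)

val df1 = spark.read.schema(oldSchema).json("/path/to/json/data")
df1.write.parquet("/path/to/parquet/data")

// Read: the same field is now declared as a Long.
val newSchema =
  StructType(
    StructField("a", LongType, true) :: Nil)

// Read the Parquet output back (as Parquet, not JSON); this is the call that fails.
spark.read.schema(newSchema).parquet("/path/to/parquet/data").show()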
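
A possible workaround, sketched here only as an illustration: read the Parquet data with the schema stored in the file (so 'a' comes back as Integer) and cast the column to Long afterwards. The session setup and the app name are assumptions for a standalone run; in spark-shell the existing `spark` session would be reused by `getOrCreate()`. This sidesteps the problem and does not settle whether passing the wider schema directly to the reader is meant to work.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.LongType

// Assumption: local session for illustration; getOrCreate() reuses an existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("widen-int-to-long") // hypothetical name
  .getOrCreate()

// Let Spark take the schema from the Parquet footer: 'a' is read as IntegerType.
val stored = spark.read.parquet("/path/to/parquet/data")

// Widen the column explicitly after the read instead of in the reader schema.
val widened = stored.withColumn("a", col("a").cast(LongType))

widened.printSchema()
widened.show()
```

The cast happens after the scan, so the reader never has to reconcile the requested type with the physical type in the file.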

Any help with this is much appreciated.

Thanks,
Swapnil