Insert into dynamic partitioned hive/parquet table throws error - Partition spec contains non-partition columns


Insert into dynamic partitioned hive/parquet table throws error - Partition spec contains non-partition columns

Nirav Patel
I am trying to insert overwrite multiple partitions into existing partitioned hive/parquet table. Table was created using sparkSession.

I have a table 'mytable' with partition columns P1 and P2.

I have the following set on the SparkSession object:

    .config("hive.exec.dynamic.partition", true)

    .config("hive.exec.dynamic.partition.mode", "nonstrict")


val df = spark.read.csv(pathToNewData)

df.createOrReplaceTempView("updateTable")
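
For the projection below to resolve columns by name, the CSV needs named columns; a sketch assuming the file carries a header row (without one, Spark names the columns _c0, _c1, ...):

    // Assumption: the CSV has a header row; otherwise the columns come
    // through as _c0, _c1, ... and a named projection would not resolve.
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(pathToNewData)

    df.printSchema()   // sanity-check that P1 and P2 survived the read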


Here 'df' may contain data from multiple partitions, i.e. multiple values for P1 and P2 in the data.


spark.sql("insert overwrite table mytable PARTITION(P1, P2) select c1, c2,..cn, P1, P2 from updateTable") // I made sure that the partition columns P1 and P2 are at the end of the projection list.
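
For reference, the same write expressed through the DataFrame API, as an equivalent sketch; insertInto matches columns to the table by position, so here too the partition columns must come last:

    // Equivalent write via the DataFrame API. insertInto resolves columns by
    // position against mytable's schema, so the view's column order must match
    // the table, with P1 and P2 as the last two columns.
    spark.table("updateTable")
      .write
      .mode("overwrite")
      .insertInto("mytable")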

I am getting the following error:

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.Table.ValidationFailureSemanticException: Partition spec {p1=, p2=, P1=1085, P2=164590861} contains non-partition columns;

The dataframe 'df' does have records for P1=1085, P2=164590861.
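
One detail stands out in that message: the rejected spec contains both lowercase p1=, p2= (empty) and uppercase P1=1085, P2=164590861 (populated), which looks like a case mismatch between the partition column names the metastore holds and the ones in the query. Purely as a guess based on that observation, a lowercase variant:

    // Guess based on the error text: the metastore may hold the partition
    // columns as lowercase p1/p2, so spell them that way in the statement.
    spark.sql("insert overwrite table mytable PARTITION(p1, p2) select c1, c2,..cn, p1, p2 from updateTable")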





Re: Insert into dynamic partitioned hive/parquet table throws error - Partition spec contains non-partition columns

Nirav Patel
FYI, it works with static partitioning:
spark.sql("insert overwrite table mytable PARTITION(P1=1085, P2=164590861) select c1, c2,..cn, P1, P2 from updateTable")
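
A mixed spec (static keys first, then dynamic) is also legal Hive syntax and sits between the failing and working cases above; untested here, and note that the statically fixed column drops out of the projection:

    // Mixed static/dynamic spec: P1 fixed, P2 taken from the last select column.
    // Untested against this table; shown only to illustrate the middle ground.
    spark.sql("insert overwrite table mytable PARTITION(P1=1085, P2) select c1, c2,..cn, P2 from updateTable")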
