[Spark-SQL] - Creating Hive Metastore Parquet table from Avro schema


pradeepbaji
Hello Everyone,

I have Parquet files stored on HDFS, and I am trying to create a table for
them in the Hive Metastore from Spark SQL. I also have the Avro schema file
from which I generated the Parquet files.

I am doing the following to create the table.

1) First, create a dummy Avro table from the schema file.

spark.sql("""
  CREATE TABLE
     db_test.avro_test
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED AS
     INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
     OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
  TBLPROPERTIES ('avro.schema.url'='/avro-schema/schema.avsc')""")

This step succeeds, and the table is created in the Hive Metastore.

2) Now create an external table with the same schema as the first one and
with its location pointing to the directory that holds the Parquet files.

spark.sql("""
   CREATE EXTERNAL TABLE db_test.parquet_test
   LIKE db_test.avro_test
   STORED AS PARQUET LOCATION '/parquet-data-dir'
""")

This step fails. It looks like Spark SQL does not accept the keyword "LIKE"
in this CREATE statement. The same statement works fine from the Hive shell.

*Can someone please help me create the Parquet table from the Avro
schema?*
Is it a bug in Spark SQL that it doesn't parse "LIKE" here?


Here is the error that Spark throws:
Exception in thread "main" org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'LIKE' expecting <EOF>(line 1, pos 136)

== SQL ==
CREATE EXTERNAL TABLE db_test.parquet_test LIKE db_test.avro_test STORED AS
PARQUET LOCATION ‘/parquet-data-dir’
---------------------------------------------------------------------^^^

        at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:239)
        at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:115)
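
One possible workaround, not confirmed for this setup, is to avoid LIKE and
spell out the columns explicitly, deriving them from the same .avsc file.
The sketch below is plain Python and only builds the DDL string. The helper
names hive_type and parquet_ddl are hypothetical, the type mapping is an
assumption based on the usual Avro-to-Hive conventions, and logical types,
named-type references, and unions other than ["null", T] are not handled.

```python
import json

# Avro primitive type -> Hive type (assumed mapping, adjust as needed).
PRIMITIVES = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "bytes": "BINARY",
}


def hive_type(avro_type):
    """Translate a parsed Avro type into a Hive type string."""
    if isinstance(avro_type, str):
        # Raises KeyError for references to named types, which this sketch skips.
        return PRIMITIVES[avro_type]
    if isinstance(avro_type, list):
        # Only nullable unions such as ["null", "string"] are supported.
        branches = [t for t in avro_type if t != "null"]
        if len(branches) != 1:
            raise ValueError("unsupported union: %r" % (avro_type,))
        return hive_type(branches[0])
    kind = avro_type["type"]
    if kind == "record":
        fields = ",".join(
            "%s:%s" % (f["name"], hive_type(f["type"])) for f in avro_type["fields"]
        )
        return "STRUCT<%s>" % fields
    if kind == "array":
        return "ARRAY<%s>" % hive_type(avro_type["items"])
    if kind == "map":
        return "MAP<STRING,%s>" % hive_type(avro_type["values"])
    if kind == "enum":
        return "STRING"
    if kind == "fixed":
        return "BINARY"
    # Wrapped primitive such as {"type": "long"}; any logicalType is ignored.
    return hive_type(kind)


def parquet_ddl(avsc_text, table, location):
    """Build a CREATE EXTERNAL TABLE statement from the text of an .avsc file."""
    schema = json.loads(avsc_text)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("top-level Avro schema must be a record")
    cols = ",\n  ".join(
        "`%s` %s" % (f["name"], hive_type(f["type"])) for f in schema["fields"]
    )
    return "CREATE EXTERNAL TABLE %s (\n  %s\n)\nSTORED AS PARQUET\nLOCATION '%s'" % (
        table,
        cols,
        location,
    )


# Intended use (needs a SparkSession with Hive support, so it is not run here):
# spark.sql(parquet_ddl(avsc_text, "db_test.parquet_test", "/parquet-data-dir"))
```

Here avsc_text would be the contents of /avro-schema/schema.avsc, read by
whatever means fits the cluster. The column names and types produced this way
should be compared against the actual Parquet file schema before relying on
the table.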



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
