Should SHOW TABLES statement return a hive-compatible output?


I came across an issue [1] in PyHive that involves the SHOW TABLES output from the Spark Thrift Server.

When you run a SHOW TABLES statement in beeline, it will return a table with the following fields: (i) schema name, (ii) table name, (iii) temporary table flag.

This output is different from what Hive does, which returns a single column containing all table names.
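
To make the difference concrete, here is a minimal sketch of how a client could tolerate both result shapes. It assumes rows arrive as plain tuples, and the function name table_names and the sample rows are made up for illustration; this is not how PyHive handles it.

```python
from typing import List, Sequence


def table_names(rows: Sequence[Sequence[object]]) -> List[str]:
    """Extract table names from SHOW TABLES rows of either shape."""
    names = []
    for row in rows:
        if len(row) == 1:
            # Hive-style output: a single column holding the table name.
            names.append(str(row[0]))
        elif len(row) == 3:
            # Thrift Server output as described above:
            # (schema name, table name, temporary table flag).
            names.append(str(row[1]))
        else:
            raise ValueError(f"unexpected SHOW TABLES row shape: {row!r}")
    return names


# Hypothetical sample rows for each server.
hive_rows = [("orders",), ("customers",)]
spark_rows = [("default", "orders", False), ("default", "customers", False)]

print(table_names(hive_rows))
print(table_names(spark_rows))
```

A client that only expects the Hive shape would read the first column and get the schema name instead of the table name, which is the kind of breakage I mean.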

From the Spark docs [2]: "The Thrift JDBC/ODBC server implemented here corresponds to the HiveServer2 in built-in Hive." Given that statement, SHOW TABLES has a compatibility issue, because its output breaks libraries like PyHive that expect the Hive format.

Now my questions:

1) Is it expected for Thrift Server to be 100% Hive compatible?
2) If the answer to the previous question is yes, is this a bug in Spark?
3) What problems could it cause for Spark if we made SHOW TABLES return what Hive returns, and made Thrift Server resolve a SHOW TABLES EXTENDED statement to what Spark SQL returns today?