[Spark SQL]: Is the namespace name always needed in a query for tables from a user-defined catalog plugin?

xufei

Hi,

I'm trying to write a catalog plugin based on spark-3.0-preview, and I found that even when I use 'use catalog.namespace' to set the current catalog and namespace, I still need to use a qualified name in the query.

For example, I added a catalog named 'example_catalog'. It contains a database named 'test', and 'example_catalog.test' contains a table 't'. Under the default catalog (spark_catalog), I can query the table using 'select * from example_catalog.test.t'. After I run 'use example_catalog.test' to change the current catalog to 'example_catalog' and the current namespace to 'test', I can query the table using 'select * from test.t', but 'select * from t' fails with a table_not_found exception.

Is this the expected behavior? If so, it seems a little odd, since I would expect that after 'use example_catalog.test', all unqualified identifiers are interpreted as 'example_catalog.test.identifier'.
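
The resolution rule expected here can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's actual resolver: the `Session` class, the `CATALOGS` dictionary, and the 'default' namespace under spark_catalog are all assumptions made for illustration. It shows an unqualified name being completed with the current catalog and the current namespace.

```python
from typing import Dict, List, Tuple

# Toy data only (an assumption for illustration): catalog -> namespace -> tables.
CATALOGS: Dict[str, Dict[Tuple[str, ...], List[str]]] = {
    "spark_catalog": {("default",): []},
    "example_catalog": {("test",): ["t"]},
}


class Session:
    """Hypothetical stand-in for a session's current catalog and namespace."""

    def __init__(self) -> None:
        self.current_catalog = "spark_catalog"
        self.current_namespace: Tuple[str, ...] = ("default",)

    def use(self, name: str) -> None:
        """Model 'use catalog.namespace'."""
        parts = name.split(".")
        catalog, namespace = parts[0], tuple(parts[1:])
        if namespace not in CATALOGS.get(catalog, {}):
            raise LookupError(f"namespace not found: {name}")
        self.current_catalog = catalog
        self.current_namespace = namespace

    def resolve(self, name: str) -> Tuple[str, ...]:
        """Expand a possibly unqualified table name to catalog.namespace.table."""
        parts = tuple(name.split("."))
        # A leading part is treated as a catalog only if it names one
        # and something follows it.
        if len(parts) > 1 and parts[0] in CATALOGS:
            catalog, rest = parts[0], parts[1:]
        else:
            catalog, rest = self.current_catalog, parts
        if len(rest) == 1:
            # Unqualified name: fall back to the current namespace.
            namespace, table = self.current_namespace, rest[0]
        else:
            namespace, table = rest[:-1], rest[-1]
        if table not in CATALOGS[catalog].get(namespace, []):
            raise LookupError(f"table not found: {name}")
        return (catalog, *namespace, table)


session = Session()
session.use("example_catalog.test")
# Under this rule, 'test.t' and 't' name the same table after the 'use'.
print(session.resolve("test.t"))
print(session.resolve("t"))
```

Under this model, 'select * from t' would find the table after 'use example_catalog.test', which is the behavior the question expects rather than the behavior observed.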

Attached is a test file that you can use to reproduce the problem.

Thanks.



---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]

Attachment: DataSourceV2ExplainSuite.scala (4K)

Re: [Spark SQL]: Is the namespace name always needed in a query for tables from a user-defined catalog plugin?

imback82
Hi Xufei,
I noticed the same thing while looking into the relation resolution behavior (see Appendix A in this doc). I created SPARK-30094 and will follow up.

Thanks,
Terry

On Sun, Dec 1, 2019 at 7:12 PM xufei <[hidden email]> wrote:


Re: [Spark SQL]: Is the namespace name always needed in a query for tables from a user-defined catalog plugin?

xufei
Thanks, Terry. Glad to know that this is not the expected behavior.

On Mon, Dec 2, 2019 at 11:51 AM Terry Kim <[hidden email]> wrote: