Schema store for Parquet

Schema store for Parquet

Ruijing Li
Hi all,

Has anyone explored ways to centrally store the schemas of different Parquet files? I know there is schema management for Avro, but I couldn’t find solutions for Parquet schema management. Thanks!
--
Cheers,
Ruijing Li

Re: Schema store for Parquet

Molotch
Try searching for "Hive metastore".

On Wed, Mar 4, 2020 at 7:29 PM Ruijing Li <[hidden email]> wrote:
Hi all,

Has anyone explored efforts to have a centralized storage of schemas of different parquet files? I know there is schema management for Avro, but couldn’t find solutions for parquet schema management. Thanks!
--
Cheers,
Ruijing Li

Re: Schema store for Parquet

lucas.gary@gmail.com
Or the AWS Glue Data Catalog if you're on AWS.

On Wed, 4 Mar 2020 at 10:35, Magnus Nilsson <[hidden email]> wrote:
Google hive metastore.


Re: Schema store for Parquet

Ruijing Li
Thanks Lucas and Magnus,

Are there any open source solutions other than the Apache Hive metastore, if we don’t wish to use Apache Hive and Spark?

Thanks.

On Wed, Mar 4, 2020 at 10:40 AM [hidden email] <[hidden email]> wrote:
Or AWS glue catalog if you're in AWS

--
Cheers,
Ruijing Li

Re: Schema store for Parquet

Molotch
Apache Atlas is the Apache data catalog. You may want to look into that. It depends on what your use case is.

On Wed, Mar 4, 2020 at 8:01 PM Ruijing Li <[hidden email]> wrote:
Thanks Lucas and Magnus,

Would there be any open source solutions other than Apache Hive metastore, if we don’t wish to use Apache Hive and spark?

Thanks.


Re: Schema store for Parquet

Ruijing Li
Thanks Magnus, 

I’ll explore Atlas and see what I can find. 

On Wed, Mar 4, 2020 at 11:10 AM Magnus Nilsson <[hidden email]> wrote:
Apache Atlas is the apache data catalog. Maybe want to look into that. It depends on what your use case is.

--
Cheers,
Ruijing Li
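
For readers who want a feel for what a centralized schema store involves before adopting one of the tools suggested above, here is a minimal sketch in Python using only the standard library. It is not how Hive metastore, AWS Glue, or Apache Atlas work internally. The `SchemaStore` class, its table layout, and the `fingerprint` helper are all hypothetical names made up for illustration. It also assumes each schema has already been extracted as an ordered list of (column name, type name) pairs, for example from the Parquet file footer with a reader such as `pyarrow.parquet.read_schema`; that extraction step is not shown.

```python
import hashlib
import json
import sqlite3


def fingerprint(fields):
    """Return a stable hash of an ordered list of [column_name, type_name] pairs."""
    canonical = json.dumps(fields, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SchemaStore:
    """Hypothetical minimal registry: one row per (dataset, version)."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the example self-contained; a shared database
        # file or server would be needed for a truly central store.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS schemas ("
            " dataset TEXT NOT NULL,"
            " version INTEGER NOT NULL,"
            " fingerprint TEXT NOT NULL,"
            " fields TEXT NOT NULL,"
            " PRIMARY KEY (dataset, version))"
        )

    def register(self, dataset, fields):
        """Store the schema if it is new for this dataset; return its version."""
        fields = [list(f) for f in fields]
        fp = fingerprint(fields)
        row = self.conn.execute(
            "SELECT version FROM schemas WHERE dataset = ? AND fingerprint = ?",
            (dataset, fp),
        ).fetchone()
        if row is not None:
            return row[0]  # identical schema already registered
        (latest,) = self.conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM schemas WHERE dataset = ?",
            (dataset,),
        ).fetchone()
        version = latest + 1
        self.conn.execute(
            "INSERT INTO schemas VALUES (?, ?, ?, ?)",
            (dataset, version, fp, json.dumps(fields)),
        )
        self.conn.commit()
        return version

    def latest(self, dataset):
        """Return (version, fields) for the newest schema, or None if unknown."""
        row = self.conn.execute(
            "SELECT version, fields FROM schemas"
            " WHERE dataset = ? ORDER BY version DESC LIMIT 1",
            (dataset,),
        ).fetchone()
        if row is None:
            return None
        return row[0], [tuple(f) for f in json.loads(row[1])]


if __name__ == "__main__":
    store = SchemaStore()
    store.register("events", [("id", "int64"), ("ts", "timestamp[us]")])
    store.register("events", [("id", "int64"), ("ts", "timestamp[us]"), ("user", "string")])
    print(store.latest("events"))
```

Registering the same schema twice returns the existing version instead of creating a new one, so the store records only actual schema changes. A real catalog adds much more (partitions, locations, compatibility checks, access control), which is why the tools mentioned in the thread are usually the better starting point.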