How to apply ranger policies on Spark


How to apply ranger policies on Spark

joyan sil

Hi,

We have Ranger policies defined on Hive tables, and authorization works as expected when we use the Hive CLI and Beeline. But when we access those tables through spark-shell or spark-submit, the policies are not enforced.

Any suggestions to make Ranger work with Spark?


Regards

Joyan


Re: How to apply ranger policies on Spark

Dennis Suhari
Hi Joyan,

Spark uses its own metastore by default. To work with Ranger you need to go through the Hive Metastore: point Spark at the Hive Metastore and use a HiveContext (or a Hive-enabled SparkSession) in your Spark code.
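
For example, a minimal sketch of that setup (the metastore host, database and table names below are placeholders, not taken from this thread):

import org.apache.spark.sql.SparkSession

// Point Spark at the shared Hive Metastore instead of its own embedded catalog,
// and enable Hive support (the SparkSession replacement for the old HiveContext).
val spark = SparkSession.builder()
  .appName("hive-metastore-example")
  .config("hive.metastore.uris", "thrift://metastore-host:9083") // placeholder host
  .enableHiveSupport()
  .getOrCreate()

// Queries now resolve tables against the Hive Metastore.
spark.sql("SELECT * FROM mydb.mytable").show()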

Br,

Dennis

Sent from my iPhone


Re: How to apply ranger policies on Spark

ayan guha
AFAIK, Ranger secures the Hive (JDBC) server only. Unfortunately, Spark does not interact with HiveServer2; it talks directly to the Metastore. Hence, the only way to use Ranger's Hive policies is to access Hive via JDBC. Another option is HDFS or storage ACLs, which give coarse-grained control at the file-path level. You can use Ranger to manage HDFS ACLs as well, in which case Spark will be bound by those policies.
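
As a rough sketch of the JDBC route (host, port, credentials and table name are placeholders; it assumes the Hive JDBC driver, org.apache.hive:hive-jdbc, is on the Spark classpath):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hive-over-jdbc").getOrCreate()

// Go through HiveServer2 so Ranger evaluates its Hive policies for the
// authenticated user, instead of Spark reading the table files directly.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:hive2://hs2-host:10000/default")  // placeholder HiveServer2 endpoint
  .option("driver", "org.apache.hive.jdbc.HiveDriver")
  .option("dbtable", "mydb.mytable")                     // placeholder table
  .option("user", "some_user")                           // policies apply to this principal
  .option("password", "********")
  .load()

df.show()

Spark's generic JDBC source and the Hive JDBC driver have some known quirks (for example, result columns coming back prefixed with the table name), so treat this as a sketch rather than a drop-in solution.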

--
Best Regards,
Ayan Guha

Re: How to apply ranger policies on Spark

joyan sil
Thanks Ayan and Dennis,

@Ayan: if I use Ranger to manage HDFS ACLs, as you mentioned, that only gives coarse-grained control over files. I have a few fine-grained use cases at row/column level.
I was going through the below JIRAs and was wondering whether anyone has used them, and whether any user documentation for them exists in the Spark community.


Regards
Joyan
