Security in pyspark using extensions

Maximiliano Patricio Méndez
Hi,

I'm trying to build an authorization/security extension for Spark using the hooks provided in SPARK-18127 (https://issues.apache.org/jira/browse/SPARK-18127).

The problem I've encountered is that those hooks aren't available in pyspark. The extensions are loaded in the getOrCreate method of SparkSession (https://github.com/apache/spark/blob/v2.3.1/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L937-L953), but the SparkSession used by pyspark is created through the constructor via py4j, which initializes the extensions object yet never applies the configured values to it.
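To make the gap concrete, here is a small plain-Python sketch of the two construction paths described above (no Spark required; all class and function names here are invented for illustration, only the "spark.sql.extensions" config key is real): a getOrCreate-style path that reads the configured extensions, and a bare-constructor path like the one pyspark takes through py4j.

```python
# Illustrative model only -- NOT Spark code. It mimics the two ways a
# session can be built, to show why configured extensions are lost on
# the constructor path.

class Extensions:
    """Stands in for SparkSessionExtensions: collects injected rules."""
    def __init__(self):
        self.rules = []

    def inject_rule(self, rule):
        self.rules.append(rule)


class Session:
    """Stands in for SparkSession."""
    def __init__(self, extensions=None):
        # Constructor path (what pyspark hits via py4j): a fresh
        # extensions object is created, but nothing from the config
        # is ever applied to it.
        self.extensions = extensions or Extensions()

    @classmethod
    def get_or_create(cls, conf):
        # getOrCreate path (Scala): look up the extension configured
        # under "spark.sql.extensions" and let it mutate the
        # extensions object before the session is built.
        ext = Extensions()
        factory = conf.get("spark.sql.extensions")
        if factory is not None:
            factory(ext)
        return cls(ext)


def authz_extension(ext):
    # Hypothetical security extension: injects one placeholder rule.
    ext.inject_rule("deny-unauthorized-reads")


scala_path = Session.get_or_create({"spark.sql.extensions": authz_extension})
pyspark_path = Session()  # constructor path: the extension is never applied

# scala_path.extensions.rules   -> one injected rule
# pyspark_path.extensions.rules -> empty
```

The fix being proposed is essentially to make the constructor path apply the configured values too, so both sessions end up with the same injected rules.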


(I'm pointing to the latest stable release (2.3.1) but this is also the case in master at the moment.)

I haven't found a jira issue or any discussion on the mailing list about this, but I wanted to check whether this is something we could hopefully get into newer releases, to keep the extensions feature consistent across python and scala.

Do you think this is worth a jira issue? I can provide a very small patch if that helps.

Thanks,
Maximiliano