Customizing Spark ThriftServer

Customizing Spark ThriftServer

Soheil Pourbafrani
Hi, I want to create a Thrift server that has some Hive tables predefined and listens on a port for user queries. Here is my code:
val spark = SparkSession.builder()
  .config("hive.server2.thrift.port", "10000")
  .config("spark.sql.hive.thriftServer.singleSession", "true")
  .config("spark.cassandra.connection.host", "Z, X")
  .config("spark.cassandra.auth.username", "A")
  .config("spark.cassandra.auth.password", "B")
  .appName("ThriftServer")
  .getOrCreate()

val table = spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "N", "keyspace" -> "M"))
  .load()

table.createOrReplaceTempView("N")

The problem is that when I submit the job to the cluster, it just runs and terminates, while I expect it to keep listening on port 10000. I guess I missed something in the code.

The second question is that the ThriftServer cannot be submitted in cluster mode, so the client node could be a single point of failure! Is there any solution for this?

Thanks

Re: Customizing Spark ThriftServer

Jiaan Geng
The Spark ThriftServer is itself a Spark application that hosts a Thrift server; your code is just a regular custom Spark application, so it terminates once main() returns.
If you need custom functionality beyond the stock Spark ThriftServer, you can have your Spark application start HiveThriftServer2 itself.
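
For example, here is a minimal sketch of that approach (assuming Spark 2.x with the spark-hive and spark-hive-thriftserver modules on the classpath; the Cassandra settings and the N/M placeholders are carried over from your code, and the object name is only illustrative):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

object EmbeddedThriftServer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ThriftServer")
      .config("hive.server2.thrift.port", "10000")
      .config("spark.sql.hive.thriftServer.singleSession", "true")
      .config("spark.cassandra.connection.host", "Z, X")
      .config("spark.cassandra.auth.username", "A")
      .config("spark.cassandra.auth.password", "B")
      .enableHiveSupport() // enable Hive support for the embedded Thrift server
      .getOrCreate()

    // Register the Cassandra table as a temp view, as in your code
    spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "N", "keyspace" -> "M"))
      .load()
      .createOrReplaceTempView("N")

    // Start the Thrift/JDBC server inside this application; with
    // singleSession enabled, JDBC clients should see the temp view above
    HiveThriftServer2.startWithContext(spark.sqlContext)

    // Block the main thread so the driver stays up and keeps listening on port 10000
    Thread.currentThread().join()
  }
}

Clients can then connect with beeline, e.g. beeline -u jdbc:hive2://<driver-host>:10000.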
