Connection to Presto via Spark

Vineet Mishra
Hi,

I am trying to connect to Presto from the Spark shell using the following connection string, but I end up with an exception:

-bash-4.2$ spark-shell  --driver-class-path com.facebook.presto.jdbc.PrestoDriver  --jars presto-jdbc-0.221.jar

scala> val presto_df = sqlContext.read.format("jdbc").option("url", "jdbc:presto://presto-prd.url.com:8443/hive/xyz").option("dbtable","testTable").option("driver","com.facebook.presto.jdbc.PrestoDriver").load()
java.sql.SQLException: Unrecognized connection property 'url'
at com.facebook.presto.jdbc.PrestoDriverUri.validateConnectionProperties(PrestoDriverUri.java:316)
at com.facebook.presto.jdbc.PrestoDriverUri.<init>(PrestoDriverUri.java:95)
at com.facebook.presto.jdbc.PrestoDriverUri.<init>(PrestoDriverUri.java:85)
at com.facebook.presto.jdbc.PrestoDriver.connect(PrestoDriver.java:87)
at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:61)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)

After changing the url option to uri in the above string, I get the following exception:

scala> val presto_df = sqlContext.read.format("jdbc").option("uri", "jdbc:presto://presto-prd.url.com:8443/hive/xyz").option("dbtable","testTable").option("driver","com.facebook.presto.jdbc.PrestoDriver").load()
java.lang.RuntimeException: Option 'url' not specified
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource$$anonfun$1.apply(DefaultSource.scala:33)
at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource$$anonfun$1.apply(DefaultSource.scala:33)
at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.getOrElse(ddl.scala:150)
at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:33)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
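
One reading of the two traces above: the second exception is raised by Spark's own DefaultSource, so Spark does require an option named url. The first exception is raised inside the Presto driver (PrestoDriverUri.validateConnectionProperties) after Spark's createConnectionFactory calls connect, which suggests Spark forwarded its own options (url, dbtable, driver) to the driver as connection properties, and the Presto driver rejects keys it does not recognize. The sketch below separates the two kinds of settings. The helper name driverProperties, the sparkOnly key list, and the placeholder user value are assumptions made for illustration and are not from the original post. Whether this avoids the error depends on the Spark version in use.

```scala
import java.util.Properties

// Options exactly as written in the post; Spark's JDBC source reads these itself.
val sparkOptions = Map(
  "url"     -> "jdbc:presto://presto-prd.url.com:8443/hive/xyz",
  "dbtable" -> "testTable",
  "driver"  -> "com.facebook.presto.jdbc.PrestoDriver"
)

// Hypothetical helper: keep only the keys intended for the JDBC driver and
// drop the ones Spark consumes (assumed list, not exhaustive).
def driverProperties(options: Map[String, String],
                     extra: Map[String, String]): Properties = {
  val sparkOnly = Set("url", "dbtable", "driver")
  val props = new Properties()
  (options ++ extra).foreach { case (k, v) =>
    if (!sparkOnly.contains(k)) props.setProperty(k, v)
  }
  props
}

// "placeholder_user" is a made-up value; replace it with a real Presto user.
val props = driverProperties(sparkOptions, Map("user" -> "placeholder_user"))

// In spark-shell, the URL and table name can then be passed as arguments
// instead of options, so they are not part of the Properties object:
//   val presto_df = sqlContext.read.jdbc(sparkOptions("url"), sparkOptions("dbtable"), props)
```

With this form the driver class is not named explicitly, so the Presto JDBC jar has to be visible to the driver JVM. Note that --driver-class-path normally takes class path entries (such as the path to presto-jdbc-0.221.jar) and not a class name, so the spark-shell invocation shown earlier may also need adjusting.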

Any help would be appreciated!

Thanks!
VM

Re: Connection to Presto via Spark

Gourav Sengupta
Terribly fascinating. Any insights into why we are not trying to use Spark itself?
Regards
Gourav

On Wed, 13 Jan 2021, 12:46 Vineet Mishra, <[hidden email]> wrote: