how to integrate hbase and hive in spark3.0.1?

李继先
Hello,
  I am using Spark 3.0.1 and want to integrate Hive and HBase, but I don't know which Hive and HBase versions to choose. I recompiled the Spark source and installed Spark 3.0.1 with Hive and Hadoop support, but I ran into the error below. Can anyone help?
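For reference, the dim_vip table queried below is a Hive external table backed by HBase. A minimal sketch of how such a table is typically declared in Hive (the column names and HBase mapping here are illustrative, not the actual dim_vip schema):

CREATE EXTERNAL TABLE dim_vip (
  rowkey STRING,   -- mapped to the HBase row key
  name   STRING    -- mapped to a column in a hypothetical 'info' column family
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
TBLPROPERTIES ("hbase.table.name" = "dim_vip");

Reading such a table from spark-sql typically requires the hive-hbase-handler and HBase client jars, as well as the HBase configuration (hbase-site.xml), to be visible to Spark; the ADD JAR statements in the log below attempt to provide the jars.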

[root@namenode bin]# ./spark-sql
20/09/18 23:31:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/18 23:31:46 INFO HiveConf: Found configuration file file:/usr/local/spark-3.0.1/conf/hive-site.xml
20/09/18 23:31:47 INFO SharedState: loading hive config file: file:/usr/local/spark-3.0.1/conf/hive-site.xml
20/09/18 23:31:47 INFO SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir ('/dev/hive/warehouse').
20/09/18 23:31:47 INFO SharedState: Warehouse path is '/dev/hive/warehouse'.
20/09/18 23:31:48 INFO SessionState: Created HDFS directory: /tmp/hive/root/31bd3857-77de-4a76-95f2-76ac7e0f20b4
20/09/18 23:31:48 INFO SessionState: Created local directory: /tmp/root/31bd3857-77de-4a76-95f2-76ac7e0f20b4
20/09/18 23:31:48 INFO SessionState: Created HDFS directory: /tmp/hive/root/31bd3857-77de-4a76-95f2-76ac7e0f20b4/_tmp_space.db
20/09/18 23:31:48 INFO SparkContext: Running Spark version 3.0.1
20/09/18 23:31:48 INFO ResourceUtils: ==============================================================
20/09/18 23:31:48 INFO ResourceUtils: Resources for spark.driver:

20/09/18 23:31:48 INFO ResourceUtils: ==============================================================
20/09/18 23:31:48 INFO SparkContext: Submitted application: SparkSQL::200.31.1.176
20/09/18 23:31:48 INFO SecurityManager: Changing view acls to: root
20/09/18 23:31:48 INFO SecurityManager: Changing modify acls to: root
20/09/18 23:31:48 INFO SecurityManager: Changing view acls groups to: 
20/09/18 23:31:48 INFO SecurityManager: Changing modify acls groups to: 
20/09/18 23:31:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
20/09/18 23:31:48 INFO Utils: Successfully started service 'sparkDriver' on port 44755.
20/09/18 23:31:48 INFO SparkEnv: Registering MapOutputTracker
20/09/18 23:31:49 INFO SparkEnv: Registering BlockManagerMaster
20/09/18 23:31:49 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/09/18 23:31:49 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/09/18 23:31:49 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
20/09/18 23:31:49 INFO DiskBlockManager: Created local directory at /usr/local/spark-3.0.1/blockmgr-0bfa59f0-2688-4aab-a0ae-731ad9112d3f
20/09/18 23:31:49 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
20/09/18 23:31:49 INFO SparkEnv: Registering OutputCommitCoordinator
20/09/18 23:31:49 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/09/18 23:31:49 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://namenode:4040
20/09/18 23:31:49 INFO Executor: Starting executor ID driver on host namenode
20/09/18 23:31:49 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46040.
20/09/18 23:31:49 INFO NettyBlockTransferService: Server created on namenode:46040
20/09/18 23:31:49 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
20/09/18 23:31:49 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, namenode, 46040, None)
20/09/18 23:31:49 INFO BlockManagerMasterEndpoint: Registering block manager namenode:46040 with 366.3 MiB RAM, BlockManagerId(driver, namenode, 46040, None)
20/09/18 23:31:49 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, namenode, 46040, None)
20/09/18 23:31:49 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, namenode, 46040, None)
20/09/18 23:31:50 INFO SharedState: loading hive config file: file:/usr/local/spark-3.0.1/conf/hive-site.xml
20/09/18 23:31:50 INFO SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir ('/dev/hive/warehouse').
20/09/18 23:31:50 INFO SharedState: Warehouse path is '/dev/hive/warehouse'.
20/09/18 23:31:51 INFO HiveUtils: Initializing HiveMetastoreConnection version 2.3.7 using Spark classes.
20/09/18 23:31:51 INFO HiveClientImpl: Warehouse location for Hive client (version 2.3.7) is /dev/hive/warehouse
20/09/18 23:31:52 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
20/09/18 23:31:52 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
20/09/18 23:31:52 INFO HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
20/09/18 23:31:52 INFO ObjectStore: ObjectStore, initialize called
20/09/18 23:31:52 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
20/09/18 23:31:52 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
20/09/18 23:31:54 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
20/09/18 23:31:57 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
20/09/18 23:31:57 INFO ObjectStore: Initialized ObjectStore
20/09/18 23:31:57 INFO HiveMetaStore: Added admin role in metastore
20/09/18 23:31:57 INFO HiveMetaStore: Added public role in metastore
20/09/18 23:31:57 INFO HiveMetaStore: No user is added in admin role, since config is empty
20/09/18 23:31:57 INFO HiveMetaStore: 0: get_all_functions
20/09/18 23:31:57 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_functions
20/09/18 23:31:58 INFO HiveMetaStore: 0: get_database: default
20/09/18 23:31:58 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_database: default
ADD JAR file:///usr/local/hive-3.1.2/lib/hive-hbase-handler-3.1.2.jar
20/09/18 23:31:58 INFO SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=31bd3857-77de-4a76-95f2-76ac7e0f20b4, clientType=HIVECLI]
20/09/18 23:31:58 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
20/09/18 23:31:58 INFO metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
20/09/18 23:31:58 INFO HiveMetaStore: 0: Cleaning up thread local RawStore...
20/09/18 23:31:58 INFO audit: ugi=root ip=unknown-ip-addr cmd=Cleaning up thread local RawStore...
20/09/18 23:31:58 INFO HiveMetaStore: 0: Done cleaning up thread local RawStore
20/09/18 23:31:58 INFO audit: ugi=root ip=unknown-ip-addr cmd=Done cleaning up thread local RawStore
Added [file:///usr/local/hive-3.1.2/lib/hive-hbase-handler-3.1.2.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/hive-hbase-handler-3.1.2.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/hive-hbase-handler-3.1.2.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/hive-hbase-handler-3.1.2.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/hive-hbase-handler-3.1.2.jar at spark://namenode:44755/jars/hive-hbase-handler-3.1.2.jar with timestamp 1600443118245
ADD JAR file:///usr/local/hive-3.1.2/lib/hbase-protocol-2.3.1.jar
Added [file:///usr/local/hive-3.1.2/lib/hbase-protocol-2.3.1.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/hbase-protocol-2.3.1.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-protocol-2.3.1.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-protocol-2.3.1.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/hbase-protocol-2.3.1.jar at spark://namenode:44755/jars/hbase-protocol-2.3.1.jar with timestamp 1600443118254
ADD JAR file:///usr/local/hive-3.1.2/lib/hbase-server-2.3.1.jar
Added [file:///usr/local/hive-3.1.2/lib/hbase-server-2.3.1.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/hbase-server-2.3.1.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-server-2.3.1.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-server-2.3.1.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/hbase-server-2.3.1.jar at spark://namenode:44755/jars/hbase-server-2.3.1.jar with timestamp 1600443118261
ADD JAR file:///usr/local/hive-3.1.2/lib/hbase-client-2.3.1.jar
Added [file:///usr/local/hive-3.1.2/lib/hbase-client-2.3.1.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/hbase-client-2.3.1.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-client-2.3.1.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-client-2.3.1.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/hbase-client-2.3.1.jar at spark://namenode:44755/jars/hbase-client-2.3.1.jar with timestamp 1600443118267
ADD JAR file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1.jar
Added [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1.jar at spark://namenode:44755/jars/hbase-common-2.3.1.jar with timestamp 1600443118273
ADD JAR file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1-tests.jar
Added [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1-tests.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1-tests.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1-tests.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1-tests.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/hbase-common-2.3.1-tests.jar at spark://namenode:44755/jars/hbase-common-2.3.1-tests.jar with timestamp 1600443118278
ADD JAR file:///usr/local/hive-3.1.2/lib/zookeeper-3.4.6.jar
Added [file:///usr/local/hive-3.1.2/lib/zookeeper-3.4.6.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/zookeeper-3.4.6.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/zookeeper-3.4.6.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/zookeeper-3.4.6.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/zookeeper-3.4.6.jar at spark://namenode:44755/jars/zookeeper-3.4.6.jar with timestamp 1600443118284
ADD JAR file:///usr/local/hive-3.1.2/lib/guava-27.0-jre.jar
Added [file:///usr/local/hive-3.1.2/lib/guava-27.0-jre.jar] to class path
20/09/18 23:31:58 INFO SessionState: Added [file:///usr/local/hive-3.1.2/lib/guava-27.0-jre.jar] to class path
Added resources: [file:///usr/local/hive-3.1.2/lib/guava-27.0-jre.jar]
20/09/18 23:31:58 INFO SessionState: Added resources: [file:///usr/local/hive-3.1.2/lib/guava-27.0-jre.jar]
20/09/18 23:31:58 INFO SparkContext: Added JAR file:///usr/local/hive-3.1.2/lib/guava-27.0-jre.jar at spark://namenode:44755/jars/guava-27.0-jre.jar with timestamp 1600443118289
Spark master: local[*], Application Id: local-1600443109751
20/09/18 23:31:58 INFO SparkSQLCLIDriver: Spark master: local[*], Application Id: local-1600443109751
spark-sql> select * from dim_vip;
20/09/18 23:32:43 INFO HiveMetaStore: 0: get_database: default
20/09/18 23:32:43 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_database: default
20/09/18 23:32:43 WARN HiveConf: HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
20/09/18 23:32:43 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
20/09/18 23:32:43 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
20/09/18 23:32:43 INFO HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
20/09/18 23:32:43 INFO ObjectStore: ObjectStore, initialize called
20/09/18 23:32:43 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
20/09/18 23:32:43 INFO ObjectStore: Initialized ObjectStore
20/09/18 23:32:43 INFO HiveMetaStore: 0: get_table : db=default tbl=dim_vip
20/09/18 23:32:43 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dim_vip
20/09/18 23:32:43 INFO HiveMetaStore: 0: get_table : db=default tbl=dim_vip
20/09/18 23:32:43 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=dim_vip
20/09/18 23:32:47 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 321.0 KiB, free 366.0 MiB)
20/09/18 23:32:47 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 29.3 KiB, free 366.0 MiB)
20/09/18 23:32:47 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on namenode:46040 (size: 29.3 KiB, free: 366.3 MiB)
20/09/18 23:32:47 INFO SparkContext: Created broadcast 0 from 
20/09/18 23:32:47 INFO HBaseStorageHandler: Configuring input job properties
20/09/18 23:32:47 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 428.9 KiB, free 365.5 MiB)
20/09/18 23:32:47 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 37.2 KiB, free 365.5 MiB)
20/09/18 23:32:47 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on namenode:46040 (size: 37.2 KiB, free: 366.2 MiB)
20/09/18 23:32:47 INFO SparkContext: Created broadcast 1 from 
20/09/18 23:32:48 ERROR SparkSQLDriver: Failed in [select * from dim_vip]
java.io.IOException: Cannot create a record reader because of a previous error. Please look at the previous logs lines from the task's full log for more details.
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:253)
at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:131)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:276)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:272)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2164)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1004)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:388)
at org.apache.spark.rdd.RDD.collect(RDD.scala:1003)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:385)
at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:412)
at org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:58)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:377)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:496)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:490)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:282)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalStateException: The input format instance has not been properly initialized. Ensure you call initializeTable either in your constructor or initialize method
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getTable(TableInputFormatBase.java:555)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:248)
... 59 more