sparksql exception when using regexp_replace


付涛
Hi Spark users,
     I am using Spark SQL to insert some values into a directory; the SQL
looks like this:
     
     insert overwrite directory '/temp/test_spark'
     ROW FORMAT DELIMITED FIELDS TERMINATED BY '~'
     select regexp_replace('a~b~c', '~', ''), 123456

     However, an exception was thrown:
     
     Caused by: org.apache.hadoop.hive.serde2.SerDeException:
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 4 elements
while columns.types has 2 elements!
        at
org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters.extractColumnInfo(LazySerDeParameters.java:163)
        at
org.apache.hadoop.hive.serde2.lazy.LazySerDeParameters.&lt;init&gt;(LazySerDeParameters.java:90)
        at
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.initialize(LazySimpleSerDe.java:116)
        at
org.apache.spark.sql.hive.execution.HiveOutputWriter.&lt;init&gt;(HiveFileFormat.scala:119)
        at
org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103)
        at
org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:367)
        at
org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:378)
        at
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:269)
        at
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:267)
        at
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1414)
        at
org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
        ... 8 more

       The Hive version used is 2.0.1.

       When I add an alias to regexp_replace, the SQL succeeds:
       
       insert overwrite directory '/temp/test_spark'
       ROW FORMAT DELIMITED FIELDS TERMINATED BY '~'
       select regexp_replace('a~b~c', '~', '') as kv, 123456
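A plausible explanation for why the alias helps (this is my assumption, not verified against the Hive source): without an alias, Spark derives a default column name from the full expression, something like "regexp_replace(a~b~c, ~, )", and that generated name contains commas. Hive's LazySimpleSerDe recovers column names by splitting the comma-separated "columns" table property, so the embedded commas inflate the count. A minimal sketch of that splitting behavior:

```python
# Hypothetical default column names; the first mimics the name Spark
# might auto-generate from the expression (an assumption for illustration).
default_names = ["regexp_replace(a~b~c, ~, )", "123456"]

# The SerDe sees the names joined into one comma-separated property string...
columns_property = ",".join(default_names)

# ...and splits on commas to recover them, which over-splits the first name:
recovered = columns_property.split(",")
print(len(recovered))  # 4 pieces for 2 real columns -> "columns has 4 elements
                       # while columns.types has 2 elements!"
```

An explicit alias ("as kv") replaces the expression-derived name with a comma-free one, so the counts line up again.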



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
