Need help with String Concat Operation

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Need help with String Concat Operation

Debabrata Ghosh
Hi,
             I am having a dataframe column (name of the column is CTOFF) and I intend to prefix with '0' in case the length of the column is 3. Unfortunately, I am unable to acheive my goal and wonder whether you can help me here.

Command which I am executing:

ctoff_dedup_prep_temp = ctoff_df.withColumn('CTOFF_NEW',when(length(col('CTOFF')) == 3,'0'+col('CTOFF')))
ctoff_dedup_prep_temp.show()

+------------+----------+----------+--------------------+---------+-----+---------+
|EVNT_SRVC_AR|EVNT_FCLTY|EVNT_TP_CD|        NTWRK_PRD_CD| DY_OF_WK|CTOFF|CTOFF_NEW|
+------------+----------+----------+--------------------+---------+-----+---------+
|         HKG|       HKC|        AR|2,3,7,8,C,D,J,P,Q...|1,2,3,4,5| 1440|     null|
|         HKG|       HKC|        AR|             C,Q,T,Y|1,2,3,4,5|  730|    730.0|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  600|    600.0|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  900|    900.0|
+------------+----------+----------+--------------------+---------+-----+---------+

The result which I want is:
+------------+----------+----------+--------------------+---------+-----+---------+
|EVNT_SRVC_AR|EVNT_FCLTY|EVNT_TP_CD|        NTWRK_PRD_CD| DY_OF_WK|CTOFF|CTOFF_NEW|
+------------+----------+----------+--------------------+---------+-----+---------+
|         HKG|       HKC|        AR|2,3,7,8,C,D,J,P,Q...|1,2,3,4,5| 1440|    1440|
|         HKG|       HKC|        AR|             C,Q,T,Y|1,2,3,4,5|  730|    0730|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  600|    0600|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  900|    0900|
+------------+----------+----------+--------------------+---------+-----+---------+

So I want the '0' to be prefixed but it's getting suffixed as '.0'. Any clue around why is this happening

Thanks,

Debu
Reply | Threaded
Open this post in threaded view
|

Re: Need help with String Concat Operation

高佳翔

Hi Debu,

First, Instead of using ‘+’, you can use ‘concat’ to concatenate string columns. And you should enclose “0” with "lit()" to make it a column.
Second, 1440 become null because you didn’t tell spark what to do if the when clause is failed. So it simply set the value to null. To fix this, you should add “.otherwise()” right behind “when()”.

The code looks like this:

ctoff_df.withColumn("CTOFF_NEW", 
  when(
    length(col("CTOFF")) ==3,  
    concat(lit("0"), col("CTOFF"))
  ).otherwise(
    col("CTOFF")
  ))

Best,

JiaXiang


On Wed, Oct 18, 2017 at 2:17 PM, Debabrata Ghosh <[hidden email]> wrote:
Hi,
             I am having a dataframe column (name of the column is CTOFF) and I intend to prefix with '0' in case the length of the column is 3. Unfortunately, I am unable to acheive my goal and wonder whether you can help me here.

Command which I am executing:

ctoff_dedup_prep_temp = ctoff_df.withColumn('CTOFF_NEW',when(length(col('CTOFF')) == 3,'0'+col('CTOFF')))
ctoff_dedup_prep_temp.show()

+------------+----------+----------+--------------------+---------+-----+---------+
|EVNT_SRVC_AR|EVNT_FCLTY|EVNT_TP_CD|        NTWRK_PRD_CD| DY_OF_WK|CTOFF|CTOFF_NEW|
+------------+----------+----------+--------------------+---------+-----+---------+
|         HKG|       HKC|        AR|2,3,7,8,C,D,J,P,Q...|1,2,3,4,5| 1440|     null|
|         HKG|       HKC|        AR|             C,Q,T,Y|1,2,3,4,5|  730|    730.0|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  600|    600.0|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  900|    900.0|
+------------+----------+----------+--------------------+---------+-----+---------+

The result which I want is:
+------------+----------+----------+--------------------+---------+-----+---------+
|EVNT_SRVC_AR|EVNT_FCLTY|EVNT_TP_CD|        NTWRK_PRD_CD| DY_OF_WK|CTOFF|CTOFF_NEW|
+------------+----------+----------+--------------------+---------+-----+---------+
|         HKG|       HKC|        AR|2,3,7,8,C,D,J,P,Q...|1,2,3,4,5| 1440|    1440|
|         HKG|       HKC|        AR|             C,Q,T,Y|1,2,3,4,5|  730|    0730|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  600|    0600|
|         HKG|       HKC|        AR|         E,K,C,Q,T,Y|1,2,3,4,5|  900|    0900|
+------------+----------+----------+--------------------+---------+-----+---------+

So I want the '0' to be prefixed but it's getting suffixed as '.0'. Any clue around why is this happening

Thanks,

Debu



--
Gao JiaXiang
Data Analyst, GCBI