how "hour" function in Spark SQL is supposed to work?


how "hour" function in Spark SQL is supposed to work?

Serega Sheypak
Hi, I'm desperately trying to extract the hour from unix time (my ts column holds epoch milliseconds).

The year, month, and dayofmonth functions work as expected; the hour function always returns 0.
val ds = dataset
  .withColumn("year",  year(to_date(from_unixtime(dataset.col("ts") / 1000))))
  .withColumn("month", month(to_date(from_unixtime(dataset.col("ts") / 1000))))
  .withColumn("day",   dayofmonth(to_date(from_unixtime(dataset.col("ts") / 1000))))
  .withColumn("hour",  hour(from_utc_timestamp(dataset.col("ts") / 1000, "UTC")))

// other attempts, commented out because none of them worked either:
//.withColumn("hour", hour(dataset.col("ts") / 1000))
//.withColumn("hour1", hour(dataset.col("ts")))
//.withColumn("hour", hour(dataset.col("ts")))
//.withColumn("hour", hour("2009-07-30 12:58:59"))
I took a look at the source code.

year, month, and dayofmonth expect a DateType input:

override def inputTypes: Seq[AbstractDataType] = Seq(DateType)

hour expects something different, a TimestampType:

override def inputTypes: Seq[AbstractDataType] = Seq(TimestampType)

from_utc_timestamp does return a timestamp:

override def dataType: DataType = TimestampType

but it didn't help.
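
For reference, here is a quick sanity check of the intermediate types (a minimal sketch; the value 1521545939000L is just a made-up epoch-millisecond timestamp, and it assumes a SparkSession named spark):

import org.apache.spark.sql.functions._
import spark.implicits._

// hypothetical one-row sample with an epoch-millisecond column
val sample = Seq(1521545939000L).toDF("ts")

sample.select(
  (col("ts") / 1000).as("secs"),               // division always yields a double
  from_unixtime(col("ts") / 1000).as("ts_str") // "yyyy-MM-dd HH:mm:ss" string
).printSchema()
// root
//  |-- secs: double (nullable = true)
//  |-- ts_str: string (nullable = true)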

What am I doing wrong? How can I get the hour out of unix time?
Thanks!

Re: how "hour" function in Spark SQL is supposed to work?

vermanurag
Not sure why you are dividing by 1000. from_unixtime expects a long type
which is the time in milliseconds from the reference date.

The following should work:

val ds = dataset.withColumn("hour", hour(from_unixtime(dataset.col("ts"))))








Re: how "hour" function in Spark SQL is supposed to work?

Serega Sheypak
> Not sure why you are dividing by 1000. from_unixtime expects a long type
It expects seconds; I have milliseconds.
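
A quick way to see the difference (a sketch; the epoch value is made up, and the rendered time assumes a UTC session time zone):

spark.sql("SELECT from_unixtime(1521545939)").show(false)
// -> 2018-03-20 11:38:59 : seconds interpreted correctly
spark.sql("SELECT from_unixtime(1521545939000)").show(false)
// -> a date tens of thousands of years in the future:
//    the millisecond value was misread as seconds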




Re: how "hour" function in Spark SQL is supposed to work?

Serega Sheypak
Hi, any updates? It looks like an API inconsistency, or maybe a bug?


Re: how "hour" function in Spark SQL is supposed to work?

Serega Sheypak
OK, this one works:
.withColumn("hour", hour(from_unixtime(typedDataset.col("ts") / 1000)))

