Hive: how to convert yyyy-mm-ddThh:mm:SS:sssZ into hour units
Use unix_timestamp
and from_unixtime
functions to convert and format the timestamp required.
select from_unixtime(unix_timestamp(string("2020-03-09T07:34:06:825Z"),"yyyy-MM-dd'T'hh:mm:ss:SSS'Z'"),"yyyy-MM-dd'T'hh:00:00") as new_ts;+-------------------+|new_ts |+-------------------+|2020-03-09T07:00:00|+-------------------+
Explanation:
unix_timestamp(string("2020-03-09T07:34:06:825Z"), --sample data"yyyy-MM-dd'T'hh:mm:ss:SSS'Z'") --match the data format
from_unixtime('unix_timestamp...etc',"yyyy-MM-dd'T'hh:00:00") --to format as required
Using regexp_replace:
with your_data as (select stack('2020-03-09T07:34:06:825Z','2020-03-09T07:54:12:220Z','2020-03-09T03:54:11:041Z','2020-03-09T09:22:10:220Z','2020-03-09T11:13:36:217Z','2020-03-09T11:23:26:040Z','2020-03-09T11:43:35:721Z') as str)select regexp_replace(str,'(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*','$1T$2:00:00') from your_data;
Result:
2020-03-09T07:00:002020-03-09T07:00:002020-03-09T03:00:002020-03-09T09:00:002020-03-09T11:00:002020-03-09T11:00:002020-03-09T11:00:00
Explanation:
Regular expression defines two groups:
$1 is date part (\\d{4}-\\d{2}-\\d{2})
$2 is hour part after T '(\d{2})' everything else at the end .*
is ignored.
You extract '$1T$2:00:00'