What is the difference between TRUNC and TO_DATE in Hive
1
Hive Operators and User-Defined Functions (UDFs)
to_date
Returns the date part of a timestamp string (pre-Hive 2.1.0):
to_date("1970-01-01 00:00:00") = "1970-01-01".
As of Hive 2.1.0, returns a date object.
Prior to Hive 2.1.0 (HIVE-13248) the return type was a String because no Date type existed when the method was created.
trunc
Returns date truncated to the unit specified by the format (as of Hive1.2.0).
Supported formats: MONTH/MON/MM, YEAR/YYYY/YY.
Example: trunc('2015-03-17', 'MM') = 2015-03-01.
2
You have errors in your original query
- There was no comma between
TASK_NO
and(CASE WHEN
Trunc
in Hive must take 1 parameter, and there is no parameter for day.- There is no minus operator for dates (and definitly not for strings). This results in a NULL.
3
The only recognize date format in Hive is YYYY-MM-DD which does not match your data.
Applying date functions on invalid string result in NULL.
This is how you convert your data format to dates:
hive> select from_unixtime(unix_timestamp('07-OCT-16 12:30:54','dd-MMM-yy HH:mm:ss'));OK2016-10-07 12:30:54
and the whole query:
select ORDER_NO ,NAME ,DATE_fixed ,TASK_NO ,case when DATE_uts - LAG(DATE_uts) OVER ( PARTITION BY ORDER_NO,NAME,to_date(DATE_fixed) ORDER BY DATE_fixed ) <= 60*30 then 0 else 1 end AS COUNT1 from (select ORDER_NO ,NAME ,TASK_NO ,from_unixtime(unix_timestamp(DATE_,'dd-MMM-yy HH:mm:ss')) as DATE_fixed ,unix_timestamp(DATE_,'dd-MMM-yy HH:mm:ss') as DATE_uts from t ) t;
ABC123 Humpty 2016-10-07 12:30:54 2 1ABC123 Humpty 2016-10-07 12:30:54 1 0ABC123 Humpty 2016-10-07 12:32:20 6 0
This were also the results when I tested it on Oracle