Handling null values on hive


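The correct query uses IS NULL (a comparison such as columnA = NULL evaluates to NULL rather than true, so it matches no rows):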

select count(*) from table where columnA is null;


In Hive, count(*) counts all rows, whereas count(columnA) counts only the rows where columnA is non-NULL. To count NULLs in several columns at once, you could write the query as:

select count(*)-count(columnA), count(*)-count(columnB) from table;

This returns the number of NULL values in each column. See the Hive UDF documentation for the aggregate functions: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
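
The same per-column NULL counts can also be written with conditional aggregation, which some people find easier to read. This is only a sketch: my_table is a hypothetical table name (the word table itself is reserved in Hive and would need backticks), the output aliases are made up for illustration, and columnA and columnB are the column names used above.

```sql
-- my_table is a hypothetical table name; replace it with your own.
SELECT
  -- Adds 1 for every row where columnA is NULL, 0 otherwise.
  SUM(CASE WHEN columnA IS NULL THEN 1 ELSE 0 END) AS null_count_a,
  -- Same idea for columnB.
  SUM(CASE WHEN columnB IS NULL THEN 1 ELSE 0 END) AS null_count_b
FROM my_table;
```

One difference to keep in mind: on an empty table, SUM returns NULL, while count(*)-count(columnA) returns 0, so the subtraction form may be preferable if the result feeds into further arithmetic.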