Is there a maximum size for the string data type in Hive?
The current documentation for Hive lists STRING as a valid data type, distinct from VARCHAR and CHAR.
See the official Apache documentation here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Strings
It wasn't immediately apparent to me that STRING was indeed its own type, but if you scroll down you'll see several cases where it's used distinctly from the others.
While perhaps not authoritative, this page indicates that the maximum length of a STRING is 2 GB: http://www.folkstalk.com/2011/11/data-types-in-hive.html
By default, the column metadata for Hive does not specify a maximum data length for STRING columns.
The driver has the parameter DefaultStringColumnLength, which defaults to 255 and has a maximum value of 32767.
A connection string with this parameter set to the maximum size would look like this: jdbc:hive2://localhost:10000;DefaultStringColumnLength=32767;
(https://github.com/exasol/virtual-schemas/issues/118)
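As a rough illustration of the connection string above, here is a minimal Java sketch that assembles the URL and rejects values above 32767. The class and method names (HiveJdbcUrl, withStringColumnLength) are hypothetical and not part of any Hive driver. The lower bound of 1 is my assumption, and the upper bound is taken from the example above rather than checked against the driver documentation.

```java
// Hypothetical helper, not part of any Hive JDBC driver: it only builds the URL text.
final class HiveJdbcUrl {
    // Upper bound taken from the connection-string example above (assumed driver limit).
    static final int MAX_STRING_COLUMN_LENGTH = 32767;

    // Returns a jdbc:hive2 URL with DefaultStringColumnLength appended.
    // Rejecting values below 1 is an assumption, not documented driver behaviour.
    static String withStringColumnLength(String host, int port, int length) {
        if (length < 1 || length > MAX_STRING_COLUMN_LENGTH) {
            throw new IllegalArgumentException(
                "DefaultStringColumnLength must be between 1 and " + MAX_STRING_COLUMN_LENGTH);
        }
        return "jdbc:hive2://" + host + ":" + port
            + ";DefaultStringColumnLength=" + length + ";";
    }

    public static void main(String[] args) {
        // Prints: jdbc:hive2://localhost:10000;DefaultStringColumnLength=32767;
        System.out.println(withStringColumnLength("localhost", 10000, 32767));
    }
}
```

The resulting string could then be passed to java.sql.DriverManager.getConnection, assuming a Hive JDBC driver that understands this parameter is on the classpath.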
"In the “looser” world in which Hive lives, where it may not own the data files and has to be flexible on file format, Hive relies on the presence of delimiters to separate fields. Also, Hadoop and Hive emphasize optimizing disk reading and writing performance, where fixing the lengths of column values is relatively unimportant." from