Is there a maximum size for the string data type in Hive?

Is there a maximum size for the STRING data type in Hive?


The current Hive documentation lists STRING as a valid data type, distinct from VARCHAR and CHAR. See the official Apache documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Strings

It wasn't immediately apparent to me that STRING was indeed its own type, but if you scroll down that page you'll see several cases where it is used distinctly from the others.

While perhaps not authoritative, this page indicates that the maximum length of a STRING is 2 GB: http://www.folkstalk.com/2011/11/data-types-in-hive.html


By default, the column metadata for Hive does not specify a maximum data length for STRING columns.

The JDBC driver has the parameter DefaultStringColumnLength, which sets the length it reports for STRING columns. The default is 255 and the maximum value is 32767.

A connection string with this parameter set to maximum size would look like this: jdbc:hive2://localhost:10000;DefaultStringColumnLength=32767;

(https://github.com/exasol/virtual-schemas/issues/118)
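As a rough illustration of the connection string above, here is a minimal Java sketch that assembles a Hive JDBC URL with DefaultStringColumnLength set. The class name HiveJdbcUrl and the method build are hypothetical helpers, not part of any driver, and the range check (1 to 32767) is an assumption based on the maximum quoted above; check your driver's documentation for the limits it actually enforces.

```java
// Sketch only: builds a Hive JDBC URL string; it does not open a connection.
// HiveJdbcUrl and build(...) are hypothetical names used for illustration.
final class HiveJdbcUrl {
    // Largest DefaultStringColumnLength value used in the example above.
    static final int MAX_STRING_COLUMN_LENGTH = 32767;

    static String build(String host, int port, int stringColumnLength) {
        // Assumption: values outside 1..32767 are rejected up front.
        if (stringColumnLength < 1 || stringColumnLength > MAX_STRING_COLUMN_LENGTH) {
            throw new IllegalArgumentException(
                    "DefaultStringColumnLength must be between 1 and "
                            + MAX_STRING_COLUMN_LENGTH + ", got " + stringColumnLength);
        }
        // Driver parameters are appended after the host and port, separated by ';'.
        return "jdbc:hive2://" + host + ":" + port
                + ";DefaultStringColumnLength=" + stringColumnLength + ";";
    }

    public static void main(String[] args) {
        // Prints: jdbc:hive2://localhost:10000;DefaultStringColumnLength=32767;
        System.out.println(build("localhost", 10000, MAX_STRING_COLUMN_LENGTH));
    }
}
```

The resulting string can be passed to java.sql.DriverManager.getConnection once a Hive JDBC driver is on the classpath. Note that this setting only changes the length the driver reports in column metadata; it does not change how Hive stores STRING values.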

"In the “looser” world in which Hive lives, where it may not own the data files and has to be flexible on file format, Hive relies on the presence of delimiters to separate fields. Also, Hadoop and Hive emphasize optimizing disk reading and writing performance, where fixing the lengths of column values is relatively unimportant."

(from https://learning.oreilly.com/library/view/programming-hive/9781449326944/ch03.html#Collection-Data-Types)