
Length of string in bash


To get the length of a string stored in a variable, say:

    myvar="some string"
    size=${#myvar}

To confirm it was properly saved, echo it:

    $ echo "$size"
    11


UTF-8 string length

In addition to fedorqui's correct answer, I would like to show the difference between string length and byte length:

    myvar='Généralités'
    chrlen=${#myvar}
    oLang=$LANG oLcAll=$LC_ALL
    LANG=C LC_ALL=C
    bytlen=${#myvar}
    LANG=$oLang LC_ALL=$oLcAll
    printf "%s is %d char len, but %d bytes len.\n" "${myvar}" $chrlen $bytlen

will render:

    Généralités is 11 char len, but 14 bytes len.

You can even have a look at the stored bytes:

    myvar='Généralités'
    chrlen=${#myvar}
    oLang=$LANG oLcAll=$LC_ALL
    LANG=C LC_ALL=C
    bytlen=${#myvar}
    printf -v myreal "%q" "$myvar"
    LANG=$oLang LC_ALL=$oLcAll
    printf "%s has %d chars, %d bytes: (%s).\n" "${myvar}" $chrlen $bytlen "$myreal"

will print:

    Généralités has 11 chars, 14 bytes: ($'G\303\251n\303\251ralit\303\251s').

Note: Following Isabell Cowan's comment, I now set $LC_ALL along with $LANG.
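The manual save and restore of $LANG and $LC_ALL shown above can also be confined to a function. The following is only a sketch, assuming a bash recent enough to apply locale changes made through `local`; the name `byte_len` is hypothetical and not part of the answer. The character count it prints depends on the locale of the calling shell, so no expected output is shown for it.

```shell
#!/usr/bin/env bash
# byte_len is a hypothetical helper name used for illustration only.
byte_len() {
    # 'local' limits the locale override to this function call, so the
    # caller's LANG and LC_ALL do not need to be saved and restored by hand.
    local LANG=C LC_ALL=C
    # Under the C locale, ${#1} counts bytes rather than characters.
    printf '%d\n' "${#1}"
}

myvar='Généralités'
# ${#myvar} is evaluated in the caller's locale; byte_len always counts bytes.
printf '%s: %d chars, %d bytes\n' "$myvar" "${#myvar}" "$(byte_len "$myvar")"
```

Because the override lives in a local variable, a failure or early return inside the function cannot leave the calling shell stuck in the C locale.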

Length of an argument

Arguments work the same way as regular variables:

    strLen() {
        local bytlen sreal oLang=$LANG oLcAll=$LC_ALL
        LANG=C LC_ALL=C
        bytlen=${#1}
        printf -v sreal %q "$1"
        LANG=$oLang LC_ALL=$oLcAll
        printf "String '%s' is %d bytes, but %d chars len: %s.\n" "$1" $bytlen ${#1} "$sreal"
    }

which works like this:

    strLen théorème
    String 'théorème' is 10 bytes, but 8 chars len: $'th\303\251or\303\250me'

Useful printf correction tool:

If you run:

    for string in Généralités Language Théorème Février "Left: ←" "Yin Yang ☯"; do
        printf " - %-14s is %2d char length\n" "'$string'" ${#string}
    done
     - 'Généralités' is 11 char length
     - 'Language'     is  8 char length
     - 'Théorème'   is  8 char length
     - 'Février'     is  7 char length
     - 'Left: ←'    is  7 char length
     - 'Yin Yang ☯' is 10 char length

Not really pretty: printf pads the field by bytes, not by characters, so the columns drift. For this, there is a little function:

    strU8DiffLen () {
        local bytlen oLang=$LANG oLcAll=$LC_ALL
        LANG=C LC_ALL=C
        bytlen=${#1}
        LANG=$oLang LC_ALL=$oLcAll
        return $(( bytlen - ${#1} ))
    }

And now:

    for string in Généralités Language Théorème Février "Left: ←" "Yin Yang ☯"; do
        strU8DiffLen "$string"
        printf " - %-$((14+$?))s is %2d chars length, but uses %2d bytes\n" \
            "'$string'" ${#string} $((${#string}+$?))
    done
     - 'Généralités'  is 11 chars length, but uses 14 bytes
     - 'Language'     is  8 chars length, but uses  8 bytes
     - 'Théorème'     is  8 chars length, but uses 10 bytes
     - 'Février'      is  7 chars length, but uses  8 bytes
     - 'Left: ←'      is  7 chars length, but uses  9 bytes
     - 'Yin Yang ☯'   is 10 chars length, but uses 12 bytes
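One caveat of passing the difference through the return code is that an exit status only holds values from 0 to 255. The variant below is a sketch, not the author's method: it prints the difference instead of returning it and wraps the padding in a second function. The names `u8_extra_bytes` and `pad_utf8` are hypothetical, and the code assumes the calling shell runs in a UTF-8 locale.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many more bytes than characters $1 uses.
u8_extra_bytes() {
    # Character count, taken while the caller's locale is still active.
    local chrlen=${#1}
    # From here on, ${#1} counts bytes; the override ends with the function.
    local LANG=C LC_ALL=C
    printf '%d\n' "$(( ${#1} - chrlen ))"
}

# Hypothetical helper: left-align $2 in a field of $1 characters.
pad_utf8() {
    local width=$1 str=$2 extra
    extra=$(u8_extra_bytes "$str")
    # printf pads by bytes, so widen the field by the extra bytes.
    printf "%-$(( width + extra ))s" "$str"
}

for s in Généralités Language "Yin Yang ☯"; do
    printf ' - %s is %2d chars\n' "$(pad_utf8 14 "'$s'")" "${#s}"
done
```

Command substitution strips only trailing newlines, so the padding spaces produced by `pad_utf8` survive the `$(...)` call.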

Unfortunately, this is not perfect!

Some strange UTF-8 behaviour remains, such as double-width characters, zero-width characters, reverse displacement and other cases that are not so simple to handle...

Have a look at diffU8test.sh or diffU8test.sh.txt for more limitations.


I wanted the simplest case; in the end, this is the result:

    echo -n 'Tell me the length of this sentence.' | wc -m
    36
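As a small sketch of the same pipeline approach, `printf '%s'` can stand in for `echo -n`, whose handling of options varies between shells, and `wc -c` gives the byte count next to the character count from `wc -m`. The variable name `s` is only illustrative. For this ASCII-only sentence both counts are the same; some `wc` implementations pad the number with leading spaces.

```shell
#!/usr/bin/env bash
s='Tell me the length of this sentence.'

# printf '%s' writes the string with no trailing newline.
printf '%s' "$s" | wc -m   # characters, according to the current locale
printf '%s' "$s" | wc -c   # bytes
```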