Why is "grep --ignore-case" 50 times slower?
I think this bug report helps in understanding why it is slow:
This slowness is due to grep (in a UTF-8 locale) constantly accessing the files "/usr/lib/locale/locale-archive" and "/usr/lib/gconv/gconv-modules.cache".
It can be shown using the strace utility. Both files are from glibc.
The reason is that grep needs to do a Unicode-aware comparison for the current locale and, judging by Marat's answer, it is not very efficient at doing so.
This shows how much faster it is when Unicode is not taken into consideration:
$ time LC_CTYPE=C grep -i fun test.txt
all work and no pl
Jack is no fun
Jack is no Fun

real 0m0.192s
Of course, this alternative won't work with characters in other languages such as Ñ/ñ, Ø/ø, Ð/ð, Æ/æ and so on.
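Because the speedup comes only from the environment variable, the command is easy to wrap. The following is a minimal sketch, not a definitive recipe; the function name `grepi` is made up for illustration. Note that if LC_ALL is set in your environment it takes precedence over LC_CTYPE, so the override would not take effect in that case.

```shell
# Hypothetical wrapper (the name "grepi" is an assumption, not a standard tool).
# Runs a case-insensitive grep with the C character type, so only ASCII
# letters are case-folded; non-ASCII pairs such as Ñ/ñ will not match.
grepi() {
    LC_CTYPE=C grep -i "$@"
}

# Example usage:
#   grepi fun test.txt
```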
Another alternative is to modify the regex so that it matches with case insensitivity:
$ time grep '[Ff][Uu][Nn]' test.txt
all work and no pl
Jack is no fun
Jack is no Fun

real 0m0.193s
This is reasonably fast, but of course it's a pain to convert each character into a class, and it's not easy to turn it into an alias or an sh script, unlike the LC_CTYPE=C approach above.
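For plain literal words, the per-character conversion can at least be automated. The sketch below is only an illustration under that assumption: the helper name `ci_pattern` is hypothetical, it handles ASCII letters only, and it will mangle patterns that already contain bracket expressions or backslash escapes.

```shell
# Hypothetical helper: turn each ASCII letter of a literal word into a
# two-character bracket class, e.g. "fun" becomes "[Ff][Uu][Nn]".
# Every other character is copied through unchanged.
ci_pattern() {
    printf '%s\n' "$1" | LC_ALL=C awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c ~ /[A-Za-z]/)
                out = out "[" toupper(c) tolower(c) "]"
            else
                out = out c
        }
        print out
    }'
}

# Example usage:
#   grep "$(ci_pattern fun)" test.txt
```

This keeps grep in the current locale, so the rest of the pattern is still matched with the usual locale rules; only the letters passed through the helper become case-insensitive.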
For comparison, on my system:
$ time grep fun test.txt
all work and no pl
Jack is no fun

real 0m0.085s

$ time grep -i fun test.txt
all work and no pl
Jack is no fun
Jack is no Fun

real 0m3.810s