Why is grep taking so much time?


The following line might make it faster:

$ awk '/^mj/{c++}END{print c+0}' file

This processes the file only once and prints only the total number of matching lines. This is in contrast to your initial case, where grep writes every matching line into a pipe and wc then has to read and count that output again.

In the end, you could also just do:

$ grep -c '^mj' file

which just returns the number of matching lines. This is probably even faster than the awk version. Awk will, by default, attempt field splitting on each line; grep does not need that step.
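As a rough sketch, the three variants can be checked against each other on a small sample. The sample data and the variable names below are made up for illustration, and the pipeline form is only my assumption about what the original command looked like:

```shell
# Build a small sample file (hypothetical data; the real input is not shown in the question).
sample=$(mktemp)
printf '%s\n' 'mj first' 'other line' 'mj second' 'not mj here' > "$sample"

# Pipeline form (assumed original): grep writes every match, wc counts them afterwards.
# tr strips the padding that some wc implementations add.
pipe_count=$(grep '^mj' "$sample" | wc -l | tr -d ' ')

# Single-pass forms: only the final count is printed.
# c+0 makes awk print 0 instead of an empty line when nothing matches.
awk_count=$(awk '/^mj/{c++} END{print c+0}' "$sample")
grep_count=$(grep -c '^mj' "$sample")

echo "pipe=$pipe_count awk=$awk_count grep=$grep_count"   # prints: pipe=2 awk=2 grep=2
rm -f "$sample"
```

All three should agree on the count; on a large file the difference is only in how much work is done to get there.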

There are many reasons why your process could be slow: heavy load on the disk, a slow NFS mount if you use one, extremely long lines to parse, and so on. Without more information on the input file and the system you are running this on, it is hard to say why it is so slow.
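A couple of these possibilities can be checked quickly. This is only a sketch: the temporary file stands in for your real input, and the variable names are mine:

```shell
# Stand-in for the real input file (hypothetical data).
file=$(mktemp)
printf '%s\n' 'mj short' 'mj 0123456789' > "$file"

# Which filesystem holds the file? A remote mount such as NFS shows up
# in the first column as host:/path rather than a local device.
df -P "$file"

# Length of the longest line, to spot pathologically long lines.
longest=$(awk '{ n = length($0); if (n > max) max = n } END { print max+0 }' "$file")
echo "longest line: $longest characters"

rm -f "$file"
```

It can also help to run the count under your shell's `time`: if the real (wall-clock) time is much larger than user plus sys, the process is mostly waiting on I/O rather than matching.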


Sounds like something is up with your machine. Do you have enough swap space, etc.? What does df -h show? As a test, try egrep or fgrep (equivalent to grep -E and grep -F) as alternatives to grep.


You could try this small C program that I just wrote. It memory-maps the file and counts the lines that start with "mj".

#define _FILE_OFFSET_BITS 64
#include <string.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

const char needle[] = "mj";

int main(int argc, char *argv[]) {
  int fd, res;
  off_t i, size, len = sizeof needle - 1;  // off_t so files over 2 GB do not overflow
  long count = 0;
  struct stat st;
  char *data;

  if (argc != 2) {
    fprintf(stderr, "Syntax: %s file\n", *argv);
    return 1;
  }
  fd = open(argv[1], O_RDONLY);
  if (fd < 0) {
    fprintf(stderr, "Couldn't open file \"%s\": %s\n", argv[1], strerror(errno));
    return 1;
  }
  res = fstat(fd, &st);
  if (res < 0) {
    fprintf(stderr, "Failed at fstat: %s\n", strerror(errno));
    return 1;
  }
  if (!S_ISREG(st.st_mode)) {
    fprintf(stderr, "File \"%s\" is not a regular file.\n", argv[1]);
    return 1;
  }
  size = st.st_size;
  if (size == 0) {
    // mmap rejects a zero-length mapping; an empty file has no matches
    printf("0\n");
    return 0;
  }
  data = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) {  // mmap reports failure with MAP_FAILED, not NULL
    fprintf(stderr, "mmap failed!: %s\n", strerror(errno));
    return 1;
  }
  // at the top of each iteration, i points at the start of a line
  for (i = 0; i < size; i++) {
    // look for string at the start of the line:
    if (i + len <= size && !memcmp(data + i, needle, len)) {
      count++;
      i += len;
    }
    // skip to the end of the line; check the bound before reading data[i]
    while (i < size && data[i] != '\n')
      i++;
  }
  printf("%ld\n", count);
  return 0;
}

Compile it with:

$ gcc grepmj.c -o grepmj -O2

and run it as:

$ ./grepmj file