Split string with delimiters in C


You can use the strtok() function to split a string (and specify the delimiter to use). Note that strtok() modifies the string passed into it: each delimiter it finds is overwritten with '\0'. If the original string is required elsewhere, make a copy of it and pass the copy to strtok().
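A minimal sketch of the copy-first approach mentioned above. The helper name count_tokens is hypothetical; it duplicates the input with malloc and memcpy so that strtok() only ever writes into the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `input` without modifying it.
   Returns (size_t)-1 if the temporary copy cannot be allocated. */
size_t count_tokens(const char *input, const char *delims)
{
    assert(input != NULL && delims != NULL);

    size_t len = strlen(input) + 1;      /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return (size_t)-1;
    memcpy(copy, input, len);

    size_t count = 0;
    /* strtok writes '\0' over delimiters in `copy`; `input` is untouched. */
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

Because only the copy is tokenized, count_tokens("JAN,FEB,MAR", ",") can be called on a string literal, which must never be passed to strtok() directly.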

EDIT:

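Example (note that it does not handle consecutive delimiters such as "JAN,,,FEB,MAR": strtok() skips the empty fields, so fewer tokens are produced than were counted and the final assert fails):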

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <assert.h>

    char** str_split(char* a_str, const char a_delim)
    {
        char** result    = 0;
        size_t count     = 0;
        char* tmp        = a_str;
        char* last_comma = 0;
        char delim[2];
        delim[0] = a_delim;
        delim[1] = 0;

        /* Count how many elements will be extracted. */
        while (*tmp)
        {
            if (a_delim == *tmp)
            {
                count++;
                last_comma = tmp;
            }
            tmp++;
        }

        /* Add space for trailing token. */
        count += last_comma < (a_str + strlen(a_str) - 1);

        /* Add space for terminating null string so caller
           knows where the list of returned strings ends. */
        count++;

        result = malloc(sizeof(char*) * count);

        if (result)
        {
            size_t idx  = 0;
            char* token = strtok(a_str, delim);

            while (token)
            {
                assert(idx < count);
                *(result + idx++) = strdup(token);
                token = strtok(0, delim);
            }
            assert(idx == count - 1);
            *(result + idx) = 0;
        }

        return result;
    }

    int main()
    {
        char months[] = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC";
        char** tokens;

        printf("months=[%s]\n\n", months);

        tokens = str_split(months, ',');

        if (tokens)
        {
            int i;
            for (i = 0; *(tokens + i); i++)
            {
                printf("month=[%s]\n", *(tokens + i));
                free(*(tokens + i));
            }
            printf("\n");
            free(tokens);
        }

        return 0;
    }

Output:

    $ ./main.exe
    months=[JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC]

    month=[JAN]
    month=[FEB]
    month=[MAR]
    month=[APR]
    month=[MAY]
    month=[JUN]
    month=[JUL]
    month=[AUG]
    month=[SEP]
    month=[OCT]
    month=[NOV]
    month=[DEC]
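The consecutive-delimiter limitation noted above comes from strtok() treating a run of delimiters as a single separator. As a rough sketch of one way around it, the hypothetical helper split_fields below walks the string with strchr and keeps empty fields. It assumes a single-character delimiter and a caller-supplied array of pointers, and it is not part of the answer's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `s` in place on `delim`, keeping empty fields.
   Stores at most `max_fields` pointers into `s` in `fields` and returns the
   number stored; fields beyond `max_fields` are dropped. */
size_t split_fields(char *s, char delim, char **fields, size_t max_fields)
{
    assert(s != NULL && fields != NULL && delim != '\0');

    size_t n = 0;
    while (n < max_fields)
    {
        fields[n++] = s;                 /* start of the current field */
        char *next = strchr(s, delim);
        if (next == NULL)
            break;                       /* that was the last field */
        *next = '\0';                    /* terminate the current field */
        s = next + 1;                    /* may point at an empty field */
    }
    return n;
}
```

Called on a writable buffer holding "JAN,,,FEB,MAR" with ',' as the delimiter, this yields five fields: "JAN", "", "", "FEB" and "MAR". Like strtok(), it modifies the buffer, so the same copy-first advice applies.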


I think strsep is still the best tool for this:

while ((token = strsep(&str, ","))) my_fn(token);

That is literally one line that splits a string.

The extra parentheses are a stylistic element to indicate that we're intentionally testing the result of an assignment, not an equality operator ==.

For that pattern to work, token and str must both have type char *. If you started with a string literal, then you'd want to make a copy of it first:

    // More general pattern:
    const char *my_str_literal = "JAN,FEB,MAR";
    char *token, *str, *tofree;

    tofree = str = strdup(my_str_literal);  // We own str's memory now.
    while ((token = strsep(&str, ","))) my_fn(token);
    free(tofree);

If two delimiters appear together in str, you'll get a token value that's the empty string. The value of str is modified in that each delimiter encountered is overwritten with a zero byte - another good reason to copy the string being parsed first.

In a comment, someone suggested that strtok is better than strsep because strtok is more portable. Ubuntu and Mac OS X have strsep; it's safe to guess that other unixy systems do as well. Windows lacks strsep, but it has strpbrk, which enables this short and sweet strsep replacement:

    char *strsep(char **stringp, const char *delim) {
      if (*stringp == NULL) { return NULL; }
      char *token_start = *stringp;
      *stringp = strpbrk(token_start, delim);
      if (*stringp) {
        **stringp = '\0';
        (*stringp)++;
      }
      return token_start;
    }
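To try this replacement on a system that already provides strsep, it can be given a different name so it does not clash with the library's declaration. The sketch below does that under the hypothetical name my_strsep and adds a hypothetical helper, count_nonempty, that skips the empty tokens produced by adjacent delimiters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the replacement above, renamed (hypothetical name) so it
   does not collide with a strsep already declared in <string.h>. */
char *my_strsep(char **stringp, const char *delim)
{
    if (*stringp == NULL)
        return NULL;
    char *token_start = *stringp;
    *stringp = strpbrk(token_start, delim);
    if (*stringp)
    {
        **stringp = '\0';   /* overwrite the delimiter */
        (*stringp)++;       /* resume just past it */
    }
    return token_start;
}

/* Hypothetical helper: counts tokens in `str`, ignoring empty ones.
   `str` is modified in place, so pass a writable buffer. */
size_t count_nonempty(char *str, const char *delim)
{
    assert(str != NULL && delim != NULL);

    size_t count = 0;
    char *token;
    while ((token = my_strsep(&str, delim)))
    {
        if (*token != '\0')   /* adjacent delimiters yield "" */
            count++;
    }
    return count;
}
```

Whether to skip or keep the empty tokens depends on the data: for CSV-like input they usually carry meaning, so the `*token != '\0'` check would be dropped.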

Here is a good explanation of strsep vs strtok. The pros and cons may be judged subjectively; however, I think it's a telling sign that strsep was designed as a replacement for strtok.


String tokenizer: this code should point you in the right direction.

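    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char st[] = "Where there is will, there is a way.";
        char *ch;

        /* Split on spaces and commas; the same delimiter set is used for every call. */
        ch = strtok(st, " ,");
        while (ch != NULL) {
            printf("%s\n", ch);
            ch = strtok(NULL, " ,");
        }
        return 0;
    }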