
XML Parser for C [closed]


Two examples, with expat and libxml2. The second one is, IMHO, much easier to use since it creates a tree in memory, a data structure which is easy to work with. expat, on the other hand, does not build anything (you have to do it yourself); it just lets you register handlers that are called at specific events during parsing. But expat may be faster (I didn't measure).

With expat, reading an XML file and displaying the elements indented:

    /*
     * A simple test program to parse XML documents with expat
     * <http://expat.sourceforge.net/>. It just displays the element
     * names.
     *
     * On Debian, compile with:
     *   gcc -Wall -o expat-test expat-test.c -lexpat
     *
     * Inspired from <http://www.xml.com/pub/a/1999/09/expat/index.html>
     */
    #include <expat.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Keep track of the current level in the XML tree */
    static int Depth;

    #define MAXCHARS 1000000

    static void
    start(void *data, const char *el, const char **attr)
    {
        int i;

        /* Indent by two spaces per nesting level */
        for (i = 0; i < Depth; i++)
            printf("  ");
        printf("%s", el);
        /* attr is a NULL-terminated array of name/value pairs */
        for (i = 0; attr[i]; i += 2) {
            printf(" %s='%s'", attr[i], attr[i + 1]);
        }
        printf("\n");
        Depth++;
    }               /* End of start handler */

    static void
    end(void *data, const char *el)
    {
        Depth--;
    }               /* End of end handler */

    int
    main(int argc, char **argv)
    {
        char       *filename;
        FILE       *f;
        size_t      size;
        char       *xmltext;
        XML_Parser  parser;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s filename\n", argv[0]);
            return 1;
        }
        filename = argv[1];
        parser = XML_ParserCreate(NULL);
        if (parser == NULL) {
            fprintf(stderr, "Parser not created\n");
            return 1;
        }
        /* Tell expat to call start() and end() each time it encounters
         * the start or end of an element. */
        XML_SetElementHandler(parser, start, end);
        f = fopen(filename, "r");
        if (f == NULL) {
            perror(filename);
            return 1;
        }
        xmltext = malloc(MAXCHARS);
        if (xmltext == NULL) {
            fprintf(stderr, "Out of memory\n");
            return 1;
        }
        /* Slurp the XML file into the buffer xmltext. fread() does not
         * NUL-terminate the buffer, so pass the byte count it returns
         * to XML_Parse() instead of calling strlen(). */
        size = fread(xmltext, sizeof(char), MAXCHARS, f);
        fclose(f);
        if (XML_Parse(parser, xmltext, (int) size, XML_TRUE) ==
            XML_STATUS_ERROR) {
            fprintf(stderr,
                "Cannot parse %s, file may be too large or not well-formed XML\n",
                filename);
            return 1;
        }
        XML_ParserFree(parser);
        free(xmltext);
        printf("Successfully parsed %zu characters in file %s\n", size,
            filename);
        return 0;
    }
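To give an idea of what "you have to do it yourself" means with an event-based parser, here is a minimal sketch of a pair of handlers that build a tree instead of printing. It is only an illustration: `Node`, `Builder`, `on_start`, `on_end`, `count_nodes` and `free_tree` are names made up for this example, not part of expat. The handlers have the same argument shape as the `start()` and `end()` functions above, attributes are ignored, and a well-formed document with a single root element is assumed.

```c
/* Sketch: build an in-memory tree from start/end element events.
 * All names here are illustrative; none of them belong to expat. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    char        *name;
    struct Node *parent;
    struct Node *first_child;
    struct Node *last_child;
    struct Node *next;          /* next sibling */
} Node;

typedef struct {
    Node *root;                 /* first element seen */
    Node *current;              /* innermost element still open */
} Builder;

/* Called at each start tag: create a node and attach it to the
 * element that is currently open (if any). */
void on_start(void *data, const char *el, const char **attr)
{
    Builder *b = data;
    Node *n = calloc(1, sizeof *n);

    (void)attr;                 /* attributes are ignored in this sketch */
    if (n == NULL)
        abort();
    n->name = malloc(strlen(el) + 1);
    if (n->name == NULL)
        abort();
    strcpy(n->name, el);

    n->parent = b->current;
    if (b->current == NULL) {
        b->root = n;            /* assumes a single root element */
    } else if (b->current->last_child == NULL) {
        b->current->first_child = b->current->last_child = n;
    } else {
        b->current->last_child->next = n;
        b->current->last_child = n;
    }
    b->current = n;
}

/* Called at each end tag: the parent becomes the open element again. */
void on_end(void *data, const char *el)
{
    Builder *b = data;

    (void)el;
    assert(b->current != NULL); /* an end tag without a start tag is a bug */
    b->current = b->current->parent;
}

/* Count a node, its following siblings and all their descendants. */
int count_nodes(const Node *n)
{
    int total = 0;

    for (; n != NULL; n = n->next)
        total += 1 + count_nodes(n->first_child);
    return total;
}

/* Free a node, its following siblings and all their descendants. */
void free_tree(Node *n)
{
    while (n != NULL) {
        Node *next = n->next;

        free_tree(n->first_child);
        free(n->name);
        free(n);
        n = next;
    }
}
```

To plug this into the expat program, one would presumably declare a `Builder b = {NULL, NULL};`, pass its address with `XML_SetUserData(parser, &b)` and register the handlers with `XML_SetElementHandler(parser, on_start, on_end)`; after a successful `XML_Parse()`, `b.root` points to the tree. I have not tried it against a real parser, so treat it as a starting point.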

With libxml2, a program which displays the name of the root element and the names of its children:

    /*
     * Simple test with libxml2 <http://xmlsoft.org>. It displays the name
     * of the root element and the names of all its children (not
     * descendants, just children).
     *
     * On Debian, compile with:
     *   gcc -Wall -o read-xml2 $(xml2-config --cflags) read-xml2.c \
     *       $(xml2-config --libs)
     */
    #include <stdio.h>
    #include <libxml/parser.h>

    int
    main(int argc, char **argv)
    {
        xmlDoc         *document;
        xmlNode        *root, *first_child, *node;
        char           *filename;

        if (argc < 2) {
            fprintf(stderr, "Usage: %s filename.xml\n", argv[0]);
            return 1;
        }
        filename = argv[1];

        /* Parse the whole file into an in-memory tree */
        document = xmlReadFile(filename, NULL, 0);
        if (document == NULL) {
            fprintf(stderr, "Cannot parse %s\n", filename);
            return 1;
        }
        root = xmlDocGetRootElement(document);
        if (root == NULL) {
            fprintf(stderr, "%s has no root element\n", filename);
            xmlFreeDoc(document);
            return 1;
        }
        fprintf(stdout, "Root is <%s> (%i)\n", (const char *) root->name,
            root->type);

        /* Walk the direct children only; text nodes show up here too */
        first_child = root->children;
        for (node = first_child; node; node = node->next) {
            fprintf(stdout, "\t Child is <%s> (%i)\n",
                (const char *) node->name, node->type);
        }
        fprintf(stdout, "...\n");

        xmlFreeDoc(document);
        xmlCleanupParser();
        return 0;
    }


How about one written in pure assembler :-) Don't forget to check out the benchmarks.


Two of the most widely used parsers are Expat and libxml.

If you are okay with using C++, there's Xerces-C++ too.