
Java SAX Parsing


There is one neat trick when writing a SAX parser: you are allowed to change the ContentHandler of an XMLReader while parsing. This lets you separate the parsing logic for different elements into multiple classes, which makes the parsing more modular and reusable. When a handler sees its end element, it switches back to its parent. How many handlers you implement is up to you. The code would look like this:

    import java.util.LinkedList;
    import java.util.List;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;

    // RootHandler.java (Team is assumed to be a plain bean with setName(), not shown)
    public class RootHandler extends DefaultHandler {
        private XMLReader reader;
        private List<Team> teams;

        public RootHandler(XMLReader reader) {
            this.reader = reader;
            this.teams = new LinkedList<Team>();
        }

        public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
            if (name.equals("team")) {
                // Switch handler to parse the team element
                reader.setContentHandler(new TeamHandler(reader, this));
            }
        }

        // Called by TeamHandler once a team element has been parsed completely
        public void addTeam(Team team) {
            teams.add(team);
        }
    }

    // TeamHandler.java (needs the same org.xml.sax imports as above)
    public class TeamHandler extends DefaultHandler {
        private XMLReader reader;
        private RootHandler parent;
        private Team team;
        private StringBuilder content;

        public TeamHandler(XMLReader reader, RootHandler parent) {
            this.reader = reader;
            this.parent = parent;
            this.content = new StringBuilder();
            this.team = new Team();
        }

        // characters can be called multiple times per element, so aggregate the content in a StringBuilder
        public void characters(char[] ch, int start, int length) throws SAXException {
            content.append(ch, start, length);
        }

        // A new child element starts: discard any text collected so far
        public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
            content.setLength(0);
        }

        public void endElement(String uri, String localName, String name) throws SAXException {
            if (name.equals("name")) {
                team.setName(content.toString());
            } else if (name.equals("team")) {
                parent.addTeam(team);
                // Switch handler back to our parent
                reader.setContentHandler(parent);
            }
        }
    }
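To try the handler-switching idea end to end, here is a minimal self-contained sketch. The class names (`HandlerSwitchDemo`, `TeamsHandler`, `OneTeamHandler`), the `parseTeamNames` helper, and the `<teams><team><name>` document shape are all made up for illustration, and it collects plain strings instead of `Team` objects to stay short. It assumes the default, non-namespace-aware JAXP parser, which is why it matches on the qualified name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class HandlerSwitchDemo {

    // Root-level handler: hands every <team> element over to a child handler.
    static class TeamsHandler extends DefaultHandler {
        final XMLReader reader;
        final List<String> names = new ArrayList<>();

        TeamsHandler(XMLReader reader) {
            this.reader = reader;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (qName.equals("team")) {
                // From now on the child handler receives the events
                reader.setContentHandler(new OneTeamHandler(reader, this));
            }
        }
    }

    // Child handler: parses the inside of one <team>, then gives control back.
    static class OneTeamHandler extends DefaultHandler {
        final XMLReader reader;
        final TeamsHandler parent;
        final StringBuilder text = new StringBuilder();
        String name;

        OneTeamHandler(XMLReader reader, TeamsHandler parent) {
            this.reader = reader;
            this.parent = parent;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("name")) {
                name = text.toString().trim();
            } else if (qName.equals("team")) {
                parent.names.add(name);
                // Our element is finished: switch back to the parent handler
                reader.setContentHandler(parent);
            }
        }
    }

    // Hypothetical helper: parses the XML string and returns the team names in document order.
    static List<String> parseTeamNames(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            TeamsHandler root = new TeamsHandler(reader);
            reader.setContentHandler(root);
            reader.parse(new InputSource(new StringReader(xml)));
            return root.names;
        } catch (Exception e) {
            // Wrapped only to keep the sketch short; real code should handle SAXException and IOException
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<teams><team><name>Red</name></team><team><name>Blue</name></team></teams>";
        System.out.println(parseTeamNames(xml));
    }
}
```

Note that the child handler never sees the start tag of its own element, because the switch happens inside the parent's `startElement`. If the element carries attributes you need, pass them to the child's constructor.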


It's difficult to advise without knowing more about your requirements, but the fact that you are surprised that "my code got quite complex" suggests that you were not well informed when you chose SAX. SAX is a low-level programming interface capable of very high performance, but that's because the parser is doing far less work for you, and you therefore need to do a lot more work yourself.


I do something very similar, but instead of having boolean flags to tell me what state I'm in, I test for player or team being non-null. Makes things a bit neater. This requires you to set them to null when you detect the end of each element, after you've added it to the relevant list.
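A rough sketch of that null-check approach, as I understand it: a single handler keeps the team and player currently being built in fields and treats "field is non-null" as "we are inside that element". The `Team` and `Player` classes, the `NullStateDemo` wrapper, the `parse` helper, and the XML layout (a `<name>` child inside both `<team>` and `<player>`) are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NullStateDemo {

    // Hypothetical model classes, kept as bare as possible
    static class Player {
        String name;
    }

    static class Team {
        String name;
        final List<Player> players = new ArrayList<>();
    }

    static class TeamsHandler extends DefaultHandler {
        final List<Team> teams = new ArrayList<>();
        private Team team;      // non-null only while inside <team>
        private Player player;  // non-null only while inside <player>
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
            if (qName.equals("team")) {
                team = new Team();
            } else if (qName.equals("player")) {
                player = new Player();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("name")) {
                // The innermost open element wins: check player before team
                if (player != null) {
                    player.name = text.toString().trim();
                } else if (team != null) {
                    team.name = text.toString().trim();
                }
            } else if (qName.equals("player")) {
                team.players.add(player);
                player = null;  // we have left <player>
            } else if (qName.equals("team")) {
                teams.add(team);
                team = null;    // we have left <team>
            }
        }
    }

    // Hypothetical helper: parses the XML string into a list of teams.
    static List<Team> parse(String xml) {
        try {
            TeamsHandler handler = new TeamsHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.teams;
        } catch (Exception e) {
            // Wrapped only to keep the sketch short
            throw new IllegalStateException("parsing failed", e);
        }
    }
}
```

The order of the null checks in the `name` branch matters: a player is always nested inside a team, so both fields are non-null there and the player has to be tested first.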