How do I use Nokogiri::XML::Reader to parse large XML files?
Each element in the stream comes through as two events: one to open the element and one to close it. The opening event will have
node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
and the closing event will have
node.node_type == Nokogiri::XML::Reader::TYPE_END_ELEMENT
The empty strings you're seeing come from the element closing events. Remember that with a streaming parser like Reader (much as with SAX), you're walking through the tree in document order, so you need the second event to tell you when you're going back up and closing an element.
You probably want something more like this:
reader.each do |node|
  if node.name == "PMID" && node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
    p << node.inner_xml
  end
end
Or perhaps:
reader.each do |node|
  next if node.name != 'PMID'
  next if node.node_type != Nokogiri::XML::Reader::TYPE_ELEMENT
  p << node.inner_xml
end
Or some other variation on that.