[jdom-interest] ampersand entity parsing

Michael Kay mike at saxonica.com
Wed Oct 18 01:16:58 PDT 2006


> 
> I have a text node like:
> <name>Peter &amp; Paul</name>
> 
> The parser returns only:
> Peter
> 
> It seems to stop right at the ampersand and then moves on to 
> the next element.
> No parsing errors occur.
> I'm using Xerces as the SAX parser.

SAX can deliver a single text node through multiple calls of the characters()
method, split any way that the parser likes. It's the application's job to
assemble the pieces. It's quite common for parsers to make one call for each
sequence of characters that were adjacent in the input buffer, which means
that the text will typically break at an entity reference. In your case the
text is probably arriving in separate calls ("Peter ", then "&", then " Paul"),
and your handler is keeping only the first.
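
As a minimal sketch of assembling the pieces: append every characters() call
to a buffer and read the buffer only in endElement(). The NameCollector class
and its parse() helper are hypothetical names used for illustration, not part
of SAX, Xerces or JDOM, and the sketch assumes <name> holds only text, with no
child elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of every <name> element.
class NameCollector extends DefaultHandler {
    private final List<String> names = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Start a fresh buffer for each element (assumes text-only elements).
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per text node: append, never overwrite.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the text known to be complete.
        if ("name".equals(qName)) {
            names.add(text.toString());
        }
        text.setLength(0);
    }

    // Convenience helper: parses an XML string with the default JAXP SAX parser
    // (not namespace-aware, so qName carries the element name).
    static List<String> parse(String xml) throws Exception {
        NameCollector handler = new NameCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }
}
```

Calling NameCollector.parse("<root><name>Peter &amp; Paul</name></root>")
should then yield the whole string "Peter & Paul", however many characters()
calls the parser chose to make.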

Michael Kay
http://www.saxonica.com/


