[jdom-interest] Formatting output

Jason Hunter jhunter at servlets.com
Sat Apr 7 16:04:17 PDT 2007

Michael Kay wrote:
>> Is there an authoritative reference on this?
> http://www.w3.org/TR/xpath-datamodel/
> Section 6.1.1:
> In the [Infoset], a document information item must have at least one child,
> its children must consist exclusively of element information items,
> processing instruction information items and comment information items, and
> exactly one of the children must be an element information item. This data
> model is more permissive: a Document Node may be empty, it may have more
> than one Element Node as a child, and it also permits Text Nodes as
> children.

Thanks, that spells it out clearly: the Infoset does *not* allow text 
nodes as document children while the XQuery/XPath data model chose to be 
"more permissive" and does.


>> I ask because, unless I've long been mistaken, SAX suppresses 
>> such whitespace and always has.
> Well, SAX doesn't support what you might call "XML fragments", and XDM does.
> That's why you need to ask "which XML data model?".
> Incidentally, I'm not sure where you found the statement that SAX parsers
> won't report whitespace outside the document element. The ContentHandler
> interface is certainly used for passing the structure of well-balanced
> fragments as well as well-formed documents, and there's nothing in the
> interface definition as far as I can see which says that this is not a valid
> usage. This does cause some practical problems when you use JAXP to send
> XSLT output to a SAXResult, because the XSLT output might have multiple
> top-level elements, and some ContentHandlers can't cope with that.

Empirical evidence demonstrates SAX parsers do not report whitespace 
outside the root element.  JDOM would error out if any did.  Do you 
think any might start to report such whitespace?  I'd prefer it if a 
JDOM document built via SAX could capture that whitespace, so that it 
could round-trip more reliably.
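
For anyone who wants to repeat the empirical check, here is a minimal 
sketch.  It assumes the SAX parser bundled with the JDK (whatever 
SAXParserFactory.newInstance() returns); the class name WhitespaceProbe 
and the method countCharsOutsideRoot are made up for illustration.  It 
counts the characters passed to characters() or ignorableWhitespace() 
while no element is open.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JDOM or SAX.
class WhitespaceProbe {

    // Returns how many characters the parser reported while no element was open,
    // i.e. before the root start tag or after the root end tag.
    static int countCharsOutsideRoot(String xml) {
        final int[] depth = {0};    // number of currently open elements
        final int[] outside = {0};  // characters seen at depth 0

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                depth[0]++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth[0]--;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (depth[0] == 0) {
                    outside[0] += length;
                }
            }

            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                if (depth[0] == 0) {
                    outside[0] += length;
                }
            }
        };

        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new RuntimeException(e);
        }
        return outside[0];
    }

    public static void main(String[] args) {
        // Newlines sit between the comment and the root, and after the root.
        String xml = "<!-- prolog -->\n\n<root> text </root>\n\n";
        System.out.println("characters reported outside root: "
                + countCharsOutsideRoot(xml));
    }
}
```

A count of zero would match the behaviour described above.  A non-zero 
count from some other parser would be exactly the case JDOM currently 
rejects.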

