[jdom-interest] retrieving the children of an element as text

Thu Sep 14 11:39:52 PDT 2006

--- Paul Libbrecht <paul at activemath.org> wrote:

> J. McConnell wrote:
> > On 9/13/06, Jason Hunter <jhunter at servlets.com> wrote:
> >> J. McConnell wrote:
> >> > On 9/13/06, Jason Hunter <jhunter at servlets.com> wrote:
> >> >> I'd recommend xmlOutputter.outputString(mainelement.getContent()).
> >> I think it will, since getContent() includes the elements as well as the
> >> whitespace text nodes, and by default XMLOutputter doesn't alter source
> >> whitespace.
> > Huh, good to know.  Sorry for the confusion and thanks for the
> > explanation!
> 
> Can I ask whether DOM is better than SAX in this respect?
> Or whether there is anything else that does better than that?
> (e.g. something that would preserve whitespace inside tags)

It's unlikely that DOM would do better (in terms of
round-trippability, i.e. output that looks more like
the input), since most (all?) DOM implementations seem
to be built on top of SAX parsers.
At the same time, trying to reproduce exact whitespace
(between attributes, for example) seems like a lost
cause. It's more important to keep the Infoset content
identical.
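
To make the attribute-whitespace point concrete, here is a minimal
sketch that uses only the JDK's built-in DOM parser and identity
Transformer (not JDOM). The class name RoundTripSketch and the method
reserialize are made up for illustration. The whitespace text nodes
between elements are part of the tree and should come back as they
were, while the spacing and quote style inside the start tag are never
reported to the tree builder, so the serializer has to pick its own.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class RoundTripSketch {

    /** Parses the XML text into a DOM tree and writes the root element back out. */
    static String reserialize(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // A Transformer created without a stylesheet performs an identity copy.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc.getDocumentElement()),
                               new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not round-trip the document", e);
        }
    }

    public static void main(String[] args) {
        // Extra spaces between attributes, mixed quote styles, and
        // whitespace-only text nodes around the <item> element.
        String xml = "<root>  <item   a='1'    b=\"2\" >text</item>  </root>";
        System.out.println("in:  " + xml);
        System.out.println("out: " + reserialize(xml));
    }
}
```

Other parser and serializer implementations may differ in details, but
none of them can restore spacing that the parser never handed over.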

Having said that, if one really wants to make in-place,
minimally disruptive changes, one could look at
VTD-XML. While it's not as XML-compliant as SAX
parsers, it can definitely preserve the exact formatting
of the pieces that are not modified, since it refers to
the original textual serialization of the document, not
just to Infoset content as nodes.
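
The general idea of editing against the original text can be sketched
in a few lines. This is NOT VTD-XML's API, and it is not a parser: the
class SpliceSketch and its method are hypothetical, and the element is
located with a naive string search purely to get an offset. The point
is only that a change expressed as "replace the characters from offset
X to offset Y" leaves every other character of the document untouched.

```java
// Hypothetical illustration of offset-based editing; not VTD-XML code.
public class SpliceSketch {

    /**
     * Replaces the text between the first <name> and </name> tags,
     * copying everything before and after it verbatim.
     * Assumes the start tag has no attributes and that newText is
     * already escaped; returns the input unchanged if no match is found.
     */
    static String replaceFirstElementText(String xml, String name, String newText) {
        String open = "<" + name + ">";
        String close = "</" + name + ">";

        int tagStart = xml.indexOf(open);
        if (tagStart < 0) {
            return xml;
        }
        int contentStart = tagStart + open.length();
        int contentEnd = xml.indexOf(close, contentStart);
        if (contentEnd < 0) {
            return xml;
        }

        // Splice: original prefix + new content + original suffix.
        return xml.substring(0, contentStart) + newText + xml.substring(contentEnd);
    }

    public static void main(String[] args) {
        String xml = "<doc   id='7' ><name>old</name>  </doc>";
        System.out.println(replaceFirstElementText(xml, "name", "new"));
    }
}
```

A real tool would get the offsets from a proper tokenizer rather than
from indexOf, but the splice step is the part that keeps the odd
spacing and quoting elsewhere in the document intact.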

-+ Tatu +-
