[jdom-interest] Re: SAXBuilder enhancement request /2
Dennis Sosnoski
dms at sosnoski.com
Sat Mar 30 02:40:26 PST 2002
From a quick look at the code, this appears to remove only character data
that consists entirely of whitespace separating elements. It doesn't strip
leading and trailing whitespace from the character data content of an
element. It might be good to change the class description to make this
clear. :-)
It could be modified to strip leading and trailing whitespace with some
work. Right now it just collects whitespace character data as long as it
hasn't seen any non-whitespace character, and once it sees one it passes
everything on directly. Instead, it would need to accumulate all the
character data from the first non-whitespace character onward (scanning
from the start of each sequence, not the end), then strip trailing
whitespace before it dumps the data to the next step (on any
non-character-data event).
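
A rough sketch of that accumulate-then-trim approach, in case it helps.
This is not the DataUnformatFilter code; TrimmingFilter and its demo()
helper are names made up for illustration, and it assumes field-oriented
XML without mixed content. It builds on the standard
org.xml.sax.helpers.XMLFilterImpl.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter for illustration; not part of JDOM or its samples.
class TrimmingFilter extends XMLFilterImpl {
    // Character data accumulated since the last non-character event.
    private final StringBuilder pending = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        int end = start + length;
        // Skip leading whitespace only while nothing has been kept yet.
        if (pending.length() == 0) {
            while (start < end && Character.isWhitespace(ch[start])) {
                start++;
            }
        }
        // From the first non-whitespace character on, keep everything.
        pending.append(ch, start, end - start);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Dropped: it is whitespace-only by definition.
    }

    // Strip trailing whitespace, then pass the buffered text downstream.
    private void flush() throws SAXException {
        int end = pending.length();
        while (end > 0 && Character.isWhitespace(pending.charAt(end - 1))) {
            end--;
        }
        if (end > 0) {
            char[] out = new char[end];
            pending.getChars(0, end, out, 0);
            super.characters(out, 0, end);
        }
        pending.setLength(0);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        flush();
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        flush();
        super.endElement(uri, localName, qName);
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        flush();
        super.processingInstruction(target, data);
    }

    // Illustration only: feeds the chunks as separate characters() calls
    // inside one element and returns the text that reached the next step.
    static List<String> demo(String... chunks) {
        List<String> seen = new ArrayList<>();
        TrimmingFilter filter = new TrimmingFilter();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                seen.add(new String(ch, start, length));
            }
        });
        try {
            filter.startElement("", "e", "e", new AttributesImpl());
            for (String chunk : chunks) {
                filter.characters(chunk.toCharArray(), 0, chunk.length());
            }
            filter.endElement("", "e", "e");
        } catch (SAXException ex) {
            throw new IllegalStateException(ex);
        }
        return seen;
    }
}
```

Note the buffer copy in characters() and the extra array in flush();
those are the copying costs a filter-based approach has to pay.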
Performance would be better just doing the whitespace stripping within
the SAXHandler, though (no copying and extra array creation steps).
- Dennis
Joseph Bowbeer wrote:
>Btw, a whitespace stripping filter is here:
>
>http://cvs.jdom.org/cgi-bin/viewcvs.cgi/jdom/samples/sax/DataUnformatFilter.java
>
>As the javadoc says:
>
>* This filter removes leading and trailing whitespace from field-oriented
>* XML without mixed content. Note that this class will likely not yield
>* appropriate results for document-oriented XML like XHTML pages
>* which mix character data and elements together.
>
>----- Original Message -----
>
>[...] It could also be done using a filter, as ERH suggests, though this
>might be a little more complicated - for stripping trailing whitespace you'd
>need to make sure you have the entire character data sequence available,
>rather than just a portion. [...]
>
>
>