[jdom-interest] Question about CR/LF Handling

Michael Kay mike at saxonica.com
Thu Dec 5 02:07:56 PST 2013



On 5 Dec 2013, at 09:41, Dr. Klemens Waldhör <klemens.waldhoer at heartsome.de> wrote:

> I am using JDOM in the open source project openTMS, mainly to parse XLIFF
> files. One problem I have run into is that I can find no way to keep
> carriage returns (CR) and line feeds (LF) exactly as they are in
> the original XML file once I write a copy of the possibly modified file
> with XMLOutputter.
> 
> What I would like is for
> 
> "Text CR text LF text CR LF text"
> 
> to be copied to the output exactly as it is, and not changed to "Text LF text
> LF text LF text" depending on some setting. I know what the XML spec says
> about this, but I still need the original characters.
> 
> Anything I can do about it?
> 

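Not really. If XML says that a distinction is irrelevant (for example the whitespace after a tag name in a start or end tag, or the choice of single or double quotes around an attribute value) then the XML parser is going to normalize things so the application doesn't know what was in the original. Line endings fall into this category: the XML specification requires the parser to translate every CR LF pair, and every CR not followed by LF, into a single LF before the application sees the text. That's by design; you're not supposed to write applications that are sensitive to such distinctions.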

Michael Kay
Saxonica
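
To illustrate the point above, here is a minimal sketch (not part of the original message) of what the parser hands to the application. It assumes the JDK's built-in DOM parser rather than JDOM, so that it runs without extra libraries; JDOM's SAXBuilder sits on top of a SAX parser that applies the same end-of-line rule. The class name EolDemo and the helpers rootText and visible are made up for the example. The second input shows that a CR written as the character reference &#13; is not touched by end-of-line normalization, which only helps if the source file already contains such references.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; the names here are illustrative only.
class EolDemo {

    /** Parses the XML string and returns the text content of the root element. */
    public static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Makes CR and LF characters visible when printed. */
    public static String visible(String s) {
        return s.replace("\r", "[CR]").replace("\n", "[LF]");
    }

    public static void main(String[] args) {
        // Literal CR, LF and CR LF in the input, as in the question.
        String literal = "<t>Text\rtext\ntext\r\ntext</t>";
        System.out.println(visible(rootText(literal)));
        // prints: Text[LF]text[LF]text[LF]text

        // CR written as a character reference is resolved after
        // end-of-line normalization, so it reaches the application as CR.
        String escaped = "<t>Text&#13;text\ntext&#13;\ntext</t>";
        System.out.println(visible(rootText(escaped)));
        // prints: Text[CR]text[LF]text[CR][LF]text
    }
}
```

By the time the application sees the first document, the three different line endings are already indistinguishable, so no output setting can restore them.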



