[jdom-interest] Re: Getting original Encoding and changing the default UTF-8

Elliotte Harold elharo at metalab.unc.edu
Fri Sep 10 03:10:33 PDT 2004

Young Matthew wrote:

>Exactly.  We have child documents that get included which should have a certain
>ISO encoding but don't.  Then the default takes over and the Swedish characters
>bomb the parser.

Those documents are likely malformed.

>Simplest thing is to demand our projects to deliver documents with the correct

>Find it odd that when transforming with XSLT (say Xalan) that the encoding of
>the style sheet overrides all of the input XML documents.  Seems like XML
>parsers should apply the same principle with "included" child documents to a
>parent XML.  If the main XML says the encoding should be XYZ then regardless of
>what is stated in the headers of subdocuments the document gets translated
>with XYZ encoding.

That's not what happens with XSLT at all. The XSLT processor does not
know or care about the encoding of the input documents. It receives all
its data from an XML parser that has already decoded each document
using that document's own encoding. The principle you claim should be
followed doesn't exist, and would be a disaster for multilingual
documents.
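
To illustrate, here is a minimal Java sketch, assuming the JAXP parser
bundled with the JDK. The class name EncodingDemo and the helpers
parseText and isWellFormed are made up for this example. It feeds the
parser the same Swedish text three ways: as ISO-8859-1 bytes with a
matching encoding declaration, as UTF-8 bytes, and as ISO-8859-1 bytes
with no declaration at all. The first two should decode to the same
string, because the parser reads each document's own declaration. The
third should be rejected, since without a declaration the parser has to
assume UTF-8 and those bytes are not valid UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of JDOM or any library.
public class EncodingDemo {

    // Parses raw bytes and returns the root element's text.
    // The parser chooses the decoder from the bytes themselves
    // (encoding declaration, or UTF-8 when none is present).
    static String parseText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not well-formed: " + e.getMessage(), e);
        }
    }

    // True if the parser accepts the bytes as a well-formed document.
    static boolean isWellFormed(byte[] xmlBytes) {
        try {
            parseText(xmlBytes);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00e5\u00e4\u00f6 are the Swedish letters a-ring, a-umlaut, o-umlaut.
        String body = "<greeting>\u00e5\u00e4\u00f6</greeting>";

        byte[] latin1Declared =
            ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Declared =
            ("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + body)
                .getBytes(StandardCharsets.UTF_8);
        byte[] latin1Undeclared = body.getBytes(StandardCharsets.ISO_8859_1);

        // Same characters come out of both correctly labelled documents.
        System.out.println(parseText(latin1Declared).equals(parseText(utf8Declared)));

        // The unlabelled ISO-8859-1 document is the malformed case.
        System.out.println(isWellFormed(latin1Undeclared));
    }
}
```

Anything downstream of the parser, an XSLT processor included, only
ever sees the decoded characters, so there is nothing for a parent
document's encoding to override.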

Elliotte Rusty Harold
elharo at metalab.unc.edu

