[jdom-interest] How to get the XML Decl with JDOM?

Alex Rosen arosen at silverstream.com
Mon Oct 29 14:36:07 PST 2001


Sorry, I guess it wasn't obvious. The FAQ entry says:

"Why does my file encoding on output not match the encoding on input?

The default character encoding used by XMLOutputter is UTF-8, a
variable-length encoding that can represent all Unicode characters. This can
be changed with a call to outputter.setEncoding(). It would be nice if
XMLOutputter could default to the original encoding for a file, but
unfortunately parsers don't indicate the original encoding. You have to set
it programmatically.

This issue most often affects people with documents in the common ISO-8859-1
(Latin-1) encoding who use characters like ñ but aren't familiar with having
to think about encodings. The tip to remember is that with these documents
you must set the output encoding to ISO-8859-1, otherwise characters in the
range 128-255 will be output using a double byte encoding in UTF-8 instead
of the normal single byte encoding of ISO-8859-1."
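
To make the byte counts in that last paragraph concrete, here is a minimal
sketch that uses only the JDK. The class and method names are made up for
the example and are not part of JDOM; it just encodes the character ñ
(code point 241) both ways and counts the bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a JDOM class.
class EncodingByteCount {

    // Number of bytes needed to store the text in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u00f1"; // LATIN SMALL LETTER N WITH TILDE, code point 241
        System.out.println("ISO-8859-1: "
                + encodedLength(text, StandardCharsets.ISO_8859_1) + " byte(s)");
        System.out.println("UTF-8:      "
                + encodedLength(text, StandardCharsets.UTF_8) + " byte(s)");
    }
}
```

This is why a Latin-1 document written back out with the default UTF-8
encoding changes on disk, unless the output encoding is set explicitly as
the FAQ describes.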

What it's saying is that XML parsers (via their SAX interface) do not
provide the XML decl information to JDOM, so JDOM can't know the encoding
when reading in a document. But as I said, it looks like the extensions to
SAX2 do provide this, so we should probably look into that. Jason, do you
want to add this to the TODO list?
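
As a rough sketch of what that could look like: the org.xml.sax.ext.Locator2
interface, which ships with current JDKs, has getEncoding() and
getXMLVersion() methods. I am assuming here that this is the kind of
extension meant above, and that the parser in use hands the handler a
locator that implements it; neither is guaranteed. The class and method
names below are hypothetical, and nothing here shows how JDOM itself would
use the value.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.ext.Locator2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only; not a JDOM class.
class DeclaredEncodingProbe extends DefaultHandler {
    private Locator locator;
    private String encoding; // stays null if the parser offers no Locator2

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // Ask at the first element, once the XML declaration has been read.
        if (encoding == null && locator instanceof Locator2) {
            encoding = ((Locator2) locator).getEncoding();
            // ((Locator2) locator).getXMLVersion() is available the same way.
        }
    }

    // Parses the bytes and returns the encoding the parser reports, or null.
    static String detectEncoding(byte[] xml) {
        try {
            DeclaredEncodingProbe probe = new DeclaredEncodingProbe();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new ByteArrayInputStream(xml)), probe);
            return probe.encoding;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<root>\u00f1</root>";
        System.out.println(detectEncoding(doc.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

A null result means the parser did not supply a Locator2, in which case
the encoding still has to be set by hand on output.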

Alex

> -----Original Message-----
> From: jdom-interest-admin at jdom.org
> [mailto:jdom-interest-admin at jdom.org]On Behalf Of Fred Clewis
> Sent: Monday, October 29, 2001 2:58 PM
> To: jdom-interest at jdom.org
> Subject: RE: [jdom-interest] How to get the XML Decl with JDOM?
>
>
> thanks Alex,
>
> I could not find a FAQ about how to get the XML decl
> information out of the
> input XML with JDOM, only the one about how to set the
> outputter and writer
> after you have it.  Did I miss it?
>



