[jdom-interest] Character encoding from UTF-8 to ISO-8859-1

Alex Rosen arosen at silverstream.com
Wed Feb 6 13:07:36 PST 2002


> You cannot transcode exactly UTF-8 to Latin1 as UTF-8 has a richer
> set of characters

True - if your document contains characters that aren't in the character
encoding that you're using, such as Czech or Chinese characters if you're
using Latin1, then they need to be escaped as numeric character references,
like "&#1234;". I forget if XMLOutputter does this for you - I think it
doesn't, but it's planned for the future.
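
If you need to do the escaping yourself in the meantime, something along
these lines might work. This is only a rough sketch: CharRefEscaper and
escapeUnencodable are names I made up for illustration, not part of JDOM,
and it assumes a JDK that has java.nio.charset.StandardCharsets. It only
handles characters the target encoding can't represent; markup characters
like "&" and "<" still need the usual XML escaping.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class CharRefEscaper {

    // Hypothetical helper (not a JDOM API): replaces every character that
    // the given charset cannot encode with a decimal character reference.
    // Assumes the input is well-formed UTF-16 (no unpaired surrogates).
    public static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // Work in code points so characters outside the BMP become
            // one reference instead of two surrogate halves.
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint);
            String piece = text.substring(i, i + width);
            if (encoder.canEncode(piece)) {
                out.append(piece);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            i += width;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "e with acute" and "y with acute" are in Latin1;
        // "c with caron" and the Chinese character are not.
        String sample = "caf\u00e9 \u010desk\u00fd \u4e2d";
        System.out.println(
            escapeUnencodable(sample, StandardCharsets.ISO_8859_1));
    }
}
```

You would run text values through this before writing them out with a
Latin1 encoding, so the unencodable characters go out as references rather
than as "?".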

> and, especially, as UTF-8 is not a superset of Latin1.

No, UTF-8 can encode all of Unicode, and Unicode is a superset of the
Latin1 (ISO-8859-1) character set, so every Latin1 character can be
written in UTF-8.
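
A quick way to convince yourself is a check like the one below. It's just a
sketch with made-up names (Latin1InUtf8 and its methods), again assuming a
JDK with StandardCharsets. It walks the 256 Latin1 code points and confirms
that each one survives a trip through UTF-8. Note that the byte sequences
differ above 0x7F (Latin1 uses one byte, UTF-8 uses two), so the superset
relationship is about the characters, not the bytes.

```java
import java.nio.charset.StandardCharsets;

public class Latin1InUtf8 {

    // True if every Latin1 code point (U+0000..U+00FF) can be encoded
    // to UTF-8 and decoded back to the same character.
    public static boolean allLatin1RoundTripsThroughUtf8() {
        for (int cp = 0; cp <= 0xFF; cp++) {
            String original = String.valueOf((char) cp);
            byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
            String decoded = new String(utf8, StandardCharsets.UTF_8);
            if (!decoded.equals(original)) {
                return false;
            }
        }
        return true;
    }

    // Number of bytes UTF-8 needs for a single Latin1 code point.
    public static int utf8Length(int latin1CodePoint) {
        return String.valueOf((char) latin1CodePoint)
                     .getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(allLatin1RoundTripsThroughUtf8());
        System.out.println(utf8Length(0x41)); // 'A'
        System.out.println(utf8Length(0xE9)); // e with acute
    }
}
```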

Alex




