[jdom-interest] Attribute.getSerializedForm bug [eg]

Elliotte Rusty Harold elharo at metalab.unc.edu
Sat Apr 14 10:38:35 PDT 2001


At 11:24 AM -0700 4/13/01, Jason Hunter wrote:

>The general "right" solution is probably to check each character as it
>goes out and if it's not in the chosen encoding's character set then
>output a char entity.  The problem is that for many encodings such a
>check isn't fast at all (less than this, greater than that, less than
>this, greater than that), nor is the information about which chars are
>in which character set easily available (to my knowledge).
>

JDK 1.4 should make this information available and a lot easier to use.
It is available now, but you either need to use some undocumented
classes in the sun packages or use some very inconvenient and probably
slow APIs.
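
As a rough illustration of the per-character check discussed above, here
is a minimal sketch that assumes the java.nio.charset API that shipped
with JDK 1.4 (CharsetEncoder.canEncode). The class and method names
(CharRefEscaper, escape) are hypothetical and not part of JDOM, and the
loop uses code point helpers from later JDKs so that surrogate pairs are
tested as a unit. Escaping of markup characters such as & and < is
deliberately left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration only; not JDOM code.
final class CharRefEscaper {

    // Returns text with every character the named encoding cannot
    // represent replaced by a hexadecimal character reference.
    // Markup escaping (&, <, >) is intentionally not handled here.
    static String escape(String text, String encodingName) {
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            // Test a surrogate pair as one unit, never half at a time.
            CharSequence unit = text.subSequence(i, i + length);
            if (encoder.canEncode(unit)) {
                out.append(unit);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            i += length;
        }
        return out.toString();
    }
}
```

For example, escape("caf\u00e9", "US-ASCII") yields "caf&#xE9;", while
the same call with "ISO-8859-1" or "UTF-8" returns the text unchanged.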

As to slowness, there are some strong optimizations we can do for the 
most common cases; e.g. ASCII, Latin-1, UTF-8, and all other Unicode 
variants. We'd only need to take the performance hit on non-Latin-1 
characters in non-Unicode environments.
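
One way the fast paths might look is sketched below. This is an
assumption about how the optimization could be structured, not an
existing JDOM class: FastPathChecker and needsReference are hypothetical
names. ASCII and Latin-1 are decided with a single range comparison,
Unicode encodings never need a reference, and only other encodings fall
back to the slower CharsetEncoder.canEncode call. The fallback here is
conservative and asks the encoder for every character; it also checks
single chars only, so surrogate pairs would need extra handling.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration only; not JDOM code.
final class FastPathChecker {

    // Chars below this value are known to be encodable without
    // consulting the encoder. Zero means "always ask the encoder".
    private final int directLimit;

    // Null when a range comparison alone decides the answer.
    private final CharsetEncoder encoder;

    FastPathChecker(String encodingName) {
        Charset charset = Charset.forName(encodingName);
        String name = charset.name().toUpperCase(); // canonical name
        if (name.startsWith("UTF-")) {
            directLimit = 0x10000;   // every char value is encodable
            encoder = null;
        } else if (name.equals("US-ASCII")) {
            directLimit = 0x80;
            encoder = null;
        } else if (name.equals("ISO-8859-1")) {
            directLimit = 0x100;
            encoder = null;
        } else {
            directLimit = 0;         // slow path for everything
            encoder = charset.newEncoder();
        }
    }

    // True if c must be written as a character reference.
    boolean needsReference(char c) {
        if (c < directLimit) {
            return false;            // fast path: one comparison
        }
        if (encoder == null) {
            return true;             // ASCII or Latin-1, out of range
        }
        return !encoder.canEncode(c); // slow path
    }
}
```

A serializer could create one FastPathChecker per output encoding and
call needsReference for each character as it goes out.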
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|                  The XML Bible (IDG Books, 1999)                   |
|              http://metalab.unc.edu/xml/books/bible/               |
|   http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+


