[jdom-interest] Character encodings...

Mark Schmeets mschmeets at hotmail.com
Fri Sep 27 14:46:11 PDT 2002

Hi All,
I know this must be a character encoding problem, but I am at my wits' end 
trying to figure out where I am making the mistake.

I have a Swing application which programmatically converts CSV files 
(produced on an NT client) into a JDOM document and posts that document to a 
servlet. The servlet passes the JDOM to a builder class, which in turn creates 
SQL statements to insert the records into an Oracle database. So far, so good.

Another part of the system contains an applet which posts a request to a 
servlet (that queries the database, creates a JDOM document with the 
resultset) and then displays the data.
This touches a lot of pieces, I know. The problem character is the left 
double quotation mark. In the CSV file it shows up as 0x93, which the 
Windows-1252 code page maps to U+201C.
My applet throws a SAXParseException for an illegal XML character: &#X1c. 
OK, I see that 1C is the low "half" of the Unicode value 201C, but I do 
not understand why I am getting the error.
I have looked at the XML on the input side and see no apparent problems there.
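
To make the byte values above concrete, here is a minimal sketch of the 
expected mapping. It is not taken from the application; the class and method 
names (Cp1252Check, decodeCp1252, utf8Hex) are made up for illustration, and 
it assumes the JVM ships the windows-1252 charset, which standard JDKs do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Cp1252Check {
    // Decode a single byte using the Windows-1252 code page.
    static char decodeCp1252(int b) {
        byte[] raw = { (byte) b };
        return new String(raw, Charset.forName("windows-1252")).charAt(0);
    }

    // Render the UTF-8 encoding of a character as space-separated hex bytes.
    static String utf8Hex(char c) {
        StringBuilder sb = new StringBuilder();
        for (byte x : String.valueOf(c).getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X ", x & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        char c = decodeCp1252(0x93);
        System.out.println(Integer.toHexString(c)); // prints 201c
        System.out.println(utf8Hex(c));             // prints E2 80 9C
    }
}
```

If everything in the chain handles the character correctly, a UTF-8 stream 
should carry the three bytes E2 80 9C for it, never a lone 1C.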

On the output side the data comes from JDBC, and I am specifying UTF-8 
as the encoding for the XMLOutputter. The InputStreamReader that is created 
in the applet is also set to UTF-8. So it seems like the output side 
should be OK, but to me it looks like we are dropping the high byte of the 
Unicode value (the 20) and just passing the 1C.
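
As an illustration of how a high byte can get dropped in general, the sketch 
below pushes a char through a byte-oriented write. This is only an assumed 
mechanism, not something confirmed anywhere in the system described here, and 
TruncationSketch and writeCharsAsBytes are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demonstration class; not part of the application.
class TruncationSketch {
    // Writes each char through OutputStream.write(int), which keeps
    // only the low 8 bits of the value it is given.
    static byte[] writeCharsAsBytes(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            out.write(s.charAt(i)); // U+201C is reduced to the single byte 0x1C
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] b = writeCharsAsBytes("\u201C");
        // prints: 1 byte(s), first = 0x1c
        System.out.println(b.length + " byte(s), first = 0x"
                + Integer.toHexString(b[0] & 0xFF));
    }
}
```

Any step that treats characters as single bytes, whether in the Java code or 
in the database layer, would produce the same 1C seen in the error.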

Any suggestions as to what I am doing wrong?


