[jdom-interest] XMLOutputter and utf-8
    Jason Hunter 
    jhunter at xquery.com
       
    Fri May 20 11:27:03 PDT 2005
    
    
  
Well, just be aware that the renderedDoc string there is a String of 
characters, not a byte stream, so you can't inspect it to diagnose how 
the encoding is going.
The out.output(doc, output) call looks like the proper way to send the 
document as UTF-8 bytes.  I don't recall if there were any issues with 
beta7 about this.  Beta7 was long, long ago.  You may also want to 
specify the encoding in the HTTP headers you're sending (for example, a 
Content-Type of text/xml; charset=UTF-8) so the receiver will know how 
to parse the bytes.
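
To make the character-versus-byte distinction concrete, here is a minimal 
sketch using only the standard library (no JDOM). The class and method 
names are made up for illustration. It shows that the text from the sample 
program is 9 characters but 10 bytes in UTF-8, and that E-acute encodes as 
c3 89 in UTF-8 but as the single byte c9 in ISO-8859-1. If setupHeaders 
uses renderedDoc.length() as a byte count (an assumption, since its body 
isn't shown), the value would be short by one for every such character. 
One way around that would be to render into a ByteArrayOutputStream with 
out.output(doc, buffer) and take the length from buffer.size().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of JDOM.
public class EncodingCheck {

    // Encode the text with the given charset and return the bytes as hex pairs.
    public static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Number of bytes the text occupies once encoded as UTF-8.
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "CLOISONN\u00c9"; // \u00c9 is E-acute

        System.out.println(text.length());     // characters: 9
        System.out.println(utf8Length(text));  // UTF-8 bytes: 10

        System.out.println(hex("\u00c9", StandardCharsets.UTF_8));      // c3 89
        System.out.println(hex("\u00c9", StandardCharsets.ISO_8859_1)); // c9
    }
}
```

Inspecting the bytes this way sidesteps System.out and the platform 
default encoding entirely, so what you see is what would go on the wire.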
-jh-
Chris Curvey wrote:
> Thanks to Jason & Paul for their responses.  I tried Jason's suggestion 
> for my example, and it works great.  (And I realize that this question 
> is increasingly off-topic, please forgive me.)
> 
> In my real-world problem, I'm not writing to System.out, I'm writing to 
> an output stream returned from an HttpsURLConnection.  So I tried this:
> 
>     Document doc = getXML();
>     XMLOutputter out = new XMLOutputter();
>     out.setEncoding("UTF-8");
>     String renderedDoc = out.outputString(doc);
> 
>     // Construct the request headers
>     setupHeaders(theConnection, renderedDoc.length());
> 
>     // Send the request
>     OutputStream output = theConnection.getOutputStream();
>     out.output(doc, output);
> 
> I don't have access to the server on the other end of that connection, 
> and the connection is encrypted, so I can't just put in a proxy server 
> to capture the stream to see what's really being sent.
> 
> One more data point, which may or may not be important.  I have to use 
> the Beta-7 version of JDOM, because it's distributed as part of my app 
> server, and putting jdom 1.0 earlier in the classpath causes the app 
> server to choke. 
> 
> Many, many thanks for any help.
> 
> -Chris
> 
> On 5/20/05, Jason Hunter <jhunter at xquery.com> wrote:
> 
>     You're not actually outputting the file to a byte stream.  You're
>     outputting it to a String, then printing the string using
>     System.out.println().  System.out is a PrintStream and per the
>     PrintStream Javadocs, "All characters printed by a PrintStream are
>     converted into bytes using the platform's default character encoding."
> 
>     Try this: out.output(doc, System.out);
> 
>     That way JDOM gets to control the bytes being output.
> 
>     -jh-
> 
>     Chris Curvey wrote:
> 
>      > Hi all,
>      >
>      > I'm having a little trouble figuring out utf-8 encoding with JDom.  The
>      > output from this sample program is returning a single hex value, \xc9
>      > for an E-acute, but according to this page
>      > http://www.fileformat.info/info/unicode/char/00c9/index.htm, the UTF-8
>      > encoding for E-acute should be a hex pair \xc3 and \x89.  (\xc9 appears
>      > to be the right value for UTF-16.)
>      >
>      > Any idea what I'm doing wrong?  Or am I just misinterpreting something?
>      >
>      > import org.jdom.Document;
>      > import org.jdom.Element;
>      > import org.jdom.output.XMLOutputter;
>      > import org.jdom.output.Format;
>      >
>      > class JdomTest
>      > {
>      >     public static void main (String[] argv)
>      >     {
>      >         Document doc = new Document();
>      >         Element element = new Element("foobar");
>      >         element.setText("CLOISONNÉ");
>      >         doc.addContent(element);
>      >
>      >         Format format = Format.getPrettyFormat();
>      >         format.setEncoding("UTF-8");
>      >         XMLOutputter out = new XMLOutputter(format);
>      >         System.out.println(out.outputString(doc));
>      >     }
>      > }
>      >
>      >
>      >
>     ------------------------------------------------------------------------
> 
>      >
>      > _______________________________________________
>      > To control your jdom-interest membership:
>      >
>     http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com
    
    