[jdom-interest] encoding problems when using get text

Wed Jun 19 10:29:59 PDT 2002

From: "Steven Shand" <steven at intrallect.com>

> Thinking some more about it... Is this because Java itself is UTF-8, and
> so what I am seeing is the Java interpretation of &#176;?

&#176; is an XML numeric character reference, not a UTF-8 encoding. The
parser that JDOM uses as its backend automatically translates that reference
into the corresponding Java Unicode character. I'm not sure there is
anything you can do to prevent that.
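
To illustrate, here is a minimal sketch that uses the JDK's built-in JAXP
parser rather than JDOM itself (an assumption made to keep the example
self-contained; JDOM's builder sits on top of a parser of this kind). It
shows the reference already resolved by the time you read the text. The
class name CharRefDemo and the helper escapeNonAscii are hypothetical, not
part of JDOM, and re-escaping by hand is only one possible workaround.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CharRefDemo {

    // Parse an XML string with the JDK's built-in parser and return the
    // text content of the root element. Character references such as
    // &#176; are resolved by the parser before we ever see the text.
    static String parseText(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Hypothetical helper: rewrite every non-ASCII code point as a decimal
    // character reference. It does NOT escape &, < or >, so it is not a
    // complete XML escaper.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String text = parseText("<temp>20&#176;C</temp>");
        // The parsed text holds the real degree character (U+00B0).
        System.out.println(text.equals("20\u00B0C"));
        // Re-escaping by hand gives back a numeric reference.
        System.out.println(escapeNonAscii(text));
    }
}
```

Note that this cannot tell whether the source file had &#176; or a literal
degree character, since both parse to the same string; the helper simply
escapes everything outside ASCII.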

> Steven Shand wrote:
>
> > I've had a hunt round the javadocs as well as the FAQ/Archives but I
> > feel like I'm going round in circles!!
> >
> > I create a document from a file (or a string), and the content of an
> > element contains text with something like &#176;.
> > My call to get text on this element returns the unencoded value (in
> > this case a degree symbol). How can I maintain the original escaped
> > value?
> >
> > I think I understand the issues with XMLOutputter and setting the
> > encoding type, but I don't see how that applies here.
> >
> > If someone can help me out or point me in the right direction I'd be
> > mighty appreciative.
> >
> > Thanks.
> >
> > Steven Shand.