[jdom-interest] Question about CDATA

Ken Rune Helland kenh at csc.no
Tue Jul 2 01:15:13 PDT 2002


> Greetings,
> I have a question about how to extract the contents of CDATA that has
> &  and " on it.
> 
> Here is a sample tag:
> <foo>
> <![CDATA[&quot;One &amp; two&quot;]]>
> </foo>
> 
> When I apply the getText() method the return is &quot;One &amp;
> two&quot;  when I'm expecting 'One & two'.
> 

The question that arises is: why is this inside a CDATA section?

The whole point of a CDATA section is to store data where you do
NOT want the text to be parsed as XML.

So if you want it to come up as 'One & two' you either get rid of the
CDATA:

<foo>
&quot;One &amp; two&quot;
</foo>

or store it literally inside the CDATA:

<foo>
<![CDATA["One & two"]]>
</foo>
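
To see the difference for yourself, here is a minimal sketch. It uses the DOM parser that ships with the JDK rather than JDOM, only so that it runs without extra jars; the class and method names (CdataDemo, rootText) are made up for illustration. JDOM's getText() should give you the same strings for these inputs, but I have not run this against JDOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataDemo {
    // Parse an XML string and return the text content of its root element.
    // (Hypothetical helper, not part of JDOM or the JDK.)
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Entity references inside CDATA are NOT expanded.
        System.out.println(rootText("<foo><![CDATA[&quot;One &amp; two&quot;]]></foo>"));
        // Without CDATA the parser expands them.
        System.out.println(rootText("<foo>&quot;One &amp; two&quot;</foo>"));
        // Literal characters inside CDATA come back unchanged.
        System.out.println(rootText("<foo><![CDATA[\"One & two\"]]></foo>"));
    }
}
```

The first call returns the text with &quot; and &amp; still in it; the other two return the same string, with a real double quote and ampersand.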

If you are not in control of how the XML is generated, I guess
you have to decode the entity references in the CDATA text yourself,
unless someone else comes up with a smarter idea.
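
Decoding it yourself could look something like the sketch below. It assumes the text only contains the five predefined XML entities (amp, lt, gt, quot, apos); numeric character references such as &#38; are not handled. EntityDecoder, decode and lookup are names I made up for this example, not part of JDOM.

```java
public class EntityDecoder {
    // Replace the five predefined XML entity references in one pass.
    // Anything that is not a known reference is copied through unchanged.
    static String decode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '&') {
                int semi = s.indexOf(';', i + 1);
                if (semi > 0) {
                    String rep = lookup(s.substring(i + 1, semi));
                    if (rep != null) {
                        out.append(rep);
                        i = semi + 1; // skip past the whole reference
                        continue;
                    }
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // Map an entity name to its character, or null if the name is unknown.
    static String lookup(String name) {
        switch (name) {
            case "amp":  return "&";
            case "lt":   return "<";
            case "gt":   return ">";
            case "quot": return "\"";
            case "apos": return "'";
            default:     return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("&quot;One &amp; two&quot;"));
    }
}
```

You would call it on whatever getText() returns, e.g. decode(element.getText()). Because the replacement is done in a single pass, text like &amp;lt; comes out as &lt; and is not decoded a second time.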


Best regards
KenR




