[jdom-interest] Simple xhtml/entity resolver?

Rolf Lear jdom at tuis.net
Thu Mar 29 06:46:51 PDT 2012


Hi Oliver.

If you already have the XHTML content as JDOM Elements, then you should be
able to (just) do:

XMLOutputter xout = new XMLOutputter();
String fragment = xout.outputString(element);

If you want to change the format of the output (indenting, etc.), you can
add a 'Format' to the XMLOutputter with:

XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
String fragment = xout.outputString(element);


I think you may be chasing a red herring with the Entity References.

The EntityRef code is a 'CYA' implementation, but, in reality, the
SystemID and PublicID are never going to be needed in regular usage.

The only case I know of where you get entity references is when you tell
your input parser not to expand them during parsing; in JDOM you will then
end up with an EntityRef instead of its 'underlying' text.
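
On the numeric entities: character references such as &#233; are resolved
by the XML parser itself, so they arrive as ordinary characters in the text
and you read the code point straight off the string. Here is a minimal
sketch of that, assuming the JDK's built-in DOM parser rather than JDOM
(the class name EntityDemo and the helper textOf are made up for
illustration); the text you get from a JDOM Element should behave the same
way with a default builder:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {

    // Hypothetical helper: parse an XML fragment and return the text
    // content of its root element, with all markup dropped.
    static String textOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Numeric references (&#233;, &#xE8;) and predefined entities
        // (&amp;, &lt;) are already expanded in the returned text.
        String text = textOf("<p>caf&#233; &amp; cr&#xE8;me &lt;3</p>");
        System.out.println(text);
        // The character at index 3 came from &#233;.
        System.out.println(text.codePointAt(3)); // prints 233
    }
}
```

So for plain text with only the predefined entities and numeric references,
there should be no EntityRef objects to deal with at all.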

Rolf


On Thu, 29 Mar 2012 09:23:36 -0400, Oliver Ruebenacker <curoli at gmail.com>
wrote:
> Hello,
> 
>   I need a simple way to convert some XHTML fragments, provided as a
> JDOM Element, into plain text. I am willing to ignore most HTML tags
> and consider only the most commonly used predefined entities.
> 
>   In JDOM, an entity reference has a name, a public id and a system
> id. I think I know what the name means, for named entities. But what
> about numeric entities, how do I get the code point? And what are
> public id and system id?
> 
>   Thanks!
> 
>      Take care
>      Oliver

