[jdom-interest] Outputting Entity reference for non US-ASCII characters

Benjamin Kopic benjamin.kopic at panContext.com
Thu Oct 16 01:37:39 PDT 2003


Hi

I need to write an entity-handling routine that converts every
non-US-ASCII character to its SGML entity reference. This subject was
discussed on the list some time ago, but I am not sure what came of it.
All of the documents I need to produce have to comply with the
following restriction:
http://www.ncbi.nlm.nih.gov/entrez/query/static/entities.html

Which would be the better approach:

a) create an EntityRef for each of these characters and let JDOM's
XMLOutputter do the conversion (I assume it does), or

b) write my own String conversion utility that replaces every character
outside the 7-bit US-ASCII range (code points above 127) with its
entity reference?
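
A minimal sketch of what option b) might look like, using only the Java
standard library. The class name AsciiEntityEncoder and the three-entry
name table are hypothetical placeholders; a real table would be filled in
from the entity list at the URL above. Characters without a known name
fall back to a hexadecimal numeric character reference, which is an
assumption about what the target format accepts. The sketch does not
escape &, < or >.

```java
import java.util.HashMap;
import java.util.Map;

final class AsciiEntityEncoder {
    // Hypothetical, deliberately tiny table: code point -> entity name.
    // A real table would be built from the required entity list.
    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(0xE9, "eacute"); // e with acute accent
        NAMES.put(0xFC, "uuml");   // u with diaeresis
        NAMES.put(0xDF, "szlig");  // sharp s
    }

    // Replaces every code point above 127 with an entity reference.
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);   // handles surrogate pairs
            i += Character.charCount(cp);
            if (cp < 0x80) {
                out.appendCodePoint(cp);    // plain US-ASCII, copy as-is
            } else if (NAMES.containsKey(cp)) {
                out.append('&').append(NAMES.get(cp)).append(';');
            } else {
                // Assumed fallback: hexadecimal numeric character reference.
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
        }
        return out.toString();
    }
}
```

For example, AsciiEntityEncoder.encode("caf\u00e9") yields "caf&eacute;".
One caveat: if the converted string is then stored as JDOM text content,
the outputter will escape the leading & as &amp;, so a conversion like
this would have to be applied to the serialized output rather than to
the text placed in the tree.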

Actually, what I would really like to know is whether JDOM would
convert a Unicode String to an XML String that is valid for a
particular encoding (e.g. US-ASCII) simply by registering an EntityRef
for each of the characters outside the range of that encoding.
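
The encoding-driven behaviour asked about here can be sketched
independently of JDOM: ask the target charset which characters it can
represent and replace only the rest. This is not JDOM's API, and it makes
no claim about what XMLOutputter does internally; the class and method
names are hypothetical, and it emits decimal numeric character references
rather than named entities.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class EncodingAwareEscaper {
    // Returns text in which every character the named encoding cannot
    // represent is replaced by a decimal numeric character reference.
    static String escapeFor(String text, String encodingName) {
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            CharSequence piece = text.subSequence(i, i + len);
            i += len;
            if (encoder.canEncode(piece)) {
                out.append(piece);                         // representable
            } else {
                out.append("&#").append(cp).append(';');   // not representable
            }
        }
        return out.toString();
    }
}
```

With "US-ASCII" as the encoding name, "caf\u00e9" becomes "caf&#233;";
with "ISO-8859-1" the same string passes through unchanged, because that
encoding can represent the accented character directly.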

Best regards

Benjamin
-- 
benjamin kopic
m: +44 (0)780 154 7643
t: +44 (0)20 7794 3090
e: benjamin.kopic at panContext.com
w: http://www.panContext.com/