[jdom-interest] Fwd: Re: Kana symbols and UTF-8? (was Re: Kana characters?)

Angela Amoateng angela.amoateng at kcl.ac.uk
Wed May 23 06:37:24 PDT 2007

Hi Michael,

Thanks for that warning! I think I will stick to just using the
hexadecimal values then, just in case I do get the dreaded boxes for
some symbols!

I have been told I could use either, but to be on the safe side, I will 
use the hexadecimal values. =)

Thanks Again

Quoting Michael Kay <mike at saxonica.com>:

>> To clear confusion, the symbols used in the <hiraganaSym>
>> tags are actual fonts of the UTF-8 hexadecimal value.
> I'm sorry, but that kind of language causes far more confusion than it
> clears. You're using words like symbol, font, and tag quite inaccurately.
>
> Your XML document is a sequence of bytes or octets. The encoding of the
> document determines the mapping of these octets to Unicode characters, so if
> the encoding is UTF-8 then a sequence of three particular octets might
> represent the character whose Unicode name is "HIRAGANA LETTER HA", which is
> assigned to the codepoint hexadecimal x306F (=decimal 12399). A font is a
> mapping from characters to glyphs (visible representations of characters on
> screen or paper). So to get from a sequence of octets in your file to
> something you see on the screen, you first use the encoding to translate the
> octets to characters, and you then use a font to translate the characters to
> glyphs.
>
> In XML, you can always represent a character using a character reference,
> for example HIRAGANA LETTER HA can be represented as &#x306F; or as
> &#12399;. This is useful if you don't have a keyboard that lets you enter
> the character directly, and it also has the advantage that it protects you
> from errors in applying the encoding. But it doesn't help you with font
> difficulties: if you use a font that has no glyph for a given character,
> then it will usually be displayed in some kind of fallback representation,
> for example a hollow rectangle.
>
> Michael Kay
> http://www.saxonica.com/
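
To make the octets-to-characters step quoted above concrete, here is a minimal
sketch in plain Java (JDK only, no JDOM calls). The class name
KanaEncodingSketch and its helper methods are hypothetical names chosen for
illustration. It assumes the three octets E3 81 AF, which are the UTF-8
encoding of U+306F, and shows the two character-reference forms mentioned in
the quote. It says nothing about fonts: whether a glyph is displayed still
depends on the font in use.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of JDOM or any library.
final class KanaEncodingSketch {

    // Step 1 from the quoted text: the encoding maps octets to characters.
    static String decodeUtf8(byte[] octets) {
        return new String(octets, StandardCharsets.UTF_8);
    }

    // XML hexadecimal character reference for a code point, e.g. &#x306F;
    static String toHexCharRef(int codePoint) {
        return String.format("&#x%X;", codePoint);
    }

    // XML decimal character reference for a code point, e.g. &#12399;
    static String toDecimalCharRef(int codePoint) {
        return "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        // Assumed input: the UTF-8 octets for U+306F (HIRAGANA LETTER HA).
        byte[] octets = {(byte) 0xE3, (byte) 0x81, (byte) 0xAF};

        String text = decodeUtf8(octets);
        int codePoint = text.codePointAt(0);

        // Unicode name of the decoded character, as known to the JDK.
        System.out.println(Character.getName(codePoint));
        // The two equivalent character references for that code point.
        System.out.println(toHexCharRef(codePoint));
        System.out.println(toDecimalCharRef(codePoint));
    }
}
```

The character references contain only ASCII characters, which is why they
survive a wrongly applied encoding, whereas the raw three-octet form does not.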

Angela Amoateng
angela.amoateng at kcl.ac.uk
