"unmatched halves of surrogate pairs"... That would be assuming UTF-16 specifically,
would it not? ISO-8859-1, for example, does not have surrogate pairs.


>At 11:08 PM -0800 11/1/02, Malachi de AElfweald wrote:
>>It would be against XML spec to check the characters within the 
>>CDATA, since the spec
>>says that CDATA is "unparsed character data". Seems like parsing it 
>>wouldn't fit the description, eh?
>No, that's not quite true. There are a number of characters which 
>cannot appear in a CDATA section. These include many C0 controls such 
>as null and vertical tab, unmatched halves of surrogate pairs, and a 
>few other undefined code points. The three character sequence ]]> is 
>also illegal.
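
For what it's worth, here is a rough sketch of the check described in the quoted message. It assumes the Char production from XML 1.0 (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]) and works on decoded code points, so it does not depend on how the document was encoded on disk. The function names are made up for illustration, not taken from any parser.

```python
def is_xml_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # excludes the other C0 controls
        or 0xE000 <= cp <= 0xFFFD      # skips surrogates, U+FFFE, U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def cdata_problems(text: str) -> list:
    """Return a list of reasons why text cannot be CDATA section content."""
    problems = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        if not is_xml_char(cp):
            problems.append(f"illegal character U+{cp:04X} at index {i}")
    # The terminator cannot appear inside the section itself.
    if "]]>" in text:
        problems.append("contains the CDATA terminator ]]>")
    return problems


if __name__ == "__main__":
    for sample in ["plain text", "null\x00byte", "lone \ud800 surrogate", "a ]]> b"]:
        print(repr(sample), cdata_problems(sample))
```

Note that a Python str holds code points, so any surrogate found in it is by definition an unpaired one. A decoder reading ISO-8859-1 input can never produce one, but the null and vertical tab checks still apply there.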
