[jdom-interest] encoding="MS950"

Jason Hunter jhunter at xquery.com
Mon Nov 22 09:42:57 PST 2004


Are you using Xerces?  If so, look at the

http://apache.org/xml/features/allow-java-encodings

feature, as listed on the web page

http://xml.apache.org/xerces2-j/features.html
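
As a rough illustration of turning that feature on, here is a minimal sketch.
It assumes the parser returned by JAXP is Xerces or a Xerces derivative (a
different parser will likely reject the feature id), and the class name,
method name and sample document are made up for this example.  It uses plain
JAXP/SAX rather than JDOM so that it runs without extra jars.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, for illustration only.
public class AllowJavaEncodings {

    // Xerces feature id, as listed on the Xerces features page.
    static final String ALLOW_JAVA_ENCODINGS =
            "http://apache.org/xml/features/allow-java-encodings";

    // Small ASCII-only test document that declares the Java encoding name MS950.
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"MS950\"?>\n"
            + "<vxml version=\"1.0\">\n"
            + "  <form id=\"hello\">\n"
            + "    <block>Hello World!</block>\n"
            + "  </form>\n"
            + "</vxml>\n";

    // Returns true if the document parses with the feature set as requested.
    // Returns false on any failure, including a parser that does not
    // recognise the feature (i.e. one that is not Xerces-based).
    static boolean parses(byte[] xml, boolean allowJavaEncodings) {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(ALLOW_JAVA_ENCODINGS, allowJavaEncodings);
            DefaultHandler handler = new DefaultHandler();
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler); // fatalError() rethrows the exception
            // Parse from bytes: with a character stream the encoding
            // declaration would not be used to decode the input.
            reader.parse(new InputSource(new ByteArrayInputStream(xml)));
            return true;
        } catch (Exception e) {
            System.err.println("Parse failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] xml = SAMPLE.getBytes(StandardCharsets.US_ASCII);
        System.out.println("feature off: " + parses(xml, false));
        System.out.println("feature on:  " + parses(xml, true));
    }
}
```

With JDOM the same feature id can be passed to SAXBuilder.setFeature(name,
true) before calling build(); whether that helps still depends on the
underlying parser being Xerces and on the JRE itself supporting MS950.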

-jh-

Stuart wrote:

> All,
> 
> Regarding the encoding problem, I initially thought I might need to install a
> Chinese version of Windows (I still may try this at some point); however, with
> jdk1.4 and jdk1.3 I am able to use the MS950 encoding as follows:
> 
> 			String test = "hello"; //hack
> 			byte[] bytes = test.getBytes("MS950"); //hack
> 
> If I make up some unknown encoding it will fail (as expected):
> 
> 			String test = "hello"; //hack
> 			byte[] bytes = test.getBytes("dhhfg"); //hack
> 
> 	java.io.UnsupportedEncodingException: dhhfg
>         at sun.io.Converters.getConverterClass(Converters.java:125)
>         at sun.io.Converters.newConverter(Converters.java:156)
>         at sun.io.CharToByteConverter.getConverter(CharToByteConverter.java:64)
>         at java.lang.StringCoding.encode(StringCoding.java:368)
>         at java.lang.String.getBytes(String.java:591)
> 
> What is different about JDOM or the SAXBuilder?  The XML (VXML) document I
> am testing with is as follows (note: I just added the encoding attribute
> myself, i.e. the document was not created using any Chinese input and it does
> not need the encoding attribute.  The 'real' XML documents I am parsing are
> much more complicated but the problem is the same):
> 
> <?xml version="1.0" encoding="MS950"?>
> <vxml version="1.0">
> 	<form id="hello">
> 		<block>Hello World!</block>
> 	</form>
> </vxml>
> 
> Any help will be much appreciated.
> 
> Regards,
> 
> Stuart
> 
> -----Original Message-----
> From: Stuart [mailto:stuart at truetel.com]
> Sent: Tuesday, November 23, 2004 12:36 AM
> To: jdom-interest at jdom.org
> Subject: RE: [jdom-interest] encoding="MS950"
> 
> 
> All,
> 
> Sorry for the multiple postings but I think I was wrong about MS950 not
> being supported in jdk1.4.  I also discovered the following entry in the
> jdk1.4 information:
> 
> x-windows-950 MS950 Windows Traditional Chinese
> 
> Not sure what I am doing wrong.  *8-(
> 
> Regards,
> 
> Stuart
> 
> 
> -----Original Message-----
> From: Stuart [mailto:stuart at truetel.com]
> Sent: Monday, November 22, 2004 10:51 PM
> To: Elliotte Harold
> Cc: jdom-interest at jdom.org
> Subject: RE: [jdom-interest] encoding="MS950"
> 
> 
> All,
> 
> I originally posted a question about the SAXBuilder supporting the encoding
> format MS950.  I received a reply stating that the encoding format support
> is determined by the JDK (not the parser).  I also found that MS950 no
> longer appears supported under jdk1.4 BUT in jdk1.3 it seems to be supported
> (http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html).  I
> downloaded the international jre for jdk1.3.1_13 but I still get the encoding
> not supported error:
> 
> STUART$java -version
> java version "1.3.1_13"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_13-b03)
> Java HotSpot(TM) Client VM (build 1.3.1_13-b03, mixed mode)
> 
> Here is the error I am getting:
> 
> org.jdom.input.JDOMParseException: Error on line 0: The encoding "MS950" is not supported.
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:468)
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:810)
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:789)
> 	  ...
> 
> Do I need to do something in order to 'enable' the international support?  I
> opened the i18n.jar and inside could see a class called
> CharToByteMS950.class.
> 
> Also is there a way of disabling the encoding check (basically just ignore
> this field and parse the rest of the document)?
> 
> Regards,
> 
> Stuart
> 
> 