[jdom-interest] Parsing a MODS-document with validation fails

Thomas Scheffler thomas.scheffler at uni-jena.de
Wed Jul 20 06:23:49 PDT 2011


Hi,

when I parse a valid MODS document with XML Schema validation enabled, 
JDOM changes attribute values because it handles the schema's default 
attribute values incorrectly: it ignores their namespace.

Here is a short piece of code that demonstrates this:

import java.net.URL;
import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

// validating builder with namespace support and XML Schema validation
SAXBuilder builder = new SAXBuilder(true);
builder.setFeature("http://xml.org/sax/features/namespaces", true);
builder.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
builder.setFeature("http://apache.org/xml/features/validation/schema", true);

Document document = builder.build(new URL("http://academiccommons.columbia.edu/download/fedora_content/show_pretty/ac:111060/CONTENT/ac111060_description.xml"));
XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
xout.output(document, System.out);

Here is a fragment of the result:

<name type="simple">
<namePart type="family">Edwards</namePart>
<namePart type="given">Stephen A.</namePart>
<role>
<roleTerm type="text">author</roleTerm>
</role>
<affiliation>Columbia University. Computer Science</affiliation>
</name>

If you look at the original document, you can see that the @type of name 
is "personal". The "simple" comes from the XLink XML Schema that is 
imported by the MODS schema. Therefore the result fragment should look 
like this:

<name type="personal" xlink:type="simple">
<namePart type="family">Edwards</namePart>
<namePart type="given">Stephen A.</namePart>
<role>
<roleTerm type="text">author</roleTerm>
</role>
<affiliation>Columbia University. Computer Science</affiliation>
</name>
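
The overwrite can be pictured with a small model. This is only a sketch of the suspected mechanism, not JDOM's actual code: the class name, the method names and the REPORTED table are made up for illustration. It assumes the parser reports two attributes that share the local name "type" and differ only in their namespace URI.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: what happens when attributes are keyed with or without their namespace.
public class AttributeKeyModel {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Assumed parser report for <name>: {namespace URI, local name, value}.
    // The second entry stands for the default value supplied by the schema.
    static final String[][] REPORTED = {
        {"", "type", "personal"},
        {XLINK_NS, "type", "simple"}
    };

    // Keyed by local name only: the defaulted attribute replaces the explicit one.
    static Map<String, String> keyedByLocalName() {
        Map<String, String> attrs = new LinkedHashMap<String, String>();
        for (String[] a : REPORTED) {
            attrs.put(a[1], a[2]);
        }
        return attrs;
    }

    // Keyed by (namespace URI, local name): both attributes survive.
    static Map<Map.Entry<String, String>, String> keyedByQualifiedName() {
        Map<Map.Entry<String, String>, String> attrs =
            new LinkedHashMap<Map.Entry<String, String>, String>();
        for (String[] a : REPORTED) {
            attrs.put(new SimpleImmutableEntry<String, String>(a[0], a[1]), a[2]);
        }
        return attrs;
    }

    public static void main(String[] args) {
        System.out.println("local name only:      " + keyedByLocalName());
        System.out.println("namespace + local name: " + keyedByQualifiedName());
    }
}
```

With the namespace dropped, only one "type" entry remains and it carries the value "simple", which matches the fragment JDOM printed.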

If I use the DOM implementation that ships with Java, this is handled 
correctly (although the output is a bit ugly, since it does not reuse the 
namespace prefix that is already defined).
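
A minimal sketch of such a DOM check, using only the JDK. It does not use the real MODS or XLink schemas: the two inline schemas, the urn:example:mods namespace and the class name are stand-ins invented for this example, and it assumes the JDK's built-in parser adds schema defaults to the tree when a Schema is set on the factory.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomSchemaDefaults {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Stand-in for the XLink schema: one global attribute with a default value.
    static final String XLINK_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='" + XLINK_NS + "'>"
        + "<xs:attribute name='type' type='xs:string' default='simple'/>"
        + "</xs:schema>";

    // Stand-in for the MODS schema: <name> has its own @type plus a reference to xlink:type.
    static final String MAIN_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:xlink='" + XLINK_NS + "'"
        + " targetNamespace='urn:example:mods' elementFormDefault='qualified'>"
        + "<xs:import namespace='" + XLINK_NS + "'/>"
        + "<xs:element name='name'><xs:complexType><xs:simpleContent>"
        + "<xs:extension base='xs:string'>"
        + "<xs:attribute name='type' type='xs:string'/>"
        + "<xs:attribute ref='xlink:type'/>"
        + "</xs:extension></xs:simpleContent></xs:complexType></xs:element>"
        + "</xs:schema>";

    static final String XML =
        "<name xmlns='urn:example:mods' type='personal'>Edwards</name>";

    // Parses XML with schema validation and returns the root element.
    static Element parseRoot() {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // The imported schema is listed first so the import can be resolved.
            Schema schema = sf.newSchema(new Source[] {
                new StreamSource(new StringReader(XLINK_XSD)),
                new StreamSource(new StringReader(MAIN_XSD))
            });
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setSchema(schema);
            return dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(XML)))
                      .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        Element name = parseRoot();
        // Look both attributes up by namespace URI and local name.
        System.out.println("type       = " + name.getAttributeNS(null, "type"));
        System.out.println("xlink:type = " + name.getAttributeNS(XLINK_NS, "type"));
    }
}
```

Looking the attributes up by namespace URI and local name avoids depending on whatever prefix the parser picks for the defaulted attribute.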

Could someone just fix this, please?

Regards,

Thomas Scheffler

