[jdom-interest] Parsing a MODS-document with validation fails

Thomas Scheffler thomas.scheffler at uni-jena.de
Thu Aug 4 23:25:33 PDT 2011


On 04.08.2011 22:45, Jason Hunter wrote:
>> Hi,
>>
>> is there any news about integrating my patch? Locally it is
>> running fine. If you find any issues I am willing to work on them.
>
> Brad wanted to check if it was an issue with JDOM or the SAX parser
> underneath.  He hasn't done it yet, nor have I.  Perhaps you want to
> test that out, if you beat us to it.
>
> It'd also be good if there was a way to solve this issue without
> inventing namespaces.

Inventing a prefix is also what DOM does. For the sample MODS document
from the first post, my patch resolves the namespace to the "correct"
prefix "xlink", while DOM generates "ns0:type". The "invent" mechanism
is only there as a fallback, because inventing a prefix is better than
falling back to the default namespace.
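
The fallback order described above could look roughly like the sketch
below. This is not the patch itself: the class PrefixFallback, the
method choosePrefix and the "nsN" naming scheme are made up for
illustration. It reuses a declared prefix when one is in scope and
invents one only otherwise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of JDOM and not the patch discussed here. */
final class PrefixFallback {

    /**
     * Picks a prefix for namespaceUri.
     *
     * @param inScope prefix -> namespace URI declarations currently in scope
     * @return a declared non-empty prefix if one maps to namespaceUri,
     *         otherwise an invented "nsN" prefix that is not yet taken
     */
    static String choosePrefix(String namespaceUri, Map<String, String> inScope) {
        for (Map.Entry<String, String> e : inScope.entrySet()) {
            // The empty prefix is the default namespace; it never applies
            // to attributes, so it is skipped here.
            if (!e.getKey().isEmpty() && e.getValue().equals(namespaceUri)) {
                return e.getKey();
            }
        }
        // Fallback: invent the first unused "nsN" prefix.
        int n = 0;
        while (inScope.containsKey("ns" + n)) {
            n++;
        }
        return "ns" + n;
    }

    public static void main(String[] args) {
        String xlink = "http://www.w3.org/1999/xlink";
        Map<String, String> inScope = new LinkedHashMap<>();
        inScope.put("", "http://www.loc.gov/mods/v3");

        // No declaration for xlink in scope: a prefix is invented.
        System.out.println(choosePrefix(xlink, inScope));

        // Once xlink is declared, the declared prefix is reused.
        inScope.put("xlink", xlink);
        System.out.println(choosePrefix(xlink, inScope));
    }
}
```

With only the default namespace in scope the sketch invents a prefix;
once "xlink" is declared, it returns that prefix instead.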

Here is the code for DOM:

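import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

// Namespace-aware DOM parser with XML Schema validation switched on
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true);
builderFactory.setValidating(true);
builderFactory.setFeature("http://apache.org/xml/features/validation/schema", true);
DocumentBuilder builder2 = builderFactory.newDocumentBuilder();
// er is an org.xml.sax.EntityResolver (its setup is not shown here)
builder2.setEntityResolver(er);
org.w3c.dom.Document document2 = builder2.parse(
    "http://academiccommons.columbia.edu/download/fedora_content/show_pretty/ac:111060/CONTENT/ac111060_description.xml");

// Serialize the parsed DOM to System.out with indentation
DOMSource domSource = new DOMSource(document2);
StreamResult result = new StreamResult(System.out);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(domSource, result);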

and it will generate this output:

    <name xmlns:ns0="http://www.w3.org/1999/xlink" ns0:type="simple" type="personal">
       <namePart type="family">Edwards</namePart>
       <namePart type="given">Stephen A.</namePart>
       <role>
          <roleTerm type="text">author</roleTerm>
       </role>
       <affiliation>Columbia University. Computer Science</affiliation>
    </name>

Using just the QName ought to be enough (that is why it is called
"qualified"), but this shows that it is not, at least with the current
version of the Xerces that ships with the JRE. So call my patch either
a fix or a workaround.
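
To see what the SAX layer itself reports, independently of JDOM, each
attribute's namespace URI, local name and QName can be logged from a
plain handler. The sketch below is only a diagnostic aid under stated
assumptions: the class SaxAttributeDump and the inline sample element
are made up for illustration, and the sample is parsed without schema
validation, so it does not reproduce the failing case by itself. For
that, validation would have to be enabled and the MODS document from
the first post parsed instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical diagnostic helper; not part of JDOM. */
final class SaxAttributeDump {

    /** Parses xml and returns one line per attribute the SAX parser reports. */
    static List<String> dump(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    // Record {namespace URI}localName and the QName as reported.
                    seen.add("{" + atts.getURI(i) + "}" + atts.getLocalName(i)
                            + " qName=" + atts.getQName(i));
                }
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample element; the xlink prefix is declared explicitly here.
        String xml = "<name xmlns=\"http://www.loc.gov/mods/v3\""
                + " xmlns:xlink=\"http://www.w3.org/1999/xlink\""
                + " xlink:type=\"simple\" type=\"personal\"/>";
        for (String line : dump(xml)) {
            System.out.println(line);
        }
    }
}
```

If a QName arrives here without its prefix for an attribute that has a
namespace URI, the problem lies below JDOM, in the parser.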

regards,

Thomas

