[jdom-interest] Namespace inheritance after cloning

Mon Nov 29 05:01:16 PST 2004

Alistair Young wrote:

> Problem is, there are two "interpretations" of XML - the human and the 
> machine. The JDOM namespace behaviour is appropriate for the machine 
> interpretation but I thought JDOM was to allow easier access to XML for 
> humans. 

I don't think the conflict is between the human and the machine, as you 
make it out to be, though I did have to think for a few minutes to 
realize where the conflict really lies.

The real conflict is between Java and XML. A Java program is not an XML 
document. XML namespace bindings do not apply in a Java program, and 
Java/JDOM namespace bindings do not apply in an XML document. A Java 
program does not contain XML. It contains Strings and other data which 
JDOM converts into XML. Similarly, JDOM can go the other way and convert 
XML data into Java/JDOM objects, but these are not the same thing.

Both Java and XML are intended for both humans and machines. When humans 
write or read XML, they need to follow XML rules. When humans write or 
read Java code, they need to follow Java rules. One of the key 
differences between XML and Java/JDOM is that Java/JDOM is not 
hierarchical. It has no notion of parent at the source code level. It 
creates this in object structures in memory, but only as a result of 
running code, not merely writing code. For instance, consider these 
three lines of code:

Element a = new Element("a");
Element div = new Element("div", "http://www.w3.org/1999/xhtml");
Element svg = new Element("svg", "http://www.w3.org/2000/svg");

Each of those elements has a namespace, but that namespace is unrelated 
to any other elements that may exist elsewhere in the Java source code. 
The Java code does not provide any way to say that the element a should 
inherit its namespace from element div instead of element svg. The three 
statements are separate and independent. The objects created by these 
statements are independent objects that may or may not be connected.

Now this is decidedly not true in XML. In XML I can write this:

<div xmlns="http://www.w3.org/1999/xhtml">
   <svg xmlns="http://www.w3.org/2000/svg">
     <a href="http://www.example.com">Hello</a>
   </svg>
</div>

Now it's unambiguous which namespace the a element has. But this is not 
Java syntax.

It may also be helpful to note that every namespace-aware tool (XSLT, 
XPath, XMLSpy, XQuery, DOM, etc.) will maintain the namespace of the a 
element when moving it around in the same or a different document. Any 
of these tools could be used to produce the following document:

<div xmlns="http://www.w3.org/1999/xhtml">
   <a xmlns="http://www.w3.org/2000/svg"
      href="http://www.example.com">Hello</a>
   <svg xmlns="http://www.w3.org/2000/svg">
   </svg>
</div>

The only tool that would change the namespace when moving an element is 
a plain vanilla text editor that knows nothing about XML or namespaces, 
one that operates only on strings, not on XML structures.
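
To make that concrete, here is a minimal sketch using the DOM API that 
ships with the JDK (deliberately not JDOM, so it runs with no extra 
jars). It parses the first document above, moves the a element out of 
svg so that it becomes a child of div, and reports the element's 
namespace before and after the move. The class name NamespaceMoveDemo 
and the method moveAndReport are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names here are illustrative only.
class NamespaceMoveDemo {
    static final String XHTML = "http://www.w3.org/1999/xhtml";
    static final String SVG = "http://www.w3.org/2000/svg";

    // Same structure as the first document above, without whitespace.
    static final String XML =
          "<div xmlns=\"" + XHTML + "\">"
        + "<svg xmlns=\"" + SVG + "\">"
        + "<a href=\"http://www.example.com\">Hello</a>"
        + "</svg>"
        + "</div>";

    // Returns { namespace of a before the move,
    //           namespace of a after the move,
    //           namespace of a's new parent (div) }.
    static String[] moveAndReport() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));

            Element div = doc.getDocumentElement();
            Element a = (Element) doc.getElementsByTagNameNS(SVG, "a").item(0);
            String before = a.getNamespaceURI();

            // insertBefore detaches a from svg and makes it the first child of div.
            div.insertBefore(a, div.getFirstChild());
            String after = a.getNamespaceURI();

            Element newParent = (Element) a.getParentNode();
            return new String[] { before, after, newParent.getNamespaceURI() };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or move element", e);
        }
    }

    public static void main(String[] args) {
        String[] result = moveAndReport();
        System.out.println("a before move:  " + result[0]);
        System.out.println("a after move:   " + result[1]);
        System.out.println("new parent div: " + result[2]);
    }
}
```

The a element reports the SVG namespace both before and after the move, 
even though its new parent is in the XHTML namespace. The namespace is 
a property of the element itself, not something it picks up from 
wherever it happens to be attached.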

If you really want to process your XML with copy and paste, then do 
that. Don't use a parser. Read in the entire document as a string, and 
then use regular expressions to break it apart. That's much harder than 
using JDOM, but it would give you what you're asking for (though not 
what you actually need). However, if you're going to use an XML API, then 
let it be an XML API, and work with XML as XML, rather than treating it 
as nothing more than plain text with a lot of angle brackets instead of 
tabs. Otherwise you might as well not use a parser at all.

-- 
Elliotte Rusty Harold  elharo at metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim