[jdom-interest] The JDOM Model, Chapter 15 of Processing XML with Java

Tue May 7 06:04:08 PDT 2002

At 11:18 AM +0200 5/7/02, Laurent Bihanic wrote:

>>  Now I agree with you that a unique one-size-fits-all Node interface is
>>  definitely not the most elegant solution to the problem.
>>  Yet, given Java's inheritance model, I think we need a single
>>  super-interface all JDOM nodes can be cast to.
>>  I also agree that the intersection of the features available from every
>>  XML node is empty.
>>

I don't agree with that. In particular I think you can ask any node 
for its parent, previous sibling, and next sibling. I also think you 
can ask any node for its name, whether it has children, its list of 
children (which may be null or empty), its type, and its value (as 
defined by XPath). All nodes can also be serialized and cloned, and 
all have toString(), equals(), and hashCode() methods.
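
A minimal sketch of what a single interface with those operations might look like. Every name here (XmlNode, SimpleNode, the type strings) is hypothetical and not part of JDOM, and the toy model only distinguishes element and text nodes. It also uses generics, which postdate this message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface: not part of JDOM.
interface XmlNode {
    XmlNode getParent();           // null for a node with no parent
    XmlNode getPreviousSibling();  // null if there is none
    XmlNode getNextSibling();      // null if there is none
    String getName();              // null for unnamed nodes such as text
    boolean hasChildren();
    List<XmlNode> getChildren();   // empty for leaves
    String getType();
    String getValue();             // string-value in the XPath sense
}

// Toy in-memory implementation, just enough to exercise the interface.
final class SimpleNode implements XmlNode {
    private final String type;
    private final String name;
    private final String text;
    private SimpleNode parent;
    private final List<XmlNode> children = new ArrayList<>();

    SimpleNode(String type, String name, String text) {
        this.type = type;
        this.name = name;
        this.text = text;
    }

    SimpleNode add(SimpleNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    private XmlNode sibling(int offset) {
        if (parent == null) return null;
        List<XmlNode> siblings = parent.children;
        int i = siblings.indexOf(this) + offset;
        return (i >= 0 && i < siblings.size()) ? siblings.get(i) : null;
    }

    public XmlNode getParent() { return parent; }
    public XmlNode getPreviousSibling() { return sibling(-1); }
    public XmlNode getNextSibling() { return sibling(+1); }
    public String getName() { return name; }
    public boolean hasChildren() { return !children.isEmpty(); }
    public List<XmlNode> getChildren() { return Collections.unmodifiableList(children); }
    public String getType() { return type; }

    public String getValue() {
        if (children.isEmpty()) return text == null ? "" : text;
        // Element value: text of all element and text descendants, in order.
        StringBuilder sb = new StringBuilder();
        for (XmlNode child : children) {
            String t = child.getType();
            if (t.equals("element") || t.equals("text")) sb.append(child.getValue());
        }
        return sb.toString();
    }
}
```

With an interface like this, client code can walk siblings and children without casting or testing the concrete class.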

>>  So I came up with the following proposal:
>>   - an org.jdom.Node interface that defines a single method int
>>  getNodeType() (plus methods such as isLeaf() / isBranch() if you do not
>>  like the idea of relying on a bit mask in the node type).
>>   - an org.jdom.LeafNode interface that extends Node and includes the
>>  methods available from all existing JDOM leaf nodes (i.e. detach(),
>>  getDocument() and get/setParent()) and will be implemented by Element,
>>  Attribute, CDATA, Text, Comment, PI and EntityRef.
>>   - an org.jdom.BranchNode interface that extends Node and includes the
>>  methods common to Element and Document (i.e. add/get/setContent) and
>>  will be implemented by Element and Document.

Sounds complex to me. I prefer a single, more powerful Node 
interface. Part of the goal is to remove a lot of the casting and 
testing that's necessary now. This just seems to move it to a 
different level. Still I suppose it could solve the other half of the 
problem, which is the collections API returning Object.
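
For comparison, a rough sketch of the three-interface proposal quoted above. The interface names follow Laurent's message; the concrete classes (TextLeaf, ElementNode, DocumentNode, TreeWalk), the node type numbers, and the exact method signatures are assumptions made for illustration, and getDocument() is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

interface Node { int getNodeType(); }

interface LeafNode extends Node {
    BranchNode getParent();
    void setParent(BranchNode parent);
    LeafNode detach();
}

interface BranchNode extends Node {
    List<LeafNode> getContent();
    BranchNode addContent(LeafNode child);
}

// Shared parent bookkeeping for anything that can sit inside a branch.
abstract class AbstractLeaf implements LeafNode {
    private BranchNode parent;
    public BranchNode getParent() { return parent; }
    public void setParent(BranchNode parent) { this.parent = parent; }
    public LeafNode detach() {
        if (parent != null) parent.getContent().remove(this);
        parent = null;
        return this;
    }
}

final class TextLeaf extends AbstractLeaf {
    final String text;
    TextLeaf(String text) { this.text = text; }
    public int getNodeType() { return 1; }  // illustrative value
}

// An element is both a leaf (it has a parent) and a branch (it has content).
final class ElementNode extends AbstractLeaf implements BranchNode {
    final String name;
    private final List<LeafNode> content = new ArrayList<>();
    ElementNode(String name) { this.name = name; }
    public int getNodeType() { return 2; }  // illustrative value
    public List<LeafNode> getContent() { return content; }
    public ElementNode addContent(LeafNode child) {
        child.setParent(this);
        content.add(child);
        return this;
    }
}

final class DocumentNode implements BranchNode {
    private final List<LeafNode> content = new ArrayList<>();
    public int getNodeType() { return 3; }  // illustrative value
    public List<LeafNode> getContent() { return content; }
    public DocumentNode addContent(LeafNode child) {
        child.setParent(this);
        content.add(child);
        return this;
    }
}

final class TreeWalk {
    // Counts the node itself plus all of its descendants.
    static int countNodes(Node node) {
        int count = 1;
        if (node instanceof BranchNode) {  // the test and cast are still here
            for (LeafNode child : ((BranchNode) node).getContent()) {
                count += countNodes(child);
            }
        }
        return count;
    }
}
```

The content list is typed, which addresses the collections returning Object, but a generic tree walk still needs the instanceof test and cast shown in countNodes.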

>>  The values returned by getNodeType() can be built on the constants
>>  defined by ContentFilter by setting more bits (LEAF, BRANCH). Using a
>>  bit mask also makes getNodeType extensible to support advertising new
>>  interfaces/features in the future (e.g. a TextContentNode interface that
>>  would group in a single interface methods for manipulating the content
>>  of CDATA, Text, Comment and Attribute or a NamedNode to group all the
>>  name and namespace related-methods in Element and Attribute, etc.)
>>
>>  It's also probably a good way to clean up the interface of various JDOM
>>  classes (by removing multiple setter methods: add/removeContent(A),
>>  add/removeContent(B)...) and of XMLOutputter.
>

That's a plus. Long classes are intimidating and hard to document. I 
had more trouble writing the Element section than the rest of the 
sections in Chapter 15 combined.
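
The bit-mask idea for getNodeType() quoted above could be sketched roughly as follows. The constant names and values are invented for illustration; they are not the actual values defined by ContentFilter.

```java
// Hypothetical node type constants: one bit per kind, plus capability bits.
final class NodeTypes {
    static final int ELEMENT  = 1;
    static final int TEXT     = 1 << 1;
    static final int COMMENT  = 1 << 2;
    static final int DOCUMENT = 1 << 3;

    // Capability bits layered on top of the kind bits.
    static final int LEAF   = 1 << 8;
    static final int BRANCH = 1 << 9;

    // Values a getNodeType() implementation might return.
    static final int ELEMENT_NODE  = ELEMENT | LEAF | BRANCH;
    static final int TEXT_NODE     = TEXT | LEAF;
    static final int COMMENT_NODE  = COMMENT | LEAF;
    static final int DOCUMENT_NODE = DOCUMENT | BRANCH;

    static boolean isLeaf(int type)   { return (type & LEAF) != 0; }
    static boolean isBranch(int type) { return (type & BRANCH) != 0; }

    // True if the node type shares at least one bit with the filter mask.
    static boolean matches(int type, int mask) { return (type & mask) != 0; }

    private NodeTypes() {}
}
```

A new capability such as the suggested TextContentNode would claim one more bit without disturbing existing values.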

>
>>  2. What is the value of a node?
>
>Unlike XPath, JDOM does not define what the value of a node is. 
>But is this really a problem? XPath does not define a general rule, 
>it simply defines what the text value is for each type of node.
>We could do the same thing with JDOM (i.e. add a public String 
>getValue()/getText() method to each node or to the Node interface) 
>but we would still have to define what the returned value is on a 
>per-node basis.
>

I've suggested simply adopting the XPath rules here. The biggest 
problem with not having this method is that it's really not possible 
to easily get the string content of an element.

>But I'm not sure this would really be useful except for implementing 
>XPath engines.
>

It's useful anytime someone wants the string content of an element. 
As currently implemented getText() is dangerous. It can lose content 
silently, unexpectedly, and with no warning.
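
To illustrate the difference, here is a small sketch using the standard org.w3c.dom API rather than JDOM, so that it runs on a plain JDK. It assumes the loss in question is text inside child elements being skipped. The class and method names (StringValueDemo, directText, stringValue) are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class StringValueDemo {
    /** Parses a small XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    /** Text held directly by the element; text in child elements is skipped. */
    static String directText(Element element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (isText(child)) sb.append(child.getNodeValue());
        }
        return sb.toString();
    }

    /** XPath string-value: all descendant text nodes, in document order. */
    static String stringValue(Element element) {
        StringBuilder sb = new StringBuilder();
        collect(element, sb);
        return sb.toString();
    }

    private static void collect(Node node, StringBuilder sb) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (isText(child)) {
                sb.append(child.getNodeValue());
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect(child, sb);  // comments and PIs are ignored
            }
        }
    }

    private static boolean isText(Node node) {
        short type = node.getNodeType();
        return type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE;
    }

    private StringValueDemo() {}
}
```

For `<p>Hello <b>brave</b> world</p>`, directText drops "brave" without any error, while stringValue returns the full "Hello brave world".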

>>  5. What's more important? Performance or Correctness?
>
>That's a question no one can answer because it simply depends on the 
>context. Even within one application, I need to ensure correctness 
>when interfacing with other applications while I often want 
>performance while exchanging documents between components inside the 
>application (correctness is then taken care of by unit tests).

My answer is correctness. If it isn't correct, I don't give a damn 
how it performs. I don't believe correctness is taken care of with 
unit tests because:

1. Programmers don't write enough of them.

2. Most developers aren't expert enough in XML to correctly implement 
all the tricky bits. That's why they're using an API that is 
supposedly written, debugged and tested by XML experts.

>So I'd say: Let's make sure JDOM first ensures the correctness of 
>documents with reasonable performance but with validity checks that 
>can be disabled.

More on that in another thread, but the short answer is no. Don't 
allow the well-formedness checks to be disabled. We don't make any 
validity checks (and that's OK).
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|             http://www.cafeconleche.org/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+