[jdom-interest] JDOM and memory

Mon Jan 16 16:20:53 PST 2012

Hi Leigh

I am not sure whether your comments/suggestions relate specifically to 
the memory improvements in JDOM2 (the subject line) or are just general 
improvements. Reading them, they seem to be about general 
performance/convenience rather than memory specifically. That's fine if 
so, I just want to make sure I am not missing something...

Just to summarize your mail very briefly, you are addressing three 
areas: getChild*(), XPath, and Exceptions.

getChild...()
=============

The getChild(...), getChildren(...) and getContent(Filter) methods all 
derive from the same concept: create a FilterList on the underlying 
ContentList, and scan it for all matching content (or, for 
getChild(...), the first match).

JDOM2 has already overridden the 'inefficient' iterator() and 
listIterator() methods to provide more efficient iterators, a 
significant improvement in performance over JDOM 1.x. See 
http://hunterhacker.github.com/jdom/jdom2/performance.html, scroll 
down about half the page to 'First major performance cycle', and compare 
the results table to the one below.

These improvements do *not* override the isEmpty() call though, and that 
should absolutely be overridden too. By default isEmpty() checks 
size() == 0, which requires a full scan of the underlying content, 
whereas iterator().hasNext() in JDOM2 only does a 'lazy' scan.

So, introducing issue #57: override isEmpty() on FilterList. Since 
ContentList has a fast size() method, there is no need to change 
ContentList.isEmpty(). I am trying to think of any other methods that 
would be slow... There is no way to avoid a full scan for 
FilterList.size().
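
To illustrate the isEmpty() point, here is a minimal sketch. It is not 
the JDOM2 FilterList code: FilteredView and its probes counter are 
made-up names, and a plain java.util.List stands in for ContentList. The 
isEmpty() inherited from AbstractList goes through size(), while an 
override built on the lazy iterator can stop at the first match.

```java
import java.util.AbstractList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical filtered view over a backing list; NOT JDOM's FilterList. */
final class FilteredView<T> extends AbstractList<T> {
    private final List<T> backing;
    private final Predicate<? super T> filter;
    int probes = 0; // counts filter evaluations, for illustration only

    FilteredView(List<T> backing, Predicate<? super T> filter) {
        this.backing = backing;
        this.filter = filter;
    }

    private boolean matches(T item) {
        probes++;
        return filter.test(item);
    }

    /** Unavoidably a full scan: every backing item must be tested. */
    @Override
    public int size() {
        int count = 0;
        for (T item : backing) {
            if (matches(item)) {
                count++;
            }
        }
        return count;
    }

    @Override
    public T get(int index) {
        int seen = 0;
        for (T item : backing) {
            if (matches(item) && seen++ == index) {
                return item;
            }
        }
        throw new IndexOutOfBoundsException("Index: " + index);
    }

    /** Lazy iterator: scans only as far as the next matching item. */
    @Override
    public Iterator<T> iterator() {
        final Iterator<T> source = backing.iterator();
        return new Iterator<T>() {
            private T pending;
            private boolean ready;

            @Override
            public boolean hasNext() {
                while (!ready && source.hasNext()) {
                    T candidate = source.next();
                    if (matches(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                T result = pending;
                pending = null;
                return result;
            }
        };
    }

    /** The override: stop at the first match instead of calling size(). */
    @Override
    public boolean isEmpty() {
        return !iterator().hasNext();
    }
}
```

With a five-item backing list whose second item is the first match, the 
overridden isEmpty() evaluates the filter twice, whereas size() has to 
evaluate it for all five items.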

So, in summary on the getChildren code... you should already be seeing 
improved performance on the getChildren() method calls with the more 
efficient iterators, and soon isEmpty() will be faster too.

If/when the ContentList 'moves' into Element to save memory, these 
improvements will be preserved.

XPath
=====

Regarding XPath, I took notes from the XOM project, which has a 
'query()' method on all nodes, so for example you can do:

element.query(myxpath);

I had a hard look at it and it makes some sense to do something similar, 
especially now in JDOM2, where XPath supports more than just Element and 
Document 'context' items.

The issue is that full XPath support requires both Namespace and 
'Variable' contexts (XOM does address the Namespace context). This would 
be hard to implement in a simple 'query' method. Additionally, XPaths 
are intended to be 'compiled' and 'reused', and the XOM 'query' 
implementation does not support reuse of the XPath. A simple query 
method would have to be limited, but would still cover (number plucked 
out of thin air) 95% of XPath use in JDOM, I am sure.

So, the current XPath implementation in JDOM2 is able to do the full 
gamut of operation, but loses some convenience because you need to 
access it outside of the Element/Content.

I certainly feel that making XPath more accessible to JDOM content would 
be 'friendly', but I worry that it will breed performance problems if it 
is too easy... At the time I worked on the JDOM2 XPath code I looked 
into what it would take to extend the functionality into the 'Content' 
area of JDOM (like XOM), but found there were more issues than can be 
resolved by a person working alone with limited XPath experience (me). I 
figured I would come back to it. Perhaps now is the time.

Still, taking your JDOMUtil examples:

 > JDOMUtil.selectElementChildren(element, xpath)
 > JDOMUtil.selectElement(element, xpath)
 > JDOMUtil.selectAttribute(element, xpath)
 > JDOMUtil.ref(Element element, String xpath, String defaultValue)

In JDOM2, these same concepts can be 'easily' obtained with:

Filters.element().filter(XPath.selectNodes(element, xpath));
... not sure what the selectElement() would do, but you get the idea.
Filters.attribute().filter(XPath.selectNodes(element, xpath));
... well, the 'defaultValue' would take a tweak....
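
For the 'defaultValue' tweak, something along these lines might do. This 
is only a sketch, and to keep it runnable without extra jars it uses the 
JDK's javax.xml.xpath and W3C DOM classes rather than JDOM. The class 
name XPathRefSketch and the compile()/parse() helpers are invented for 
the example; ref() borrows its name from your JDOMUtil, but the body is 
my guess at its behaviour based on your description. It also compiles 
the expression once and reuses it for each context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper using the JDK XPath API, not JDOM's. */
final class XPathRefSketch {

    /** Compile once; the result can be evaluated against many context nodes. */
    static XPathExpression compile(String xpath) {
        try {
            return XPathFactory.newInstance().newXPath().compile(xpath);
        } catch (XPathExpressionException e) {
            // A malformed expression is treated as a programming error.
            throw new IllegalArgumentException("Bad XPath: " + xpath, e);
        }
    }

    /** Text of the first selected node, or defaultValue if nothing is selected. */
    static String ref(Node context, XPathExpression expr, String defaultValue) {
        try {
            NodeList hits = (NodeList) expr.evaluate(context, XPathConstants.NODESET);
            return hits.getLength() == 0 ? defaultValue : hits.item(0).getTextContent();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** Parse an XML string into a W3C DOM document (test convenience only). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Called as ref(node, compiled, "none"), it returns the attribute or 
element text when the expression selects something, and "none" when the 
node set is empty.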

Exceptions
==========

Interesting observation. I can see the benefit of a JDOM 'Runtime' 
exception in addition to JDOMException. There are a few places where it 
could be useful to indicate a programmatic issue that does not need to 
be explicitly thrown/caught. The XPath library is a good example.

I'll think some more on that... see if I can see a problem with 
introducing JDOMRuntimeException... and see what other places it 
would possibly make sense.
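
As a rough idea of what that could look like, here is a sketch of 
wrapping a checked exception in an unchecked one. All the names are 
hypothetical: CheckedQueryException stands in for JDOMException, 
UncheckedQueryException for a possible JDOMRuntimeException, and 
select() is a dummy, not a JDOM call.

```java
/** Sketch only: every name below is a stand-in, none is JDOM API. */
final class RuntimeWrapSketch {

    /** Stand-in for a checked library exception such as JDOMException. */
    static class CheckedQueryException extends Exception {
        CheckedQueryException(String message) {
            super(message);
        }
    }

    /** Stand-in for a possible unchecked counterpart. */
    static class UncheckedQueryException extends RuntimeException {
        UncheckedQueryException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** A query that may throw the checked exception. */
    interface CheckedQuery<T> {
        T run() throws CheckedQueryException;
    }

    /** Run the query, rethrowing any checked failure as unchecked with the cause kept. */
    static <T> T unchecked(CheckedQuery<T> query) {
        try {
            return query.run();
        } catch (CheckedQueryException e) {
            throw new UncheckedQueryException(e.getMessage(), e);
        }
    }

    /** Dummy 'XPath' call that fails on an empty expression. */
    static String select(String xpath) throws CheckedQueryException {
        if (xpath.isEmpty()) {
            throw new CheckedQueryException("empty expression");
        }
        return "result of " + xpath;
    }
}
```

The caller then writes unchecked(() -> select(path)) with no try/catch, 
and the original exception is still available through getCause() if 
anyone does want to handle it.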

So, thanks for the comments. If there's anything I missed, 
misunderstood, or needs attention, please don't hesitate!

Rolf

On 16/01/2012 4:36 PM, Leigh L Klotz Jr wrote:
> I'm currently evaluating the alpha of JDOM2. Most of the problems I've
> found with JDOM and Java 6 have been fixed in a utility class I have
> called JDOMUtil. A good deal of the methods in there are handling
> generic types,
>
> As for the question below, I don't often have the use case of for()
> iterating over, element.getContent(), but I do often iterate over the
> following:
> element.getChildren()
> element.getChildren(name)
> element.getChildren().isEmpty() as a surrogate for element.hasChildren()
>
> You could have Element.getContent() return a List implementation of your
> own, and make the Iterable.iterate() method in it (which is what for()
> calls) be efficient. That might also make element.getChildren.hasNext be
> efficient, or you could implement isEmpty directly.
>
> For JDOMUtil, I often use these:
> JDOMUtil.selectElementChildren(element, xpath)
> JDOMUtil.selectElement(element, xpath)
> JDOMUtil.selectAttribute(element, xpath)
> JDOMUtil.ref(Element element, String xpath, String defaultValue)
>
> The JDOMUtil.ref(Element element, String xpath, String defaultValue)
> method returns either the leaf-node value of the XPath expression, or
> the defaultValue if the nodeset is empty.
>
> I've also wrapped every one of the JDOMUtil XPath calls with something
> that throws a RuntimeException wrapper for JDOMException, and I let pass
> JDOMException and IOException only on serialization and parsing
> utilities. I believe that checked exceptions for XPath errors are a
> detraction from the simplicity of JDOM. XPath exceptions are always
> internal programming errors, and it is the rare case where they can be
> corrected at the point of invocation. Parsing and IO exceptions can come
> from external system interaction and can reasonably be expected to be
> correctable in point source code.
>
> Leigh.
>