[jdom-interest] Performance issues

Thu Aug 21 08:50:27 PDT 2003

Don't use getChildren (which, by the way, was renamed to getChildElements in
the current cvs).  The list returned by getChildren is both filtered, to show
only Elements, and "live", meaning changes to the list are reflected in the
element's underlying content.  One problem is that an element has no way of
knowing how many child elements it has, or where they are in the content list,
without scanning the underlying content list and counting. So size() in the line

   for (int i = 0; i < element_list.size (); i ++)

causes element_list's actual content list to be scanned on each call, counting
Elements and skipping any Text, Comment, or ProcessingInstruction node.
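Similarly, element_list.get(i) cannot determine where the i-th Element is
without starting from index 0 in the underlying content list and scanning until
it finds the i-th Element or runs out of nodes. The loop as a whole therefore
does work that grows roughly with the square of the number of content nodes.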
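
To make the cost concrete, here is a minimal sketch of a filtered view over a
content list. It is not JDOM code: ElementView, walk and the inspected counter
are hypothetical names, Strings stand in for Elements and Integers stand in for
Text or Comment nodes. It only assumes the behaviour described above, namely
that size() and get(i) both scan the underlying list from index 0.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

public class FilteredViewSketch {
    /** Number of content nodes inspected since the last reset. */
    static long inspected = 0;

    /** Hypothetical filtered view: only String entries count as "elements". */
    static class ElementView extends AbstractList<String> {
        private final List<Object> content;

        ElementView(List<Object> content) {
            this.content = content;
        }

        @Override
        public int size() {
            // The view keeps no count, so every call rescans the content list.
            int count = 0;
            for (Object node : content) {
                inspected++;
                if (node instanceof String) {
                    count++;
                }
            }
            return count;
        }

        @Override
        public String get(int index) {
            // Scan from index 0 until the index-th "element" is found.
            int seen = 0;
            for (Object node : content) {
                inspected++;
                if (node instanceof String) {
                    if (seen == index) {
                        return (String) node;
                    }
                    seen++;
                }
            }
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    /** Runs the indexed loop from the question and reports nodes inspected. */
    static long walk(int elements) {
        List<Object> content = new ArrayList<>();
        for (int i = 0; i < elements; i++) {
            content.add("element" + i);      // stands in for an Element
            content.add(Integer.valueOf(i)); // stands in for Text or Comment
        }
        inspected = 0;
        List<String> view = new ElementView(content);
        for (int i = 0; i < view.size(); i++) {
            view.get(i);
        }
        return inspected;
    }

    public static void main(String[] args) {
        System.out.println("100 elements: " + walk(100) + " nodes inspected");
        System.out.println("200 elements: " + walk(200) + " nodes inspected");
    }
}
```

Doubling the number of elements roughly quadruples the number of nodes
inspected, which is the slowdown seen with the getChildren loop.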

Iterators have their own slew of problems, because a certain state must be
maintained between the calls to hasNext() and next(). If you're interested,
search the archives for ConcurrentModificationException.
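
As a rough illustration of that failure mode, the sketch below uses plain
java.util collections rather than JDOM lists, so it shows the general fail-fast
rule only. The class and method names are hypothetical. Whether the lists
returned by JDOM support ListIterator.set() in the same way is an assumption
not covered in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

public class IteratorReplaceSketch {
    /** Removes via the iterator, then adds via the list: the iterator fails fast. */
    static boolean removeThenAddThrows() {
        List<String> nodes = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (Iterator<String> it = nodes.iterator(); it.hasNext(); ) {
                String node = it.next();
                if (node.equals("a")) {
                    it.remove();       // fine: done through the iterator
                    nodes.add(0, "A"); // structural change behind its back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // Thrown by the next call to it.next() after the list changed.
            return true;
        }
    }

    /** Replaces in place with ListIterator.set(), which is not structural. */
    static List<String> replaceInPlace() {
        List<String> nodes = new ArrayList<>(Arrays.asList("a", "b", "c"));
        for (ListIterator<String> it = nodes.listIterator(); it.hasNext(); ) {
            if (it.next().equals("b")) {
                it.set("B"); // order and size are preserved
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println("remove + add throws: " + removeThenAddThrows());
        System.out.println("after set(): " + replaceInPlace());
    }
}
```

The difference is that set() swaps one entry for another without changing the
size of the list, so the iterator's bookkeeping stays valid.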

A solution to your problem in this case is to use getContent instead of
getChildren since size() is known (not calculated) by the parent element
and get(index) references the actual content list (not a filtered version).

  private void transform_document (Element parent_element) {
    List list = parent_element.getContent();
    int size = list.size();
    for (int i = 0; i < size; i ++) {
      Object node = list.get(i);
      if (node instanceof Element) {
          Element element = (Element) node;

          if (element.getNamespace() == orange_namespace) {
            //  do something ....
            //  replace old element
            list.set(i, new_element);
          }

          transform_document(element);
      }
    }
  }

The current cvs version adds the methods getContentSize() and getChild(int)
to Element, so if you are using the current cvs version you could also do

  private void transform_document (Element parent_element) {
    int size = parent_element.getContentSize();
    for (int i = 0; i < size; i ++) {
      Child node = parent_element.getChild(i);
      if (node instanceof Element) {

which I prefer, since you're working with the object that actually encapsulates
the data, and not a secondary object such as the List returned by getContent().

Brad

"Philipp Groeschler" writes:

> Good morning!
> 
> The task in my code is the following: I've got a Document object
> containing an element tree which was build from a valid xml file. This
> element tree is walked through by a recursive method, which in some
> cases has to replace the current element it is working on. After two
> days of getting nervous on ConcurrentModificationExceptions, the code
> works pretty fine, except that it is very slow:
> 
> private void transform_document (Element parent_element)
> {
>   List element_list = parent_element.getChildren ();
> 
>   for (int i = 0; i < element_list.size (); i ++)
>   {
>     Element this_element = (Element) element_list.get (i);
> 
>     if (this_element.getNamespace () == orange_namespace)
>     {
> 	//  do something ....
>       //  replace old element
>       element_list.set (i, new_element);
>     }
> 
>     transform_document (this_element);
>   }
> }
> 
> The FAQ says, using an Iterator except a List and a for-loop is faster.
> But when I use Iterators and the remove-method, the thing throws a
> ConcurrentModificationException when I try to attach the new Element.
> The Element sequence has to be preserved, so adding the new Elements
> after the loop will not work for me.
> 
> Are there any other ways to process an Element tree in such way, which
> will work faster than this? Does converting the List to an Array help me
> out?
> 
> Thanks a lot in advance ...
> 
> Philipp