[jdom-interest] Streaming JDOM

Gregor Zeitlinger gregor at zeitlinger.de
Mon Jul 17 12:43:27 PDT 2006

On 7/17/06, Mike The Mathematician <mikeb at mitre.org> wrote:
> For large files, I would almost always want
> a JDOM tree with a tiny subset of a large xml
> file.
You could want that if you are interested only in part of the XML file.
I find that I often need to read an XML file and build an object
structure from it.
In other words
- I need to read the whole file
- I only need to read it once
(which is exactly what SAX is used for today)
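
For reference, the "read the whole file once and build objects from it" pattern with plain SAX might look roughly like the sketch below. The class name SaxOnePass, the method readItemNames, and the <item name="..."> document shape are made up for illustration; only the javax.xml.parsers and org.xml.sax calls are standard JDK API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of JDOM or of the proposal.
class SaxOnePass {

    /** Reads the document once and collects the "name" attribute of every <item>. */
    static List<String> readItemNames(String xml) {
        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // The default factory is not namespace-aware, so match on qName.
                if ("item".equals(qName)) {
                    names.add(atts.getValue("name"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<items><item name=\"a\"/><group><item name=\"b\"/></group></items>";
        System.out.println(readItemNames(xml));
    }
}
```

The handler sees every element exactly once and nothing is kept in memory except the objects it chooses to build.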

For your task, you could use my proposal (see below) or something like
new SAXBuilder().build("/root/path[1]/name[3]");
Can you already do something like that?

> So I need a way to filter
> out children, grandchildren, ...,
> before JDOM builds the tree.
In my proposal you can indeed filter out children and grandchildren.

reader.Element rootElement = doc.getRootElement();
for (reader.Element child : rootElement.getChildren()) {
  System.out.println("child: " + child.getName());
  // each child may have grandchildren, but they are skipped
  if (child.getName().equals("needed")) {
    // iterate children
    // maybe something like (which is not implemented yet)
    org.jdom.Element found = child.asDom();
  }
}

(BTW: I built the prototype in Java 5, but that can easily be changed)

> Your example below seems to iterate
> through an already-built JDOM tree
No, the file is parsed whenever iterator.next() is called.
I hope your comment implies that the API is easy to use.
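
To illustrate what "parsed whenever iterator.next() is called" can mean, here is a minimal sketch that uses StAX (javax.xml.stream, part of the JDK since Java 6) as a stand-in for the prototype's parser. This is not the prototype's code: the class LazyChildNames is hypothetical and, to stay short, it only yields the names of the root element's direct children.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: the parser only advances when the iterator is asked for more.
class LazyChildNames implements Iterator<String> {
    private final XMLStreamReader reader;
    private int depth = 0;   // element nesting depth; the root element is depth 1
    private String pending;  // name of the next child, once it has been looked up

    LazyChildNames(String xml) {
        try {
            reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("cannot open document", e);
        }
    }

    public boolean hasNext() {
        if (pending == null) {
            pending = advance();
        }
        return pending != null;
    }

    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String name = pending;
        pending = null;
        return name;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    // Pulls parser events until the next direct child of the root starts.
    // Grandchildren (depth 3 and deeper) are read past and never stored.
    private String advance() {
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 2) {
                        return reader.getLocalName();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
            return null; // end of document
        } catch (XMLStreamException e) {
            throw new IllegalStateException("parse error", e);
        }
    }

    public static void main(String[] args) {
        Iterator<String> it =
                new LazyChildNames("<root><a><x/></a><needed>t</needed><b/></root>");
        while (it.hasNext()) {
            System.out.println("child: " + it.next());
        }
    }
}
```

Nothing beyond the current parser state is held in memory, so the loop looks like iterating a tree while actually walking the file.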

> rather than reducing the in-memory
> tree before JDOM builds it.
In my proposal, no tree is built at all.
You are dealing with Element objects and you can iterate over the
children, but the signature is

public interface Element extends Content {
  List<Attribute> getAttributes();

  String getName();

  Iterable<Element> getChildren();

  Iterable<Content> getContent();
}

In other words you can only read the children once. If you read them
for a second time, you will get an IllegalStateException, because the
children have already been read.
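
The read-once behaviour could be enforced along these lines. This is only a sketch under my own assumptions: OneShotIterable is a hypothetical helper name, and the prototype may implement the check differently.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical helper: an Iterable that hands out its iterator only once.
class OneShotIterable<T> implements Iterable<T> {
    private final Iterator<T> source;
    private boolean consumed = false;

    OneShotIterable(Iterator<T> source) {
        this.source = source;
    }

    public Iterator<T> iterator() {
        if (consumed) {
            // The underlying stream has moved on, so a second pass is impossible.
            throw new IllegalStateException("children have already been read");
        }
        consumed = true;
        return source;
    }

    public static void main(String[] args) {
        OneShotIterable<String> children =
                new OneShotIterable<String>(Arrays.asList("a", "b").iterator());
        for (String child : children) {
            System.out.println("child: " + child);
        }
        try {
            children.iterator(); // second read
        } catch (IllegalStateException e) {
            System.out.println("second read rejected: " + e.getMessage());
        }
    }
}
```

Failing loudly on the second read seems safer than silently returning an empty iteration, because the caller learns right away that the data is gone.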

> How do you propose to build only the
> part we need, that is, how do you
> propose to filter everything else out
> from becoming part of the memory tree?
As I said above, I am not building a memory tree - which is the whole
point of the idea.


Gregor Zeitlinger
gregor at zeitlinger.de
