[jdom-interest] ElementScanner and Memory

Brian Nahas briannahas at yahoo.com
Mon Nov 13 08:11:04 PST 2006


I have a 1.2 GB XML file I need to parse.  Since it's nicely
partitioned, I planned on using ElementScanner from the contrib package
to load only one item at a time.  Here's an equivalent structure:

<data>
    <item>...</item>
    <item>...</item>
    <item>...</item>
    ...
</data>

The path I'm using for my listener is "/data/item".

I assumed the parser would release each item once my listener had
finished with it.  ElementScanner was very simple to set up to handle
this; however, I ran into an OutOfMemoryError on my first try.  I was a
little confused, as I thought ElementScanner was specifically designed
to prevent this.  Upon investigation, I found that the SAXHandler used
by the ElementScanner was holding onto the previous items after I was
done with them.  It adds them to the default root element that
FragmentHandler creates, and nothing removes them after the listeners
are called.  This seems to be in direct conflict with this message I
found, which states that ElementScanner doesn't build a document (the
message is fairly old, though):

http://www.servlets.com/archive/servlet/ReadMsg?msgId=350607&listName=jdom-interest

I worked around this by explicitly detaching the element in my listener
when I was done with it.  Since this seems like it would be a common
pattern and a subtle trap, I thought I'd ask whether I was missing some
setting or using ElementScanner improperly.  There's a namespace
declared on the data element, so I don't know if that has something to
do with it.
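
For anyone trying to picture the problem, here is a minimal sketch in
plain Java of the retention pattern and of the detach workaround.  It
does not use JDOM at all: Node, ItemListener, and scan are hypothetical
stand-ins made up for illustration, and scan only imitates, as an
assumption, the way the handler appears to attach each finished item to
a placeholder root before notifying the listener.

```java
import java.util.ArrayList;
import java.util.List;

public class DetachSketch {

    /** Hypothetical stand-in for a tree node; this is NOT the JDOM Element class. */
    static class Node {
        final String name;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        void addChild(Node child) {
            child.parent = this;
            children.add(child);
        }

        /** Removes this node from its parent so the parent no longer references it. */
        void detach() {
            if (parent != null) {
                parent.children.remove(this);
                parent = null;
            }
        }
    }

    /** Hypothetical callback, loosely modelled on a per-item listener. */
    interface ItemListener {
        void itemMatched(Node item);
    }

    /**
     * Imitates (as an assumption) a handler that attaches every completed
     * item to a placeholder root and then notifies the listener.
     * Returns how many items the root still references afterwards.
     */
    static int scan(int itemCount, ItemListener listener) {
        Node root = new Node("root");
        for (int i = 0; i < itemCount; i++) {
            Node item = new Node("item");
            root.addChild(item);        // the root now holds a reference to the item
            listener.itemMatched(item); // the listener processes the item
        }
        return root.children.size();
    }

    public static void main(String[] args) {
        // Listener that only processes: every item stays reachable from the root.
        int retainedWithoutDetach = scan(1000, item -> { });

        // Listener that detaches when done: the root keeps nothing.
        int retainedWithDetach = scan(1000, item -> item.detach());

        System.out.println("retained without detach: " + retainedWithoutDetach);
        System.out.println("retained with detach:    " + retainedWithDetach);
    }
}
```

In this toy model the first run leaves all 1000 items reachable from the
root and the second leaves none.  With the real contrib classes, the
corresponding step is the workaround described above: calling detach()
on the matched element at the end of the listener callback.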

Thanks,
-Brian