[jdom-interest] Partial Tree building/instantiation --- XPathFilter

Steven Gould steven.gould at cgiusa.com
Mon Apr 2 14:13:37 PDT 2001


Jakob,

Could you use XSLT to break the file up into smaller, more manageable
documents? Then use JDOM to manipulate/process each of these smaller
documents.

Steve

---

Jakob Jenkov wrote:

> Hi there. I'm currently working on a long, long :-) project in which
> we parse some quite long files. We have tried converting these files
> to XML for easier/standard parsing, but each file would then be about
> 16-30+ MB. I don't even dare think about how much memory such a JDOM
> tree would take! And the plans for lazy evaluation won't help, since
> we visit every node in the tree, thus instantiating all objects
> anyway. Parsing the files solely using SAX is not developer-friendly
> enough.
>
> What I have in mind is some kind of XPath filter that lets you build
> JDOM trees from subtrees of the data, and dispose of each tree when
> it is no longer needed.
>
> Let me give an example: we parse phone call records in files that
> can sometimes contain thousands and thousands of records. In XML
> format these files and records would look something like this:
>
> <transferBatch>
>     <phoneCall>
>         <details>bla.bla.bla., sub records etc.</details>
>     </phoneCall>
>     <phoneCall>
>         <details>bla.bla.bla., sub records etc.</details>
>     </phoneCall>
>     <phoneCall>
>         <details>bla.bla.bla., sub records etc.</details>
>     </phoneCall>
>     ...
> </transferBatch>
>
> Each <phoneCall> record with all its sub records can be quite large,
> and there can be thousands of these <phoneCall> records. I'd like
> some way to get a JDOM tree for each <phoneCall> record one at a
> time, and to be able to dispose of that <phoneCall> JDOM tree before
> moving on to the next. How would I do that?
>
> My suggestion would be to insert an XPathFilter that only builds
> JDOM trees from the records that match the given XPath. In the
> example above, an XPath of
>
>     transferBatch::phoneCall
>
> would have done the job.
>
> Do my complaints/ideas sound completely out-of-this-world? I think
> there are many out there who will have the same problem: parsing one
> subtree at a time, without regard to the others.
>
> Regards,
> Jakob Jenkov
> jakob at jenkov.com
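
For illustration, the record-at-a-time idea described above can be roughly sketched with a SAX handler that builds a small tree only while inside a matching element, hands it to a callback, and then lets it go. This is only a sketch under assumptions: it uses the JDK's own SAX and org.w3c.dom classes as a stand-in for JDOM so that it runs without extra libraries, it matches on a plain element name rather than a real XPath, and RecordSplitter and recordTexts are hypothetical names that are not part of JDOM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: hands each matching record to a callback as its own small tree. */
final class RecordSplitter extends DefaultHandler {
    private final String recordName;       // element name to match, e.g. "phoneCall"
    private final Consumer<Element> sink;  // called once per completed record
    private final Document nodes;          // used only as a factory for nodes
    private Node current;                  // null while outside a record

    RecordSplitter(String recordName, Consumer<Element> sink) {
        this.recordName = recordName;
        this.sink = sink;
        try {
            this.nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (current == null && !qName.equals(recordName)) {
            return; // outside any record: build nothing
        }
        Element e = nodes.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (current != null) {
            current.appendChild(e);
        }
        current = e;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.appendChild(nodes.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;
        }
        Node parent = current.getParentNode();
        if (parent == null) {
            // The record's root element just closed: pass the finished subtree on.
            sink.accept((Element) current);
        }
        current = parent; // becomes null again once the record is done
    }

    /** Convenience for small inputs: collects the trimmed text content of every record. */
    static List<String> recordTexts(String xml, String recordName) {
        List<String> out = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new RecordSplitter(recordName, rec -> out.add(rec.getTextContent().trim())));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out;
    }
}
```

The handler keeps no reference to a record after the callback returns, so each subtree can be garbage-collected before the next one is built, provided the callback does not hold on to it. For a real 16-30 MB file, the parser would be given a file or stream instead of a string, and the callback would process each record in place rather than collect results in a list.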


