[jdom-interest] JDOM w/ *long* streams

Jason Hunter jhunter at collab.net
Tue Aug 8 17:24:17 PDT 2000


> Well, I'd be interested in benefiting from your insight too. The
> application I am working on will require parsing potentially large XML
> files and I would appreciate being able to build the object structure
> gradually like only the top level elements and go deeper only when
> needed.

That one's pretty hard, especially when reading from a stream.  While
it's fairly easy to read depth first a la SAX (because that's how the data
comes in), it's a real bitch to do breadth first.  To do breadth first
is our eventual goal, but it requires hooks into the parser, and
probably requires reading from a file, not a stream, so you can do
random access to the unparsed data.  The basic idea is to let the parser
lightly scan the file and record indexes and boundaries but postpone
real parsing until requested.
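
A rough sketch of that light-scan idea, to make it concrete.  This is
not JDOM code; LazyIndex, Span, scan and parse are made-up names, and an
in-memory byte array stands in for the random-access file.  The scanner
is deliberately naive: it assumes well-formed input with no comments,
CDATA sections, DOCTYPE, or '>' inside attribute values, and it assumes
each child of the root can be parsed on its own (no namespace prefixes
or entities declared further up).

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class LazyIndex {
    /** Byte range [start, end) of one direct child of the root element. */
    static final class Span {
        final int start, end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    /** Light scan: record boundaries of the root's children, build nothing. */
    static List<Span> scan(byte[] xml) {
        List<Span> spans = new ArrayList<>();
        int depth = 0;        // number of currently open elements
        int childStart = -1;  // offset where the current top-level child began
        for (int i = 0; i < xml.length; i++) {
            if (xml[i] != '<') continue;          // skip character data
            int close = i;
            while (xml[close] != '>') close++;    // naive: find the end of this tag
            if (xml[i + 1] == '?') {              // XML declaration: skip it
                i = close;
                continue;
            }
            if (xml[i + 1] == '/') {              // end tag
                depth--;
                if (depth == 1) spans.add(new Span(childStart, close + 1));
            } else {                              // start tag or empty-element tag
                if (depth == 1) childStart = i;
                if (xml[close - 1] == '/') {      // <empty/>
                    if (depth == 1) spans.add(new Span(i, close + 1));
                } else {
                    depth++;
                }
            }
            i = close;
        }
        return spans;
    }

    /** Real parsing, postponed until somebody asks for this child. */
    static Document parse(byte[] xml, Span s) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml, s.start, s.end - s.start));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a real file you'd seek to span.start and read span.end - span.start
bytes instead of slicing an array, which is exactly why a plain stream
doesn't cut it.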

> What happens if you access a Document object defined as the
> DefaultHandler (just as in SAXBuilder.build) while parsing the XML
> InputSource?

I don't get the question.

> Also, forgive me for being quite a newbie here, but does
> SAXBuilder.build currently build the whole tree in memory or are there
> some lazy instantiation mechanisms?

Currently it builds the whole tree.  Doing true lazy building with SAX
is nigh impossible because SAX gives you data as it encounters it in the
file: you either accept it or ignore it, and you can never go back and
get it later.
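
To illustrate the accept-or-ignore point, here's a minimal sketch of a
plain SAX handler (standard JAXP, not SAXBuilder itself; TopLevelOnly
and topLevelNames are hypothetical names).  It keeps the names of the
root's direct children and lets everything deeper go by.  Once an event
has passed, the only way to get at that content is to parse again.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TopLevelOnly extends DefaultHandler {
    final List<String> topLevel = new ArrayList<>();
    private int depth = 0;  // number of currently open elements

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth == 1) {
            topLevel.add(qName);  // accept: a direct child of the root
        }
        depth++;                  // deeper elements are seen once and dropped
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    /** Parses the whole input but retains only the top-level child names. */
    static List<String> topLevelNames(String xml) {
        TopLevelOnly handler = new TopLevelOnly();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.topLevel;
    }
}
```

Note the parser still reads every byte of the input; the handler only
saves memory by not keeping what it sees.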

-jh-


