[jdom-interest] Lazy parsing

Spitz,Ayal aspitz at mitre.org
Mon Apr 16 08:01:17 PDT 2001


Here is a rough sketch of the Tag class.

import java.util.LinkedList;

class Tag {
    // The location of '<' in bytes relative to the beginning of the XML file
    long            start;
    // The location of '>' in bytes relative to the beginning of the XML file
    long            end;
    Tag             parent;     // The parent Tag of this tag
    LinkedList      children;   // A linked list of this tag's children
    Tag             closingTag; // The matching closing tag to this tag
}

If you were to climb down the Tag tree you would reach your 'foo' tag. You
wouldn't find the text as one of its children, but you would easily be able
to calculate its location from the offsets of the surrounding tags. You would
then find a tag for 'bar' as a child of the 'foo' tag. Finally, you would
find a tag for '/foo' as the closing tag of the 'foo' tag.
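
To make that calculation concrete, here is a minimal sketch in Java of how the
text locations could be derived from the tag offsets alone. The Tag fields
follow the rough sketch above. The class name TextSpanDemo, the helpers
buildFooExample and textSpans, and the hand-computed offsets are assumptions
made for illustration and are not part of the actual parser. It also assumes a
single-byte encoding, so byte offsets and character positions coincide.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative only: the Tag fields follow the sketch above; everything else is assumed.
class TextSpanDemo {
    static class Tag {
        long start;                                    // offset of '<'
        long end;                                      // offset of '>'
        Tag parent;
        LinkedList<Tag> children = new LinkedList<>();
        Tag closingTag;                                // null for an empty tag such as <bar/>

        Tag(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    static final String XML = "<foo>This is some <bar/> text here</foo>";

    // Hand-built tree for XML; a real parser would fill in these offsets itself.
    static Tag buildFooExample() {
        Tag foo = new Tag(0, 4);        // <foo>
        Tag bar = new Tag(18, 23);      // <bar/>
        Tag fooClose = new Tag(34, 39); // </foo>
        bar.parent = foo;
        foo.children.add(bar);
        foo.closingTag = fooClose;
        return foo;
    }

    // Returns the gaps between tags inside 'tag' as {from, to} pairs, with 'to' exclusive.
    static List<long[]> textSpans(Tag tag) {
        List<long[]> spans = new ArrayList<>();
        if (tag.closingTag == null) {
            return spans; // an empty tag has no content
        }
        long cursor = tag.end + 1; // first byte after the opening tag
        for (Tag child : tag.children) {
            if (child.start > cursor) {
                spans.add(new long[] {cursor, child.start});
            }
            // Skip the whole child element, including its closing tag if it has one.
            Tag last = (child.closingTag != null) ? child.closingTag : child;
            cursor = last.end + 1;
        }
        if (tag.closingTag.start > cursor) {
            spans.add(new long[] {cursor, tag.closingTag.start});
        }
        return spans;
    }

    public static void main(String[] args) {
        byte[] bytes = XML.getBytes(StandardCharsets.US_ASCII);
        for (long[] span : textSpans(buildFooExample())) {
            int from = (int) span[0];
            int length = (int) (span[1] - span[0]);
            System.out.println("[" + new String(bytes, from, length, StandardCharsets.US_ASCII) + "]");
        }
    }
}
```

For the example document this gives two ranges, one on each side of the 'bar'
tag, which correspond to "This is some " and " text here".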

Mind you, each of the tags mentioned would have its location in the XML
file relative to the start of the XML file.
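
Because the offsets are absolute, a later pass could seek straight to an
element and read only its bytes. The following is a rough sketch of that idea,
not the parser's actual code: the class name TagSliceDemo and the readRange
helper are hypothetical, and using java.io.RandomAccessFile is just one way
to do it. For an element with a closing tag, the range would run from the
opening tag's start to the closing tag's end.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: reads one byte range of an XML file without touching the rest.
class TagSliceDemo {
    // Reads the bytes from 'start' to 'end' inclusive, both relative to the start of the file.
    static byte[] readRange(String path, long start, long end) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            byte[] buffer = new byte[(int) (end - start + 1)];
            file.seek(start);       // jump straight to the '<' of the opening tag
            file.readFully(buffer); // throws EOFException if the range runs past the file
            return buffer;
        }
    }

    // Usage (assumed): java TagSliceDemo file.xml <start> <end>
    public static void main(String[] args) throws IOException {
        byte[] slice = readRange(args[0], Long.parseLong(args[1]), Long.parseLong(args[2]));
        System.out.write(slice, 0, slice.length);
        System.out.flush();
    }
}
```

The bytes returned this way could then be handed to a full parser, so only the
interesting segment of the document gets the more extensive parse.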

- AYAL

At 10:30 AM 4/16/2001 -0400, Jon Baer wrote:
>When you say "keep information about where the real XML tag exists in the
>XML file", does this give it a location inside the file, say, for nested
>elements?
>
>I am currently working on a project with requirements like:
>
><foo>This is some <bar/> text here</foo>
>
><bar/> is actually a database call (not using processing instructions), 
>and I need to keep track of "This is some" and "text here".  Does your
>parser work in this manner?
>
>- Jon
>
>"Spitz,Ayal" wrote:
>
> > Jon -
> >
> > The idea with my parser is that it's lightweight and fast. The Tag objects
> > keep information about where the real XML tag exists in the XML file, the
> > name of the tag, a parent, and children. It does not keep attributes.
> >
> > The general idea with my XML parser is that it works quickly to give the
> > user a bare minimum of detail about the XML document so that they can then
> > return and parse only those segments of the tree that they might think are
> > important. I haven't worked out all the details, or all the use cases, but
> > I wanted to see if there was more interest before I went on with the
> > parser.
> >
> > - AYAL
> >
> > At 09:51 AM 4/16/2001 -0400, Jon Baer wrote:
> > >I'd be interested in it, but what does the Tag object contain that would
> > >not be in the Element class?
> > >
> > >I'd like to see a "basic" lightweight parser in JDOM (instead of Xerces
> > >as the default), so if your parser can construct a JDOM Document then
> > >I'd be interested in it.  Can I ask what other people are using (that's
> > >free and lightweight) when they use SAXBuilder(parser)?
> > >
> > >- Jon
> > >
> > >"Spitz,Ayal" wrote:
> > >
> > > > Hi -
> > > >
> > > > I'm a new user of JDOM, and have been using it only for a short while. A
> > > > co-worker who pointed JDOM out to me mentioned that there was an
> > > > interest in a lazy parser. The rough description that I got was that
> > > > there was a need for a quick parser that would be able to go through an
> > > > XML document and extract the bare essentials of the XML document,
> > > > building a very light tree along the way. This tree would then be used
> > > > later on to go back into the XML document and perform a more extensive
> > > > parse of part of the tree.
> > > >
> > > > It just so happens that I have such a parser. It takes one pass through
> > > > an XML document and returns a tree of lightweight Tag objects. A Tag
> > > > object contains the number of bytes to the beginning and end of each XML
> > > > tag, the name of the tag, and the namespace in which it lives.
> > > >
> > > > Is there still an interest in such code? I would be more than happy to
> > > > either give this code to the right people to integrate into JDOM, or
> > > > work with someone to integrate it in.
> > > >
> > > > Any interest? - AYAL
> > > >
> > > > _______________________________________________
> > > > To control your jdom-interest membership:
> > > > http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com



