[jdom-interest] Best strategy for caching JDom Document instance and provide concurrent read access to it?

Guillaume Berche guillaume.berche at eloquant.com
Wed Jan 21 02:46:06 PST 2004


Hi,

I came across a Java library that uses introspection to estimate the actual
size of an object graph. The library is described in, and available from,
the following article:
http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html

In case this is of interest to JDom users: running it on JDom trees built
from a sample set of XML documents, I get ratios of in-memory size to
document size between 8 and 13.
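
A rough way to cross-check such a ratio without the library is the heap-delta
experiment mentioned in the original question: build several instances, keep
them reachable, and compare used heap before and after. The following is only
a sketch under stated assumptions. It is written for a current JDK rather than
the 1.3/1.4 releases of the time, `HeapFootprint` and its method names are
made up for illustration, and a byte array stands in for a parsed document so
that the example runs without JDom on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class HeapFootprint {

    /** Used heap in bytes, measured after asking the JVM to collect garbage. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        // System.gc() is only a hint, so ask a few times to reduce noise.
        for (int i = 0; i < 3; i++) {
            System.gc();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Builds `count` instances, keeps them all reachable, and returns the
     * average heap growth per instance in bytes. The result is an estimate:
     * other allocations in the JVM add noise to it.
     */
    static <T> long averageBytesPerInstance(Supplier<T> factory, int count) {
        List<T> keepAlive = new ArrayList<>(count);
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keepAlive.add(factory.get());
        }
        long after = usedHeap();
        // Use the list after measuring so it cannot be collected early.
        if (keepAlive.size() != count) {
            throw new IllegalStateException("unexpected instance count");
        }
        return (after - before) / count;
    }

    public static void main(String[] args) {
        int sourceBytes = 1 << 20; // stand-in for the size of the XML source
        long perInstance =
                averageBytesPerInstance(() -> new byte[sourceBytes], 16);
        System.out.println("bytes per instance: " + perInstance);
        System.out.println("ratio: " + (double) perInstance / sourceBytes);
    }
}
```

For a JDom measurement the factory would parse the same file each time (for
example with SAXBuilder, wrapping its checked exceptions), and the ratio would
be taken against the file length. Larger values of `count` smooth out the
noise from unrelated allocations.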

Guillaume.
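
With a ratio in that range, the memory bound asked about in the original
question could be approximated by charging each cached document a multiple of
its source size and evicting the least recently used entries once the budget
is exceeded. This is a minimal sketch in plain Java (current JDK), not a
tested JDom cache: `ByteBudgetCache`, `expansionFactor` and the other names
are hypothetical, and the value type is generic so the example does not depend
on JDom.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

public class ByteBudgetCache<K, V> {

    /** A cached value together with its estimated in-memory size. */
    private static final class Sized<V> {
        final V value;
        final long estimatedBytes;

        Sized(V value, long estimatedBytes) {
            this.value = value;
            this.estimatedBytes = estimatedBytes;
        }
    }

    private final long maxBytes;
    private final int expansionFactor; // assumed memory/source ratio, e.g. 10
    private long usedBytes;

    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<K, Sized<V>> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    public ByteBudgetCache(long maxBytes, int expansionFactor) {
        this.maxBytes = maxBytes;
        this.expansionFactor = expansionFactor;
    }

    /** Returns the cached value, or null. The lock covers only the lookup. */
    public synchronized V get(K key) {
        Sized<V> hit = entries.get(key);
        return hit == null ? null : hit.value;
    }

    /** Caches a value, charging it sourceBytes * expansionFactor bytes. */
    public synchronized void put(K key, V value, long sourceBytes) {
        long estimate = sourceBytes * expansionFactor;
        Sized<V> previous = entries.put(key, new Sized<>(value, estimate));
        if (previous != null) {
            usedBytes -= previous.estimatedBytes;
        }
        usedBytes += estimate;
        // Evict least recently used entries until back under budget. An
        // entry larger than the whole budget ends up evicting itself.
        Iterator<Sized<V>> eldestFirst = entries.values().iterator();
        while (usedBytes > maxBytes && eldestFirst.hasNext()) {
            usedBytes -= eldestFirst.next().estimatedBytes;
            eldestFirst.remove();
        }
    }

    public synchronized long usedBytes() {
        return usedBytes;
    }

    public static void main(String[] args) {
        ByteBudgetCache<String, String> cache = new ByteBudgetCache<>(100, 10);
        cache.put("a.xml", "tree A", 4);
        cache.put("b.xml", "tree B", 4);
        cache.get("a.xml");              // a.xml is now most recently used
        cache.put("c.xml", "tree C", 4); // over budget: b.xml is evicted
        System.out.println("b.xml cached: " + (cache.get("b.xml") != null));
        System.out.println("estimated bytes in use: " + cache.usedBytes());
    }
}
```

Note that the lock is held only while looking up or storing an entry, not
while a caller traverses the returned document, so the cache itself should not
become the contention point for readers.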

> -----Original Message-----
> From: Per Norrman [mailto:pernorrman at telia.com]
> Sent: Thursday, January 8, 2004 23:56
> To: 'Guillaume Berche'; jdom-interest at jdom.org
> Subject: RE: [jdom-interest] Best strategy for caching JDom Document
> instance and provide concurrent read access to it?
>
>
> Hi,
>
> I did something similar about two years ago, and I can't remember
> concurrent read access being a problem. However, in that case
> the motivation for using a cache was the time saved by not re-parsing
> a document.
>
> If you need to duplicate an instance, I would definitely go with
> deep cloning.
>
> As for the memory issue, as a rule of thumb I use a factor of 10, i.e.
> memory usage = 10 x document size, but this varies a lot in real life.
>
> /pmn
>
>
> > -----Original Message-----
> > From: jdom-interest-admin at jdom.org
> > [mailto:jdom-interest-admin at jdom.org] On Behalf Of Guillaume Berche
> > Sent: January 8, 2004 16:00
> > To: jdom-interest at jdom.org
> > Subject: [jdom-interest] Best strategy for caching JDom Document
> > instance and provide concurrent read access to it?
> >
> >
> > Hello,
> >
> > I'm very pleased with the JDom API because it's simple and
> > intuitive. Thanks Jason for this great library! I looked through
> > the FAQ and this list's archive but could not find a
> > definitive answer to my question. Please point me to it if I
> > missed it.
> >
> >
> > I'm trying to apply the use case described in the FAQ:
> > "Single thread reads an XML stream into JDOM and makes it
> > available to a run time system for read only access"
> >
> > More precisely, I am trying to build a cache of JDom trees from
> > which the same JDom document instance may be accessed in
> > read-only mode by concurrent threads.
> >
> > On this list, Jason wrote the following in
> > http://www.servlets.com/archive/servlet/ReadMsg?msgId=157461&listName=jdom-interest
> >
> > "> JDOM is generally not thread safe, as I understand it.
> >
> > True.  We follow the same model as ArrayList, which is not by
> > default thread safe."
> >
> >
> > However, ArrayList is actually safe for concurrent read
> > access (Iterators and Enumerations keep their own state), as
> > its javadoc specifies:
> >
> > http://java.sun.com/j2se/1.3/docs/api/java/util/ArrayList.html
> >
> > "Note that this implementation is not synchronized. If
> > multiple threads access an ArrayList instance concurrently,
> > and at least one of the threads modifies the list
> > structurally, it must be synchronized externally. (A
> > structural modification is any operation that adds or deletes
> > one or more elements, or explicitly resizes the backing
> > array; merely setting the value of an element is not a
> > structural modification.) "
> >
> >
> > So I wonder whether JDom beta 8 or beta 9 would have
> > problems with concurrent read access. I haven't yet read
> > the code in detail, but I think I read somewhere that JDom
> > internally uses lazy initialization when traversing the
> > tree and that concurrent access to it might cause problems.
> > Is this [still] true?
> >
> >
> > If it turns out that it is unsafe to read/traverse the same
> > JDom document concurrently, then I would like the
> > group's opinion on the best way to implement this cache while
> > avoiding a contention point at the JDom document
> > read access: my system is supposed to scale as more computing
> > resources are added (i.e. more CPUs in parallel on the same
> > multiprocessor machine).
> >
> > I'm thinking of maintaining a pool of JDom instances. Each
> > concurrent thread would take an instance before traversing
> > it. The multiple instances of the same JDom tree could be created by:
> > 1- reparsing the same source
> > 2- deep cloning the JDom document
> > 3- serializing/deserializing the JDom Document
> >
> >
> > Side question: my document cache needs to be bounded in terms
> > of memory usage. I read some threads about this on the
> > list, but does anyone have figures on the number of bytes
> > used by JDom to store the parsed representation of an XML
> > stream of N bytes? The experiment I plan to run is to
> > instantiate M Document instances and look in a profiler at
> > the space consumed once the GC has been triggered. Has anybody run
> > this test before? I did read some data at
> > http://www.sosnoski.com/opensrc/xmlbench/index.html but it
> > does not quite answer this question.
> >
> >
> >
> > Thanks in advance for your help,
> >
> > Guillaume.
> >
> >
>
>
>
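
The pool of instances proposed in the original question (each thread takes a
private copy before traversing it) can be sketched as follows. This is an
illustration in plain Java for a current JDK, not JDom code: `CopyPerReader`,
`InstancePool` and `withInstance` are hypothetical names, and an `ArrayList`
stands in for the document. The copier is pluggable. Option 3
(serialize/deserialize) is shown because it works for any `Serializable`
object graph; for JDom, options 1 and 2 would presumably plug in a re-parse or
a cast of `Document.clone()` instead, which is an assumption not tested here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.UnaryOperator;

public class CopyPerReader {

    /** Deep copy by serializing to a byte array and reading it back. */
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T copyBySerialization(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    /** Fixed-size pool of private copies of one master instance. */
    public static final class InstancePool<T> {
        private final BlockingQueue<T> idle;

        public InstancePool(T master, UnaryOperator<T> copier, int size) {
            idle = new ArrayBlockingQueue<>(size);
            // All copies are made up front by the constructing thread, so
            // the master itself is never traversed concurrently.
            for (int i = 0; i < size; i++) {
                idle.add(copier.apply(master));
            }
        }

        /** Lends one copy to the reader; blocks while all copies are in use. */
        public <R> R withInstance(Function<T, R> reader) {
            T instance;
            try {
                instance = idle.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            try {
                return reader.apply(instance);
            } finally {
                idle.add(instance); // readers must leave the copy unmodified
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ArrayList<String> master =
                new ArrayList<>(Arrays.asList("root", "child", "leaf"));
        InstancePool<ArrayList<String>> pool =
                new InstancePool<>(master, CopyPerReader::copyBySerialization, 2);

        Runnable reader = () -> System.out.println(
                Thread.currentThread().getName() + " read "
                        + pool.withInstance(ArrayList::size) + " nodes");
        Thread t1 = new Thread(reader, "reader-1");
        Thread t2 = new Thread(reader, "reader-2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

The pool size caps the number of simultaneous readers and multiplies the
memory cost of each cached document, so this trade-off only makes sense if
sharing a single instance between readers turns out to be unsafe.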




