[jdom-interest] Resolver announcement

Michael Kay mike at saxonica.com
Sun Mar 11 15:32:20 PDT 2012


In Saxon 9.4 I have addressed this problem by including a copy of the 
most common resources within the Saxon JAR file, and ensuring that when 
Saxon itself allocates the XMLReader, it uses an EntityResolver that 
grabs these local copies of resources when available. But Saxon isn't 
architecturally the right place for the solution, any more than JDOM is.

I like the idea of a caching resolver: except that surely, the best way 
to offer this to the world is as an implementation of XMLReader that 
wraps an underlying XMLReader with a caching entity resolver. Then 
anyone who picks up this XMLReader implementation will automatically get 
the caching behaviour - even if they implement their own EntityResolver 
on top.
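
A minimal sketch of such a wrapper, assuming it is built on the standard SAX XMLFilterImpl class. The names CachingXMLReader and Fetcher are hypothetical, and the cache here is a simple in-memory map rather than anything file-based; it is not taken from Saxon or from the Resolver project.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical sketch: an XMLReader that caches external entities in memory. */
class CachingXMLReader extends XMLFilterImpl {

    /** Hypothetical hook for loading an entity, so the network can be swapped out. */
    interface Fetcher {
        byte[] fetch(String systemId) throws IOException;
    }

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Fetcher fetcher;

    CachingXMLReader(XMLReader parent, Fetcher fetcher) {
        super(parent);
        this.fetcher = fetcher;
    }

    /** Default fetcher: treat the system identifier as a URL and read it fully. */
    CachingXMLReader(XMLReader parent) {
        this(parent, systemId -> {
            try (InputStream in = URI.create(systemId).toURL().openStream()) {
                return in.readAllBytes();
            }
        });
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // An EntityResolver installed by the application is consulted first.
        InputSource fromUser = super.resolveEntity(publicId, systemId);
        if (fromUser != null || systemId == null) {
            return fromUser;
        }
        // Otherwise serve from the cache, fetching once on a miss.
        byte[] data = cache.get(systemId);
        if (data == null) {
            data = fetcher.fetch(systemId);
            cache.put(systemId, data);
        }
        InputSource source = new InputSource(new ByteArrayInputStream(data));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }
}
```

XMLFilterImpl installs itself as the parent's EntityResolver when parse() is called, so wrapping any existing XMLReader in this class should be enough to route entity lookups through the cache.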

But I think a variant of the caching resolver that only uses a 
pre-initialized cache containing the common W3C files, and doesn't 
attempt any dynamic caching, might be even more useful, because it would 
avoid needing access to writable filestore, and the synchronization and 
permissions issues that this introduces.

Such a beast could easily be carved out of the existing Saxon code and 
turned into a freestanding component.
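
A rough sketch of what the read-only variant might look like. This is not the Saxon code; the class name PreloadedEntityResolver and the fromClasspath helper are hypothetical, and the mapping from system identifiers to bundled resources is an assumption left to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical sketch: serves a fixed set of entities and never writes to disk. */
class PreloadedEntityResolver implements EntityResolver {

    private final Map<String, byte[]> entities;

    PreloadedEntityResolver(Map<String, byte[]> entities) {
        // Unmodifiable copy of the map: nothing is added after construction.
        this.entities = Map.copyOf(entities);
    }

    /** Loads each entity from a classpath resource, e.g. one bundled in a JAR. */
    static PreloadedEntityResolver fromClasspath(Map<String, String> resourceBySystemId)
            throws IOException {
        Map<String, byte[]> loaded = new HashMap<>();
        for (Map.Entry<String, String> e : resourceBySystemId.entrySet()) {
            try (InputStream in =
                    PreloadedEntityResolver.class.getResourceAsStream(e.getValue())) {
                if (in == null) {
                    throw new IOException("Missing bundled resource: " + e.getValue());
                }
                loaded.put(e.getKey(), in.readAllBytes());
            }
        }
        return new PreloadedEntityResolver(loaded);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        byte[] data = (systemId == null) ? null : entities.get(systemId);
        if (data == null) {
            return null; // not bundled: the parser falls back to its default lookup
        }
        InputSource source = new InputSource(new ByteArrayInputStream(data));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }
}
```

Because the map is fixed once the constructor returns, lookups need no locking and no writable filestore. A caller would pass a map such as "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" to a resource path of its own choosing.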

Michael Kay
Saxonica

On 11/03/2012 21:17, Rolf Lear wrote:
> Hi all.
>
> Way back when, about July last year, I ran into a problem resolving 
> documents against W3C resources. Essentially the problem is described 
> here:
>
> http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/
>
> I thought JDOM was a good location for building a solution to this 
> problem. I even created an 'issue' for it...
> https://github.com/hunterhacker/jdom/issues/26
>
> I later decided that JDOM was not necessarily the correct place to 
> solve that problem, so I 'rejected' that issue.
>
> But I have still been perplexed by this problem for a while now, and I 
> have taken some time in the past few weeks to tackle the problem, and 
> perhaps come up with a solution.
>
> Thus, I invite anyone interested to have a look at:
> https://github.com/rolfl/Resolver
>
> This project has the 'simple' purpose of behaving very much like a 
> caching proxy server for HTTP documents and exposing the cache as an 
> EntityResolver useful for SAX and other parsing.
>
> I decided to tackle the hard parts first: how do you build a 
> file-based cache in a multithreaded system, with the added complexity 
> that it needs to be accessible from multiple JVMs, not just threads 
> within one JVM?
>
> I figure that the code is too 'immature' to call 'stable', and it is 
> not a great fit for JDOM (since the solution has no code shared with 
> anything in JDOM, and it does not even process any XML...). So, 
> releasing it as part of JDOM2 is not appropriate, but its usefulness 
> is significant.
>
> So, if anyone is interested, I am eager to get some input on it...
>
> I think an attempt to make an 'easy to use' system for entity 
> resolving would be a benefit for the entire Java community. A system 
> that allows you to plug in a combination of in-memory cached entities, 
> combined with on-disk 'catalog' systems (perhaps leveraging the xerces 
> 'Resolver' project), then this 'Resolver' for caching non-catalog 
> resources, and finally a fall-through to more traditional URL-based 
> resolvers, would be ideal.
>
> Thanks
>
> Rolf