From Ronny at bigfunfitness.de Sat Apr 6 05:16:20 2013
From: Ronny at bigfunfitness.de (Ronny)
Date: Sat, 06 Apr 2013 14:16:20 +0200
Subject: [jdom-interest] Validate XML by XSD failes with jDOM and Xerces 2.10.0
Message-ID: <51601214.1040200@bigfunfitness.de>

Hi list,

I've got a problem validating an XML file against an XSD.

A short preview of the XSD:

OK, my XML file looks like this:

...

Now I try to validate this with JDOM using Xerces, and I get the
following error:

- Error on line 6: cvc-complex-type.3.2.2: Attribute "MyAttribute" must not occur in element "MyType2".

I configured JDOM 1 to use Xerces:

SAXBuilder builder = new SAXBuilder("org.apache.xerces.parsers.SAXParser", true);

I validated the same XML against the same XSD with oXygen (which also
uses Xerces) and with some online validators. All of them told me that
the XML is valid.

So my question is: why does the Java implementation tell me that the
XML is not valid?

I'm grateful for any hints. Thanks!

Best regards,
Ronny

From patrick.dowler at nrc-cnrc.gc.ca Thu Apr 11 10:29:55 2013
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Thu, 11 Apr 2013 10:29:55 -0700
Subject: [jdom-interest] streaming document output
Message-ID: <5166F313.7060402@nrc-cnrc.gc.ca>

We have a few web services that send XML documents in the response.
The documents can be large, and when they are, there is always one spot
where there is an arbitrarily long list of child elements.

With jdom1 we had implemented a subclass of Element for the element
with the long list of child elements and then had the iterator over
that list dynamically generate the children. Since the XMLOutputter
used indexed access rather than the iterator, we also had to subclass
it and override the list access. That worked fine at the time.

Now we are porting to jdom2 and I see that the outputter still uses
indexed access; that is a shame given all the comments in the code
about how the iterator is generally better than having to call size()
on the lists.
It would be really nice, and would enable people to implement
customisations, if jdom2 used the iterators rather than the indexing
throughout the codebase. Is that a lot of work?

The further problem we have right now is that XMLOutputter is final so
we can't trivially port our jdom1 code. Is implementing a custom
XMLOutputProcessor the right place to do that? The change we'd be
making is to change it to use iterators... is that something that
should go into the core library?

For XMLOutputProcessor, I am looking specifically at these methods:

process(Writer,Format,Element)
process(Writer,Format,List)

Is that the place to change to iterators?

--
Patrick Dowler
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9A 2L9

250-363-0044 (office) 250-363-0045 (fax)

From jdom at tuis.net Thu Apr 11 11:15:17 2013
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 11 Apr 2013 14:15:17 -0400
Subject: [jdom-interest] streaming document output
In-Reply-To: <5166F313.7060402@nrc-cnrc.gc.ca>
References: <5166F313.7060402@nrc-cnrc.gc.ca>
Message-ID: <5166FDB5.7030300@tuis.net>

Hi Patrick.

OK, you have a long list of child Elements, and you generate them
on-the-fly during output. You had to fudge things in JDOM 1 too,
because the outputter used indexed access there as well. Now JDOM 2.x
is still not using the iterator....

... OK, I see the problem. I think the right fix would be for the
XMLOutputter to use the iterator (yes, they are faster in JDOM 2 than
1.x, but maybe not as fast as indexed access)....

As a side note, yes, XMLOutputter is final, by design. The bulk of the
logic is 'exported' to the interface XMLOutputProcessor, and there's a
'nearly' concrete implementation of that -
http://jdom.org/docs/apidocs/org/jdom2/output/support/AbstractXMLOutputProcessor.html
- read the comments. If a fix for the core JDOM code is not enough, you
will likely want to extend the AbstractXMLOutputProcessor and override
printElement(...).
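
A minimal sketch of the indexed-versus-iterator point in this thread,
using only java.util types rather than JDOM classes. StreamingSketch,
StreamedChildren, writeWithIterator and writeWithIndex are hypothetical
names for illustration; this is not how JDOM's outputter or Walker
classes are written. It assumes a child list whose size is unknown
until its source has been drained, which is the situation described
above.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/**
 * Sketch only: plain java.util types stand in for JDOM content lists.
 * None of these names exist in JDOM; they are hypothetical.
 */
final class StreamingSketch {

    /** A child list fed by a one-shot source; its length is unknown up front. */
    static final class StreamedChildren extends AbstractList<String> {
        private final Iterator<String> source;

        StreamedChildren(Iterator<String> source) {
            this.source = source;
        }

        @Override
        public Iterator<String> iterator() {
            // Children are handed out one at a time, as the source produces them.
            return source;
        }

        @Override
        public int size() {
            throw new UnsupportedOperationException("size is unknown until the source is drained");
        }

        @Override
        public String get(int index) {
            throw new UnsupportedOperationException("no random access to streamed children");
        }
    }

    /** Iterator-based output: never asks for size() or get(i). */
    static String writeWithIterator(Iterable<String> children) {
        StringBuilder out = new StringBuilder();
        for (String child : children) {
            out.append(child).append('\n');
        }
        return out.toString();
    }

    /** Index-based output: needs size() and get(i), so it cannot stream. */
    static String writeWithIndex(List<String> children) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < children.size(); i++) {
            out.append(children.get(i)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("<row id=\"1\"/>", "<row id=\"2\"/>");

        // Works: the writer only pulls from the iterator.
        System.out.print(writeWithIterator(new StreamedChildren(rows.iterator())));

        // Fails: the writer calls size() before it can emit anything.
        try {
            writeWithIndex(new StreamedChildren(rows.iterator()));
        } catch (UnsupportedOperationException e) {
            System.out.println("indexed access failed: " + e.getMessage());
        }
    }
}
```

With an iterator-only writer the children can be produced one at a
time. An index-based writer has to call size() first, which a streamed
list cannot answer without buffering everything.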
I wrote a little blurb on why I changed the XMLOutputter to be final
here:
https://github.com/hunterhacker/jdom/wiki/JDOM2-Feature-Outputter-Updates

Let me inspect the code for where you believe the index-based lookups
are.... if you have a pointer for where I should start that will
help....

Ahh, it's in the Walker classes... that is a little bit 'hairy'.

Let me play with it a little bit.

Rolf
From jdom at tuis.net Thu Apr 11 11:30:19 2013
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 11 Apr 2013 14:30:19 -0400
Subject: [jdom-interest] streaming document output
In-Reply-To: <5166F313.7060402@nrc-cnrc.gc.ca>
References: <5166F313.7060402@nrc-cnrc.gc.ca>
Message-ID: <5167013B.6080707@tuis.net>

Oh, are you using 'raw' output, or are you making it 'pretty' or other
format?

Rolf
From patrick.dowler at nrc-cnrc.gc.ca Thu Apr 11 13:28:48 2013
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Thu, 11 Apr 2013 13:28:48 -0700
Subject: [jdom-interest] streaming document output
In-Reply-To: <5167013B.6080707@tuis.net>
References: <5166F313.7060402@nrc-cnrc.gc.ca> <5167013B.6080707@tuis.net>
Message-ID: <51671D00.10503@nrc-cnrc.gc.ca>

We normally use Format.getPrettyFormat()

Patrick
From jdom at tuis.net Thu Apr 11 19:25:42 2013
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 11 Apr 2013 22:25:42 -0400
Subject: [jdom-interest] streaming document output
In-Reply-To: <51671D00.10503@nrc-cnrc.gc.ca>
References: <5166F313.7060402@nrc-cnrc.gc.ca> <5167013B.6080707@tuis.net> <51671D00.10503@nrc-cnrc.gc.ca>
Message-ID: <516770A6.6040704@tuis.net>

OK, that means I have had to alter the more complicated
AbstractFormattedWalker class.

I have a test build I would like you to run. Is this an option? Can I
email it to you? It passes all my JUnit tests, and uses just an
iterator for all the output.

Thanks

Rolf
From patrick.dowler at nrc-cnrc.gc.ca Tue Apr 23 12:29:35 2013
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Tue, 23 Apr 2013 12:29:35 -0700
Subject: [jdom-interest] streaming document output
In-Reply-To: <516885B4.6030401@tuis.net>
References: <5166F313.7060402@nrc-cnrc.gc.ca> <5167013B.6080707@tuis.net> <51671D00.10503@nrc-cnrc.gc.ca> <516770A6.6040704@tuis.net> <51683070.4050007@nrc-cnrc.gc.ca> <516885B4.6030401@tuis.net>
Message-ID: <5176E11F.9030906@nrc-cnrc.gc.ca>

Rolf,

The iterator-based code worked fine for our use case and we can stream
a dynamic document. Yay!

Sometime this week I am going to integrate this into another system
which has the potential to stream lots of data; at that point I'll be
able to determine whether memory consumption remains modest.

So -- this is excellent. I'll use the jar you sent me for now and watch
for the next release.

Thanks again,

Pat

On 04/12/2013 03:07 PM, Rolf Lear wrote:
> See attached. - Jar, and the two files I changed (just in case).
>
> Please try it...
> it is iterator-only, passes my tests (except one which
> I have updated now because this code produces a better result in one
> extreme use case....)
>
> Rolf
>
> On 12/04/2013 12:04 PM, Patrick Dowler wrote:
>>
>> Sure, email me the build and we can try it out pretty easily.
>>
>> thanks Rolf!!
>>
>> Pat
From jdom at tuis.net Mon Apr 29 05:37:54 2013
From: jdom at tuis.net (Rolf Lear)
Date: Mon, 29 Apr 2013 08:37:54 -0400
Subject: [jdom-interest] Planning for JDOM 2.1.x
In-Reply-To: <50EC2A90.9090607@tuis.net>
References: <50EC2A90.9090607@tuis.net>
Message-ID: <517E69A2.3030808@tuis.net>

Hi all.

An update to the JDOM 2.1.x process is as follows...

JDOM 2.0.x has been branched off from the master branch in GitHub (it
has been branched for a while). Essentially JDOM 2.0.x versions are now
in maintenance mode (fully supported still, but no new features).

The 'master' branch in GitHub contains the work going into the future
JDOM 2.1.x release stream.
Currently the list of items scheduled for inclusion in JDOM 2.1.x is:
- improved StAX support
- extended XPath API that allows for 'indexed' JDOM Documents that in
  turn will allow for improved XPath performance
- use Iterable<...> inputs to many of the bulk methods ( like
  addAll(Iterable content) ) which makes JDOM more friendly in some
  cases
- NamespaceStack has been updated with some new query methods, and one
  new 'push' method.
- native support for Saxon 9.5 HE (just released) -> faster XPaths, as
  well as faster XSLT and extended support for XPath 2.0, etc.

These items are all already committed to the master branch, or are in a
partially implemented state.

I expect to be releasing a Beta version of 2.1.x in a few weeks, but I
am looking for people to test the code too.

If anyone has an interest in being more involved in the planning and
implementation of these items, please speak up.

Thanks

Rolf

On 08/01/2013 9:17 AM, Rolf Lear wrote:
> Hi all.
>
> There are a few new features for JDOM that should be pushed to a 2.1.x
> version. These items include:
> - an extension to the XPathFactory API that allows for an
> implementation to reuse a 'compiled' or 'preprocessed' version of the
> JDOMDocument for the XPath lookups.
> - additional XMLStream* implementations (StAX) to output/build JDOM
> content that would allow JDOM to be a natural storage target for the
> native Java JAXB processes.
> - let JDOM Core classes accept Iterable<...> instances (for example,
> Element.addAll(Collection) could rather be
> Element.addAll(Iterable))
> - Saxon is anticipating a new version soon, and it would be great to
> formalize support for using Saxon for XSLT transformation and XPath
> evaluation.
> - I have some extensions I would like to make to the NamespaceStack
> class to facilitate easier access to Namespace logic.
> - for
>
> Depending on other factors, I think it would be reasonable to target a
> 2.1.0 release around April Fool's day.
> That gives some time for some
> beta versions for some field testing. Also, hopefully the 9.5 Saxon
> release will be out by then.
>
> I have been working with Saxon to integrate their API for XPath
> evaluation in JDOM, but there are a couple of technical issues in JDOM
> and the way that existing XPath evaluation is done that make it
> slightly incompatible with the Saxon (and XPath) API. There is a lot
> of common ground though, and using native Saxon for JDOM/XSLT makes
> sense, but will need some tweaks for use with JDOM XPath evaluation.
>
> I have also been working with Gordon Burgett on the new StAX Support
> based on his code submission on GitHub, and I also have code ready for
> the extensions to the XPath API. The Iterable changes would be
> relatively easy to implement, and the NamespaceStack class would be a
> simple extension with no existing logic changes (and the work is
> mostly done).
>
> I am looking for input on any other features or updates that you would
> like included in a 2.1 version. For example, is anyone using any code
> from the contrib area, and should that contrib code be 'productionized'?
>
> Thanks
>
> Rolf