From Ronny at bigfunfitness.de Sat Apr 6 05:16:20 2013
From: Ronny at bigfunfitness.de (Ronny)
Date: Sat, 06 Apr 2013 14:16:20 +0200
Subject: [jdom-interest] Validate XML by XSD fails with jDOM and Xerces
2.10.0
Message-ID: <51601214.1040200@bigfunfitness.de>
Hi list,
I have a problem validating an XML file against an XSD.
A short preview of the XSD:
Ok, my XML file looks like this:
...
Now I try to validate this with jDOM using Xerces, and I get the
following error:
- Error on line 6: cvc-complex-type.3.2.2: Attribute "MyAttribute" must
not occur in element "MyType2".
I configured jDOM1 to use xerces:
SAXBuilder builder = new
SAXBuilder("org.apache.xerces.parsers.SAXParser", true);
I validated the same XML against the same XSD with oXygen (which also uses
Xerces) and with some online validators. All of them reported that the XML
is valid.
So my question is: why does the Java implementation tell me that
the XML is not valid?
I'm grateful for any hints.
Thanks!
Best regards,
Ronny
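One way to narrow this down is to validate the same document with the JDK's
own javax.xml.validation API, outside jDOM, and compare the reported errors.
The following is only a minimal sketch: the inline schema and the class name
XsdCrossCheck are hypothetical stand-ins (the original XSD is not shown in
this message), and only the names MyType2 and MyAttribute are taken from the
error above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class XsdCrossCheck {

    // Hypothetical schema (assumption): MyType2 declares no attributes at all.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='MyType2'><xs:complexType/></xs:element>"
      + "</xs:schema>";

    /** Validates xml against xsd and returns all reported problems. */
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // Collect errors instead of stopping at the first one.
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { problems.add(describe(e)); }
                @Override public void error(SAXParseException e) { problems.add(describe(e)); }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("validation could not be completed", e);
        }
        return problems;
    }

    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(validate(XSD, "<MyType2/>"));
        System.out.println(validate(XSD, "<MyType2 MyAttribute='x'/>"));
    }
}
```

If a check like this accepts the real files while the SAXBuilder run rejects
them, that would suggest the two runs are not using the same schema or the
same parser configuration.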
From patrick.dowler at nrc-cnrc.gc.ca Thu Apr 11 10:29:55 2013
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Thu, 11 Apr 2013 10:29:55 -0700
Subject: [jdom-interest] streaming document output
Message-ID: <5166F313.7060402@nrc-cnrc.gc.ca>
We have a few web services that send XML documents in the response. The
documents can be large and, when they are, there is always one spot where
there is an arbitrarily long list of child elements.
With jdom1 we had implemented a subclass of Element for the element with
the long list of child elements and then had the iterator over that list
dynamically generate the children. Since the XMLOutputter used indexed
access rather than the iterator, we also had to subclass it and override
the list access. That worked fine at the time.
Now we are porting to jdom2 and I see that the outputter still uses
indexed access; that is a shame given all the comments in the code about
how the iterator is generally better than having to call size() on the
lists. It would be really nice and enable people to implement
customisations if jdom2 used the iterators rather than the indexing
throughout the codebase. Is that a lot of work?
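The difference between the two access styles can be sketched without any
JDOM classes. In this illustration, GeneratedRows, writeWithIterator and
writeWithIndex are hypothetical names, and plain strings stand in for child
elements; it is not the JDOM outputter code.

```java
import java.util.AbstractList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class StreamingChildren {

    /** A "list" whose iterator produces each child on demand (hypothetical). */
    static final class GeneratedRows extends AbstractList<String> {
        private final int count;

        GeneratedRows(int count) { this.count = count; }

        @Override public Iterator<String> iterator() {
            return new Iterator<String>() {
                private int next = 0;
                @Override public boolean hasNext() { return next < count; }
                @Override public String next() {
                    if (!hasNext()) throw new NoSuchElementException();
                    return "<row id=\"" + (next++) + "\"/>";
                }
            };
        }

        // Indexed access is deliberately unsupported here, to model a source
        // (such as a result cursor) that can only be read front to back.
        @Override public String get(int index) {
            throw new UnsupportedOperationException("stream only");
        }

        @Override public int size() {
            throw new UnsupportedOperationException("unknown until exhausted");
        }
    }

    /** Iterator-based output: works with generated children. */
    static void writeWithIterator(StringBuilder out, String name, Iterable<String> children) {
        out.append('<').append(name).append(">\n");
        for (String child : children) {
            out.append("  ").append(child).append('\n');
        }
        out.append("</").append(name).append(">\n");
    }

    /** Index-based output: needs size() and get(i), so it fails on GeneratedRows. */
    static void writeWithIndex(StringBuilder out, String name, List<String> children) {
        out.append('<').append(name).append(">\n");
        for (int i = 0; i < children.size(); i++) {
            out.append("  ").append(children.get(i)).append('\n');
        }
        out.append("</").append(name).append(">\n");
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        writeWithIterator(out, "table", new GeneratedRows(3));
        System.out.print(out);
    }
}
```

An outputter written like writeWithIndex forces the custom list to support
size() and get(i), which is why the indexed access had to be overridden as
well.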
The further problem we have right now is that XMLOutputter is final so
we can't trivially port our jdom1 code. Is implementing a custom
XMLOutputProcessor the right place to do that? The change we'd be making
is to change it to use iterators... is that something that should go
into the core library?
For XMLOutputProcessor, I am looking specifically at these methods:
process(Writer,Format,Element)
process(Writer,Format,List)
Is that the place to change to iterators?
--
Patrick Dowler
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9A 2L9
250-363-0044 (office) 250-363-0045 (fax)
From jdom at tuis.net Thu Apr 11 11:15:17 2013
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 11 Apr 2013 14:15:17 -0400
Subject: [jdom-interest] streaming document output
In-Reply-To: <5166F313.7060402@nrc-cnrc.gc.ca>
References: <5166F313.7060402@nrc-cnrc.gc.ca>
Message-ID: <5166FDB5.7030300@tuis.net>
Hi Patrick.
OK, you have a long list of child Elements, and you generate them
on-the-fly during output. You also fudge the same in JDOM 1 by using
indexed access too. Now JDOM 2.x is not using the iterator....
... OK, I see the problem. I think the right fix would be for the
XMLOutputter to use the iterator (yes, they are faster in JDOM 2 than
1.x, but maybe not as fast as indexed access)....
As a side note, yes, XMLOutputter is final, by design. The bulk of the
logic is 'exported' to the interface XMLOutputProcessor, and there's a
'nearly' concrete implementation of that -
http://jdom.org/docs/apidocs/org/jdom2/output/support/AbstractXMLOutputProcessor.html
- read the comments. If a fix for the core JDOM code is not enough, you
will likely want to extend the AbstractXMLOutputProcessor and override
printElement(...).
I wrote a little blurb on why I changed the XMLOutputter to be final
here:
https://github.com/hunterhacker/jdom/wiki/JDOM2-Feature-Outputter-Updates
Let me inspect the code for where you believe the index-based lookups
are.... if you have a pointer for where I should start that will
help.... Ahh, it's in the Walker classes... that is a little bit
'hairy'. Let me play with it a little bit.
Rolf
From jdom at tuis.net Thu Apr 11 11:30:19 2013
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 11 Apr 2013 14:30:19 -0400
Subject: [jdom-interest] streaming document output
In-Reply-To: <5166F313.7060402@nrc-cnrc.gc.ca>
References: <5166F313.7060402@nrc-cnrc.gc.ca>
Message-ID: <5167013B.6080707@tuis.net>
Oh, are you using 'raw' output, or are you making it 'pretty' or other
format?
Rolf
From patrick.dowler at nrc-cnrc.gc.ca Thu Apr 11 13:28:48 2013
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Thu, 11 Apr 2013 13:28:48 -0700
Subject: [jdom-interest] streaming document output
In-Reply-To: <5167013B.6080707@tuis.net>
References: <5166F313.7060402@nrc-cnrc.gc.ca> <5167013B.6080707@tuis.net>
Message-ID: <51671D00.10503@nrc-cnrc.gc.ca>
We normally use Format.getPrettyFormat()
Patrick
On 04/11/2013 11:30 AM, Rolf Lear wrote:
> Oh, are you using 'raw' output, or are you making it 'pretty' or other
> format?
>
> Rolf
From jdom at tuis.net Thu Apr 11 19:25:42 2013
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 11 Apr 2013 22:25:42 -0400
Subject: [jdom-interest] streaming document output
In-Reply-To: <51671D00.10503@nrc-cnrc.gc.ca>
References: <5166F313.7060402@nrc-cnrc.gc.ca> <5167013B.6080707@tuis.net>
<51671D00.10503@nrc-cnrc.gc.ca>
Message-ID: <516770A6.6040704@tuis.net>
OK, that means I have had to alter the more complicated
AbstractFormattedWalker class.
I have a test build I would like you to run. Is this an option? Can I
email it to you? It passes all my JUnit tests, and uses just an iterator
for all the output.
Thanks
Rolf
On 11/04/2013 4:28 PM, Patrick Dowler wrote:
>
> We normally use Format.getPrettyFormat()
>
> Patrick
From patrick.dowler at nrc-cnrc.gc.ca Tue Apr 23 12:29:35 2013
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Tue, 23 Apr 2013 12:29:35 -0700
Subject: [jdom-interest] streaming document output
In-Reply-To: <516885B4.6030401@tuis.net>
References: <5166F313.7060402@nrc-cnrc.gc.ca> <5167013B.6080707@tuis.net>
<51671D00.10503@nrc-cnrc.gc.ca> <516770A6.6040704@tuis.net>
<51683070.4050007@nrc-cnrc.gc.ca> <516885B4.6030401@tuis.net>
Message-ID: <5176E11F.9030906@nrc-cnrc.gc.ca>
Rolf,
The iterator based code worked fine for our use case and we can stream a
dynamic document. Yay! Sometime this week I am going to integrate this
into another system which has the potential to stream lots of data; at
that point I'll be able to determine whether memory consumption remains modest.
So -- this is excellent. I'll use the jar you sent me for now and watch
for the next release.
Thanks again,
Pat
On 04/12/2013 03:07 PM, Rolf Lear wrote:
> See attached. - Jar, and the two files I changed (just in case).
>
> Please try it... it is iterator-only, passes my tests (except one which
> I have updated now because this code produces a better result in one
> extreme use case....)
>
> Rolf
>
> On 12/04/2013 12:04 PM, Patrick Dowler wrote:
>>
>> Sure, email me the build and we can try it out pretty easily.
>>
>> thanks Rolf!!
>>
>> Pat
>>
From jdom at tuis.net Mon Apr 29 05:37:54 2013
From: jdom at tuis.net (Rolf Lear)
Date: Mon, 29 Apr 2013 08:37:54 -0400
Subject: [jdom-interest] Planning for JDOM 2.1.x
In-Reply-To: <50EC2A90.9090607@tuis.net>
References: <50EC2A90.9090607@tuis.net>
Message-ID: <517E69A2.3030808@tuis.net>
Hi all.
An update to the JDOM 2.1.x process is as follows...
JDOM 2.0.x has been branched off from the master branch in GitHub (it
has been branched for a while). Essentially JDOM 2.0.x versions are now
in maintenance mode (fully supported still, but no new features).
The 'master' branch in GitHub contains the work going into the future
JDOM 2.1.x release stream.
Currently, the items scheduled for inclusion in JDOM 2.1.x are:
- improved StAX support
- extended XPath API that allows for 'indexed' JDOM Documents that in
turn will allow for improved XPath performance
- use Iterable<...> inputs to many of the bulk methods (like
addAll(Iterable<? extends Content> content)), which makes JDOM more
friendly in some cases
- NamespaceStack has been updated with some new query methods, and one
new 'push' method.
- native support for Saxon 9.5 HE (just released) -> faster XPaths, as
well as faster XSLT and extended support for XPath 2.0, etc.
These items are all already committed to the master branch, or are in a
partially implemented state.
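To illustrate why an Iterable-based bulk method is friendlier, here is a
small generic sketch. It is not JDOM code: the class IterableBulkAdd and its
addAll helper are hypothetical, and the final JDOM 2.1.x signatures may
differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class IterableBulkAdd {

    /** Appends every item of source to target and returns how many were added. */
    static <T> int addAll(List<T> target, Iterable<? extends T> source) {
        int added = 0;
        for (T item : source) {
            target.add(item);
            added++;
        }
        return added;
    }

    public static void main(String[] args) {
        List<Number> target = new ArrayList<>();

        // An ordinary Collection still works, because every Collection is Iterable.
        addAll(target, Arrays.asList(1, 2));

        // So does a source that is not a Collection at all, such as a
        // lambda that hands out a fresh iterator on each call.
        Iterable<Integer> lazy = () -> Stream.of(3, 4).iterator();
        addAll(target, lazy);

        System.out.println(target);
    }
}
```

A parameter typed as Collection<? extends T> would reject the second call,
so the caller would first have to copy the items into a temporary list.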
I expect to be releasing a Beta version of 2.1.x in a few weeks, but I
am looking for people to test the code too.
If anyone has an interest in being more involved in the planning and
implementation of these items, please speak up.
Thanks
Rolf
On 08/01/2013 9:17 AM, Rolf Lear wrote:
> Hi all.
>
> There are a few new features for JDOM that should be pushed to a 2.1.x
> version. These items include:
> - an extension to the XPathFactory API that allows for an
> implementation to reuse a 'compiled' or 'preprocessed' version of the
> JDOMDocument for the XPath lookups.
> - additional XMLStream* implementations (StAX) to output/build JDOM
> content that would allow JDOM to be a natural storage target for the
> native Java JAXB processes.
> - let JDOM Core classes accept Iterable<...> instances (for example,
> Element.addAll(Collection<? extends Content>) could rather be
> Element.addAll(Iterable<? extends Content>))
> - Saxon is anticipating a new version soon, and it would be great to
> formalize support for using Saxon for XSLT transformation and XPath
> evaluation.
> - I have some extensions I would like to make to the NamespaceStack
> class to facilitate easier access to Namespace logic.
> - for
>
> Depending on other factors, I think it would be reasonable to target a
> 2.1.0 release around April Fool's day. That gives some time for some
> beta versions for some field testing. Also, hopefully the 9.5 Saxon
> release will be out by then.
>
> I have been working with Saxon to integrate their API for XPath
> evaluation in JDOM, but there are a couple of technical issues in JDOM
> and the way that existing XPath evaluation is done that make it
> slightly incompatible with the Saxon (and XPath) API. There is a lot
> of common ground though, and using native Saxon for JDOM/XSLT makes
> sense, but will need some tweaks for use with JDOM XPath evaluation.
>
> I have also been working with Gordon Burgett on the new StAX Support
> based on his code submission on GitHub, and I also have code ready for
> the extensions to the XPath API. The Iterable changes would be
> relatively easy to implement, and the NamespaceStack class would be a
> simple extension with no existing logic changes (and the work is
> mostly done).
>
> I am looking for input on any other features or updates that you would
> like included in a 2.1 version. For example, is anyone using any code
> from the contrib area, and should that contrib code be 'productionized'?
>
> Thanks
>
> Rolf