[jdom-interest] CDATA sections?

Fri Jun 2 08:26:39 PDT 2000

Kevin Regan wrote:
> 
> I think we are really talking about 2 different things
> here.  You seem to be focusing on the input and parsing
> part, whereas I am talking more about the building of a JDOM
> tree "by hand" and outputting XML.  For the former case,
> it is arguable whether or not this should be supported.
> However, for XML document _creation_, it is crucial.  I
> am sure that there is more than one "folk" that would
> want to be able to create XML documents that have CDATA
> sections.
> 
> We are talking about a tiny class here -- one that simply
> wraps a String object.  One that would not even get used
> if application writers did not build a JDOM tree that
> contained one.
> 
> My argument goes mostly for the XML application author (as
> you said).  However, it is too narrow a view to assume
> that this is a human being sitting in front of an XML editor.
> Many XML applications output XML documents -- most probably do
> not want to take the time to build DOM trees, so they just
> do something like:
> 
> System.out.println( "<FOO><![CDATA[<BAR>This is CDATA</BAR>]]></FOO>" );
> 
> With JDOM, it makes a lot of sense to build a JDOM tree to
> create the output document.  The above would not be possible
> with the current implementation.

No. It would be easier.

Element myElement = new Element("FOO");
myElement.setContent("<BAR>This is CDATA</BAR>");

Why make them create a CDATA? Our Outputter takes care of this for you.
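
For illustration, here is a minimal Java sketch of the kind of escaping an
outputter has to perform on element text before writing it. The names
EscapeSketch and escapeText are hypothetical and this is not JDOM's actual
code; it only handles the three characters that matter in element content.

```java
// Hypothetical helper, not JDOM's implementation: shows the escaping an
// XML outputter needs to apply to element text.
class EscapeSketch {

    /** Replaces the characters that are markup-significant in element content. */
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String content = "<BAR>This is CDATA</BAR>";
        // prints <FOO>&lt;BAR&gt;This is CDATA&lt;/BAR&gt;</FOO>
        System.out.println("<FOO>" + escapeText(content) + "</FOO>");
    }
}
```

A parser reading that output back hands the application the original
string, angle brackets included.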

-Brett

> 
> CDATA sections exist and lots of folks use it, not just
> this "folk"!  Please believe me!!! :-)
> 
> --Kevin
> 
> -----Original Message-----
> From: Brett McLaughlin [mailto:brett.mclaughlin at lutris.com]
> Sent: Thursday, June 01, 2000 6:14 PM
> To: Kevin Regan
> Cc: jdom-interest at jdom.org
> Subject: Re: [jdom-interest] CDATA sections?
> 
> Kevin Regan wrote:
> >
> > For output, I don't think you should take a subjective view
> > as to how folks should be building their applications.  That
> 
> We're not. We're taking a spec-compliant view. Nothing says we need to
> preserve aesthetics unless we want to. This is like saying we /have/ to
> be able to preserve location of namespace mappings. It's just not a
> statement we're willing to make.
> 
> > really isn't (or shouldn't be) your goal here.  Many applications
> > wish to output things using CDATA sections (for aesthetics and other
> > reasons).  JDOM is a much nicer API for building in-memory trees and
> > outputting the corresponding XML than is DOM (or simply outputting tag
> > names and content to a file).  It seems to me a much better way of
> > outputting XML data, and as such, should be able to output whatever
> > XML data that I want.  I don't think that XML application
> > writers will appreciate being told that they shouldn't use CDATA
> > sections because they are somehow "bad".
> 
> Didn't say that. We simply said that XML doesn't distinguish between
> CDATA and text with escaped characters. CDATA is a convenience for the
> document /author/, and /reader/, which in 80% (really more like 95%) of
> cases, are two applications. And it takes more processing time to
> process and transform CDATA than to do escaping. Again, this may go into
> a pretty printer, but not a core part of the API, unless people /really/
> convince me I'm wrong ;-)
> 
> >
> > As for the implementation, CDATA outputting could be added with
> > very minimal overhead (almost no overhead if application writers
> > never use the particular class).  We are probably talking about < 20
> > lines of code throughout the package.
> 
> Extra class, extra VM space, extra processing - it's not complexity, as
> I said before - it's use case. I still haven't seen one. You're talking
> completely abstractly, and I would need to see a concrete, realistic
> use-case. Ask the folks on the list - we make changes (gosh, we totally
> changed namespaces!), but we have to see real uses for it. I still
> haven't seen one - so far, it's just something you seem to want, without
> a real need.
> 
> >
> > Given this minimal effort and the fact that there is almost no
> > overhead added, I would argue that it is important to add this
> > feature so that folks can create XML documents with the features
> > that they would like to include.
> 
> No one else has wanted this! There is no /folks/, there is a /folk/ ;-)
> And not a single use-case yet, except mine (XML Editors)!
> 
> >
> > As for inputting and recognizing CDATA sections, this is more
> > complicated, but it is a feature offered on the "other" XML parsers.
> 
> No, it's recognized by SAX and DOM, which are APIs... parsers just
> implement the API. In our API, we don't require you to implement it. It
> has nothing to do with a parser... it has to do with the API - when I
> see there is a need for CDATA (which there may be, I keep saying) it
> will go in - but not until then, because many many people continually
> want light classes. We don't even let /methods/ in easily, let alone
> entire classes.
> 
> >
> > I would love to see JDOM appeal to as wide a range of Java/XML
> > programmers as possible (I'd like to use it myself).  Not allowing
> > folks to at least output CDATA sections goes a big step in the
> > opposite direction.
> 
> I'm not convinced of this - where are all these other upset people ;-)
> Maybe watching the Stanley Cup? I am too, it's between periods.
> 
> -Brett
> 
> >
> > --Kevin
> >
> > Kevin Regan wrote:
> > >
> > > My particular application aside, it seems to be very important to
> > > at least be able to create XML documents with CDATA sections.  I
> > > don't think anyone would have an argument with this.
> >
> > I would. CDATA sections are not /any/ different from text sections
> > with all entities escaped. It is the /same text/ to any application.
> > I don't care if it comes across as CDATA or text with lots of escaped
> > entities, unless I'm writing an XML Editor, which we're not ;-) Other
> > than that, I get the same character data in both cases, so I don't
> > care.
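
The "same character data" point can be checked with a short Java sketch.
It uses the JDK's built-in DOM parser (javax.xml.parsers) rather than JDOM,
and the class name CdataEquivalence and method textOf are hypothetical. It
parses one document that uses a CDATA section and one that uses escaped
text, then compares the text an application would read from each.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK DOM parser, not JDOM.
class CdataEquivalence {

    /** Parses the XML string and returns the root element's text content. */
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String withCdata   = "<FOO><![CDATA[<BAR>This is CDATA</BAR>]]></FOO>";
        String withEscapes = "<FOO>&lt;BAR&gt;This is CDATA&lt;/BAR&gt;</FOO>";

        // Both forms yield the string <BAR>This is CDATA</BAR>
        System.out.println(textOf(withCdata));
        System.out.println(textOf(withEscapes));
        System.out.println(textOf(withCdata).equals(textOf(withEscapes))); // true
    }
}
```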
> >
> > >
> > > As for my application, we have an annotation field where we store
> > > any text that the user wishes to store.  It is our goal to leave
> > > this in as much a human readable format as possible.  In general,
> > > this section is used to store an HTML snippet or an XML section
> > > that should not be parsed.  It is true that one can output such
> > > data without using a CDATA section, but it sure does get ugly.
> >
> > But who is looking at the XML? I'm OK with, on a pretty print
> > XMLOutputter, doing this. But there's no reason to convert to CDATA
> > unless there is a really compelling need to, and I have yet to see
> > one. Are you really showing your users the XML that is generated?
> >
> > > We also need to read in these documents, modify them, and then
> > > write them back out.  We don't want to modify the CDATA section
> > > for the same reason as just stated.
> >
> > You don't - but whether they get outputted as CDATA or not shouldn't
> > affect your application - we had this discussion a while back - an
> > application that requires CDATA isn't a good application, more or
> > less ;-)
> >
> > >
> > > Implementing the output part is very easy.  You just need to add a
> > > CDATA class to JDOM that folks would use instead of simple Strings.
> >
> > I've yet to be convinced of this. Adding a class to JDOM is a very
> > non-trivial thing - we are a very small API, and people really like
> > that. Adding this class has yet to be something I see that makes
> > sense - the complexity isn't really an issue - the reasoning behind
> > it is.
> >
> > > The XMLOutputter class would recognize this and output the
> > > appropriate (unescaped) data with the CDATA wrapper.
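
A rough Java sketch of that proposal, under stated assumptions: the CDATA
wrapper class, writeCdata, writeElement and escapeText below are all
hypothetical names and do not describe any existing JDOM class. The output
routine checks each content item and writes wrapper objects as CDATA
sections and plain Strings as escaped text. The one subtlety is that the
sequence "]]>" cannot appear inside a CDATA section, so it is split across
two adjacent sections.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposal in the thread; not JDOM's API.
class CdataOutputSketch {

    /** The "tiny class" from the proposal: it only wraps a String. */
    static final class CDATA {
        private final String text;
        CDATA(String text) { this.text = text; }
        String getText() { return text; }
    }

    /** Escapes markup-significant characters in ordinary text content. */
    static String escapeText(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /** Writes a CDATA section, splitting any "]]>" so the section stays valid. */
    static String writeCdata(CDATA cdata) {
        String safe = cdata.getText().replace("]]>", "]]]]><![CDATA[>");
        return "<![CDATA[" + safe + "]]>";
    }

    /** Writes an element whose content mixes Strings and CDATA wrappers. */
    static String writeElement(String name, List<?> content) {
        StringBuilder out = new StringBuilder("<").append(name).append(">");
        for (Object item : content) {
            if (item instanceof CDATA) {
                out.append(writeCdata((CDATA) item));   // unescaped, wrapped
            } else {
                out.append(escapeText(item.toString())); // escaped text
            }
        }
        return out.append("</").append(name).append(">").toString();
    }

    public static void main(String[] args) {
        String snippet = "<BAR>This is CDATA</BAR>";
        // prints <FOO><![CDATA[<BAR>This is CDATA</BAR>]]></FOO>
        System.out.println(writeElement("FOO", Arrays.asList(new CDATA(snippet))));
        // prints <FOO>&lt;BAR&gt;This is CDATA&lt;/BAR&gt;</FOO>
        System.out.println(writeElement("FOO", Arrays.asList(snippet)));
    }
}
```

Applications that never construct the wrapper take only the escaping
branch, which matches the claim that existing code would be unaffected.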
> > >
> > > As for reading in CDATA sections, it would be a little more
> > > difficult.  When the parser reported a CDATA section, we would need
> > > to create a CDATA object rather than just a String and include that
> > > in the element content.  However, this would make it easier for you
> > > to not do character normalization on CDATA sections.
> >
> > Again, I don't see a need. And normalization of text already occurs -
> > it isn't a problem in JDOM.
> >
> > -Brett
> >
> > >
> > > As I stated before, I definitely do not think that this should be
> > > the default behavior for input.  Rather, it should be a builder
> > > flag.  As for output, there is no real concept of a particular
> > > mode -- the CDATA class can be added quite easily without
> > > modifying any existing applications.
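
As a point of comparison for the "builder flag" idea, the JDK's own DOM
API has such a switch: DocumentBuilderFactory.setCoalescing. The Java
sketch below shows that flag on the JDK parser only. It says nothing about
how a JDOM builder would or should expose the choice, and the class name
CoalescingFlagDemo and method firstChildType are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustration with the JDK DOM parser; not a JDOM builder.
class CoalescingFlagDemo {

    /** Parses the XML and returns the DOM node type of the root's first child. */
    static short firstChildType(String xml, boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // true: CDATA sections are converted to (and merged with) text nodes
            // false: CDATA sections are kept as separate CDATA nodes
            factory.setCoalescing(coalescing);
            Node root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getFirstChild().getNodeType();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FOO><![CDATA[<BAR>This is CDATA</BAR>]]></FOO>";
        System.out.println(
                firstChildType(xml, false) == Node.CDATA_SECTION_NODE); // true
        System.out.println(
                firstChildType(xml, true) == Node.TEXT_NODE);           // true
    }
}
```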
> > >
> > > --Kevin
> > >
> > > > > What's the reasoning behind CDATA sections (input) needing to be
> > > > > outputted /as/ CDATA? The XML spec. doesn't mandate that - as long
> > > > > as the character entities are escaped, so that they are translated
> > > > > correctly, there is no difference in a CDATA with no escaping and
> > > > > a normal text with escaping. I'm not saying we won't do this - I
> > > > > just don't see why it matters? If your application depends on CDATA
> > > > > explicitly, then it's not really doing what an XML app should...
> > > > > let's hear your use-case.
> > >
> > > -Brett
> > >
> > > >
> > > > --Kevin
> > >
> > > --
> > > Brett McLaughlin, Enhydra Strategist
> > > Lutris Technologies, Inc.
> > > 1200 Pacific Avenue, Suite 300
> > > Santa Cruz, CA 95060 USA
> > > http://www.lutris.com
> > > http://www.enhydra.org
> >
> 
