[jdom-interest] getMixedContent -> getContent

Amy Lewis amyzing at talsever.com
Wed Jun 27 19:13:27 PDT 2001


On Wed, Jun 27, 2001 at 04:34:13PM -0700, guru at stinky.com wrote:
>On Wed, Jun 27, 2001 at 01:04:23PM -0700, Jason Hunter wrote:
>> > Yes, it's always been so.  IMHO, we should do an s/Children/Elements/
>> > on the API.
>> 
>> No one took the bait so you're starting the debate on your own, eh?  :-)
>
>Hey, if you hadn't replied I'd have rebutted myself. :-)
>
>> Here's a little test to help answer the question:
>> 
>> <root>
>>   <!-- The sky is blue -->
>>   <sky color="blue"/>
>> </root>
>> 
>> How many children does root have?  
>> 
>> The current getChildren() naming model says it has one child.
>
>But that hides the fact that there is also a whitespace node, a
>comment node, another whitespace node, an element node, and yet
>another whitespace node, that all have the root element as their
>parent.  Doesn't this mean that parenthood and childhood are not
>transitive?  If A is the parent of B, then B is the child of A, right?

Not true in XPath.  In XPath, Attribute nodes have parents (which are
always Element nodes), but Attribute nodes are not considered to be
children of Element (they're on a different axis; the child axis in
XPath includes elements, text, comments, and PIs, with CDATA sections
and expanded entity references showing up only as text, but not
attributes, and not namespaces).  XPath, admittedly, has a rather
warped view of the world (namespace declarations aren't really
attributes, either).
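
To make that asymmetry concrete, here is a minimal sketch using the
JDK's javax.xml.xpath package.  That package is my choice for
illustration and is not part of JDOM; the class name XPathAxes and its
helper methods are made up for this example.  It runs a few XPath 1.0
expressions against the <root> document quoted above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of JDOM or any library.
class XPathAxes {
    // The sample document from the thread, with its formatting whitespace.
    static final String XML =
        "<root>\n  <!-- The sky is blue -->\n  <sky color=\"blue\"/>\n</root>";

    // Evaluates an XPath expression that yields a number.
    static int number(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Double n = (Double) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates an XPath expression and returns its string value.
    static String string(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The attribute exists and its parent is the sky element...
        System.out.println(number("count(/root/sky/@color)"));     // 1
        System.out.println(string("name(/root/sky/@color/..)"));   // sky
        // ...yet sky has no nodes on its child axis.
        System.out.println(number("count(/root/sky/child::node())")); // 0
        // root's child axis: text, comment, text, element, text.
        System.out.println(number("count(/root/child::node())"));  // 5
        // Only one of those children is an element.
        System.out.println(number("count(/root/*)"));              // 1
    }
}
```

So the color attribute reports sky as its parent, while sky reports no
children at all.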

BTW, there are many times when I, grinding out the code for yet-another
damned tree-walk, really, really, *really* want to ignore all that
extra *cruft*, because all I care about are attributes, elements, and
text.  I know that preserving accuracy is important, but

<earth>
  <sky />
</earth>

scans to me as a single child element, and it continually irritates me
(and requires me to take the time to fix the new folks' mistakes, every
time, using DOM) that I can't *ignore* the whitespace used for
formatting around that single interesting child node.  I have to
discard it on purpose, every time I see it, instead of being able to
say "ignore anything that isn't ...".
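
A minimal sketch of the discard-it-on-purpose chore, using the W3C DOM
classes that ship with the JDK.  The class ElementChildren and its
methods parseRoot and childElements are hypothetical names for this
example, not an existing API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, written only to illustrate the point.
class ElementChildren {
    // Parses a string with a default (non-validating) DOM parser
    // and returns the document element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns only the element children, skipping whitespace text,
    // comments, PIs, and everything else.
    static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Element earth = parseRoot("<earth>\n  <sky />\n</earth>");
        // DOM reports the two whitespace text nodes as well as <sky/>.
        System.out.println(earth.getChildNodes().getLength()); // 3
        // The filter leaves the single interesting child.
        System.out.println(childElements(earth).size());       // 1
    }
}
```

Every tree-walk over DOM ends up needing a filter along these lines.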

This is one of those places where I think the infoset and various
specifications just got it wrong.  As a rule, they *don't* support
round-tripping: attribute order can change, encoding can change, and
all sorts of things aren't reported and so can't be preserved
properly, which means editing applications have to be written to use
non-standard APIs of some sort in order to preserve things people
think are important but the infoset doesn't guarantee to preserve.
The one exception is the ridiculously trivial case of whitespace.
That's because you really can't tell the difference between important
whitespace and unimportant whitespace without a schema (can't tell the
players without a scorecard), but the chosen solution leaves
data-oriented code constantly discarding cruft that it doesn't need
(and that nonetheless munches on memory).
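
The "not without a schema" point can be sketched with the JDK's DOM
parser.  Its setIgnoringElementContentWhitespace switch relies on the
content model, so as I understand it the whitespace is only dropped
when a DTD declares element-only content and the parser is validating.
The class IgnorableWhitespace and the inline DTD below are assumptions
made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the DTD is invented for illustration.
class IgnorableWhitespace {
    static final String BODY = "<earth>\n  <sky />\n</earth>";

    // Internal DTD subset declaring that earth contains only a sky element.
    static final String DTD =
        "<!DOCTYPE earth [\n"
        + "  <!ELEMENT earth (sky)>\n"
        + "  <!ELEMENT sky EMPTY>\n"
        + "]>\n";

    // Counts the child nodes of <earth>, asking the parser to drop
    // element-content whitespace, with or without the DTD present.
    static int childCount(boolean withDtd) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setIgnoringElementContentWhitespace(true);
            // Validate only when there is a DTD to validate against.
            f.setValidating(withDtd);
            String xml = withDtd ? DTD + BODY : BODY;
            return f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getChildNodes()
                .getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No content model: the parser cannot know the whitespace is cruft.
        System.out.println(childCount(false));
        // With a content model, the whitespace can be recognized and dropped.
        System.out.println(childCount(true));
    }
}
```

Without the DTD the parser has no way to know the whitespace is
unimportant, so the request to ignore it has nothing to act on.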

Amy!
-- 
Amelia A. Lewis          alicorn at mindspring.com          amyzing at talsever.com
I stopped by the bar at 3 a.m. to seek solace in a bottle, or possibly a
friend.  I woke up with a headache like my head against a board, twice as
cloudy as I'd been the night before.  I went in seeking clarity.
		-- Indigo Girls


