[jdom-interest] UNDECLARED_ATTRIBUTE
Elliotte Rusty Harold
elharo at metalab.unc.edu
Fri Apr 19 04:50:08 PDT 2002
At 10:22 AM +0200 4/19/02, Laurent Bihanic wrote:
>Hi, Elliotte,
>
>I do not agree with you: the sentence you are referring to is part of
>the "Attribute-Value Normalization" section. Thus (IMHO) this
>sentence only applies to the way parsers should normalize the value
>of an undeclared attribute, not the way they report the attribute type
>to the application.
>
After looking carefully at the XML spec, I now agree with you. There
is such a thing as an attribute that has no declared type, and it
thus makes sense for JDOM to have an UNDECLARED_ATTRIBUTE pseudo-type.
However, I'm still bothered that SAX doesn't agree. See
http://www.saxproject.org/apidoc/org/xml/sax/Attributes.html#getType(int)
which states:

    The attribute type is one of the strings "CDATA", "ID", "IDREF",
    "IDREFS", "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION"
    (always in upper case).

    If the parser has not read a declaration for the attribute, or if the
    parser does not report attribute types, then it must return the value
    "CDATA" as stated in the XML 1.0 Recommendation (clause 3.3.3,
    "Attribute-Value Normalization").

Thus a conforming SAX parser will never tell JDOM that an attribute is
undeclared; it reports the type as "CDATA" instead.
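
To see what a given parser actually hands over, something like the
following rough sketch can help. It assumes the JAXP default parser is
on the classpath; the class name UndeclaredTypeDemo and the helper
attributeTypes() are made up for illustration and are not part of JDOM
or SAX. A parser that follows the SAX2 contract quoted above should
list the undeclared "extra" attribute with type CDATA.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class UndeclaredTypeDemo {

    // Hypothetical helper: collects "qName=type" for every attribute
    // the SAX parser reports while parsing the given document.
    static List<String> attributeTypes(String xml) throws Exception {
        final List<String> types = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    types.add(atts.getQName(i) + "=" + atts.getType(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return types;
    }

    public static void main(String[] args) throws Exception {
        // "id" is declared in the internal DTD subset; "extra" is not.
        String xml = "<!DOCTYPE root [<!ELEMENT root EMPTY>"
                + "<!ATTLIST root id ID #IMPLIED>]>"
                + "<root id='a1' extra='x'/>";
        System.out.println(attributeTypes(xml));
    }
}
```
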
>Also, at the time I added the attribute type support, I did not want
>to use CDATA as default because of the problems I encountered with
>enumerated types. Have a look at SAXHandler's getAttributeType:
>Without the current hack, using CDATA as the default may cause some
>ENUMERATED attributes to be reported as CDATA if someone uses a
>parser as weird as Xerces!
>
That's certainly messy. The real issue here seems to be that not all
parsers comply with the SAX2 specification with respect to attribute
types. I'm not sure that's relevant here, however. Consider this code
from the private getAttributeType() method:

    private int getAttributeType(String typeName) {
        Integer type = (Integer)(attrNameToTypeMap.get(typeName));
        if (type == null) {
            if (typeName != null && typeName.length() > 0 &&
                typeName.charAt(0) == '(') {
                // Xerces 1.4.X reports attributes of enumerated type with
                // a type string equal to the enumeration definition, i.e.
                // starting with a parenthesis.
                return Attribute.ENUMERATED_ATTRIBUTE;
            }
            else {
                return Attribute.UNDECLARED_ATTRIBUTE;
            }
        } else {
            return type.intValue();
        }
    }

You're not actually returning UNDECLARED_ATTRIBUTE for an undeclared
attribute because SAX will not specify null as an attribute type.
What this does is return UNDECLARED_ATTRIBUTE in the event that an
unknown, non-standard attribute type is encountered. A better name
here would be "NONSTANDARD_ATTRIBUTE" or something like that.
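
As a minimal sketch of what I mean, assuming a standalone class rather
than the real SAXHandler: the constants below are hypothetical
stand-ins for JDOM's Attribute type constants (their values are
arbitrary), NONSTANDARD_ATTRIBUTE is only the name suggested above and
does not exist in JDOM, and the map covers just a few of the standard
type names.

```java
import java.util.HashMap;
import java.util.Map;

public class AttributeTypeSketch {

    // Hypothetical stand-ins for JDOM's Attribute.*_ATTRIBUTE constants.
    static final int NONSTANDARD_ATTRIBUTE = 0;
    static final int CDATA_ATTRIBUTE = 1;
    static final int ID_ATTRIBUTE = 2;
    static final int NMTOKEN_ATTRIBUTE = 3;
    static final int ENUMERATED_ATTRIBUTE = 4;

    // Maps a subset of the SAX type strings to the constants above.
    private static final Map<String, Integer> TYPES = new HashMap<>();
    static {
        TYPES.put("CDATA", CDATA_ATTRIBUTE);
        TYPES.put("ID", ID_ATTRIBUTE);
        TYPES.put("NMTOKEN", NMTOKEN_ATTRIBUTE);
    }

    static int getAttributeType(String typeName) {
        Integer type = TYPES.get(typeName);
        if (type != null) {
            return type;
        }
        // Xerces 1.4.x style: the type string is the enumeration itself.
        if (typeName != null && !typeName.isEmpty()
                && typeName.charAt(0) == '(') {
            return ENUMERATED_ATTRIBUTE;
        }
        // Anything else is a type string SAX does not define, which is
        // a different situation from an attribute with no declaration.
        return NONSTANDARD_ATTRIBUTE;
    }
}
```

With this naming, an undeclared attribute that arrives from SAX as
"CDATA" maps to CDATA_ATTRIBUTE, and only an unrecognized type string
falls through to NONSTANDARD_ATTRIBUTE.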
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.cafeconleche.org/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
+----------------------------------+---------------------------------+