[jdom-interest] DOCTYPE still giving me the worst headache!

Elliotte Rusty Harold elharo at metalab.unc.edu
Sat Feb 2 19:05:06 PST 2002


>Easiest is to do a "cvs diff -du" or if you still have the originals and
>don't have CVS you can do just "diff -u".
>
>>  I notice we aren't currently doing any sanity checks on system or
>>  public IDs. This should probably be added to the Verifier class, and
>>  rolled into both DocType and EntityRef. I haven't looked at EntityRef
>>  yet but I bet it's got a lot of the same issues with public and
>>  system IDs as DocType.
>
>Sounds good.
>
>-jh-

Here are the diffs for the changes. They don't seem to break anything 
that wasn't broken before. Mostly this adds verification of the 
system and public IDs in both DocType and EntityRef, the root element 
name in DocType, and the entity name in EntityRef. Failing any of 
these checks throws a runtime exception (IllegalDataException or 
IllegalNameException). Since both are unchecked, no new catch or 
throws clauses are needed, so this shouldn't break anybody's code.

It also allows null to be used for either system or public IDs in 
DocType to differentiate a missing ID from the empty string. This 
necessitated some small changes in some input and output classes.
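
To illustrate the null-versus-empty distinction, here is a small 
standalone sketch. It is not the JDOM code: the class DocTypeSketch and 
its format() method are hypothetical, and the quoting of the system ID 
is my assumption about the rest of the outputter. It only mirrors the 
"!= null" tests in the XMLOutputter hunk below.

```java
// Hypothetical illustration only; DocTypeSketch is not part of JDOM.
final class DocTypeSketch {

    // Formats a DOCTYPE declaration. null means "no such ID at all",
    // while "" means "an ID is present but happens to be empty".
    static String format(String elementName, String publicID, String systemID) {
        StringBuilder out = new StringBuilder("<!DOCTYPE ");
        out.append(elementName);
        boolean hasPublic = false;
        if (publicID != null) {
            out.append(" PUBLIC \"").append(publicID).append('"');
            hasPublic = true;
        }
        if (systemID != null) {
            if (!hasPublic) {
                out.append(" SYSTEM");
            }
            // Assumption: always quote with double quotes in this sketch.
            out.append(" \"").append(systemID).append('"');
        }
        return out.append('>').toString();
    }

    public static void main(String[] args) {
        // No IDs at all: <!DOCTYPE html>
        System.out.println(format("html", null, null));
        // Empty (but present) system ID: <!DOCTYPE html SYSTEM "">
        System.out.println(format("html", null, ""));
    }
}
```

With the old empty-string convention the second case could not be 
told apart from the first.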

I added one new constructor to EntityRef to allow it to have a system 
ID without a public ID.

I also added a setElementName() method in DocType.

I added checkSystemLiteral() and checkPublicID() methods in Verifier.
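
As a rough standalone sketch of what these two checks test (the class 
name IdChecks is hypothetical, and this is a simplification rather than 
the code in the diff): the public ID check follows production [13] 
PubidChar of XML 1.0, and the system literal check only rejects strings 
that mix both kinds of quotes. The version in the diff additionally 
accepts a tab in public IDs and runs checkCharacterData() on system 
literals, which this sketch leaves out.

```java
// Hypothetical illustration only; IdChecks is not part of JDOM.
final class IdChecks {

    // Punctuation allowed by XML 1.0 production [13] PubidChar.
    private static final String PUBID_PUNCTUATION = "-'()+,./:=?;!*#@$_%";

    static boolean isPubidChar(char c) {
        if (c == ' ' || c == '\r' || c == '\n') return true;
        if (c >= 'a' && c <= 'z') return true;
        if (c >= 'A' && c <= 'Z') return true;
        if (c >= '0' && c <= '9') return true;
        return PUBID_PUNCTUATION.indexOf(c) != -1;
    }

    // Returns null if the public ID is acceptable, otherwise a reason.
    static String checkPublicID(String publicID) {
        if (publicID == null) return null; // no public ID at all
        for (int i = 0; i < publicID.length(); i++) {
            char c = publicID.charAt(i);
            if (!isPubidChar(c)) {
                return c + " is not a legal character in public IDs";
            }
        }
        return null;
    }

    // Returns null if the system literal is acceptable, otherwise a reason.
    static String checkSystemLiteral(String systemLiteral) {
        if (systemLiteral == null) return null; // no system ID at all
        if (systemLiteral.indexOf('\'') != -1
                && systemLiteral.indexOf('"') != -1) {
            // Such a string cannot be quoted with either delimiter.
            return "System literals cannot simultaneously contain both "
                 + "single and double quotes.";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(checkPublicID("-//W3C//DTD XHTML 1.0 Strict//EN"));
        System.out.println(checkPublicID("bad<id"));
        System.out.println(checkSystemLiteral("it's a \"dtd\""));
    }
}
```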

I made checkXMLName() in Verifier public so EntityRef and DocType 
could use it. (Their names may contain colons, which 
checkElementName() forbids.) I don't think this will cause a problem, 
but if it does, the method could be made package-private instead.
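
The colon difference can be shown with a deliberately simplified 
sketch. The class NameChecks below is hypothetical and handles ASCII 
only (the real XML name rules cover far more of Unicode); it just shows 
why a qualified name such as xhtml:html passes a general XML name check 
but fails one that forbids colons.

```java
// Hypothetical, ASCII-only illustration; NameChecks is not part of JDOM.
final class NameChecks {

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isNameStart(char c) {
        return isAsciiLetter(c) || c == '_' || c == ':';
    }

    private static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
    }

    // General XML name check: colons are allowed.
    static String checkXMLName(String name) {
        if (name == null || name.trim().length() == 0) {
            return "XML names cannot be null or empty";
        }
        if (!isNameStart(name.charAt(0))) {
            return "XML names cannot begin with the character \""
                 + name.charAt(0) + "\"";
        }
        for (int i = 1; i < name.length(); i++) {
            if (!isNameChar(name.charAt(i))) {
                return "XML names cannot contain the character \""
                     + name.charAt(i) + "\"";
            }
        }
        return null;
    }

    // Stricter check: a legal XML name that also has no colon.
    static String checkElementName(String name) {
        String reason = checkXMLName(name);
        if (reason != null) return reason;
        if (name.indexOf(':') != -1) {
            return "Element names cannot contain colons";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(checkXMLName("xhtml:html"));     // accepted
        System.out.println(checkElementName("xhtml:html")); // rejected
    }
}
```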

Finally I fixed a few small errors I noticed in the doc comments here 
and there.

? build
Index: src/java/org/jdom/DocType.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/DocType.java,v
retrieving revision 1.17
diff -d -u -r1.17 DocType.java
--- src/java/org/jdom/DocType.java	2002/01/08 09:17:10	1.17
+++ src/java/org/jdom/DocType.java	2002/02/03 11:03:43
@@ -113,11 +113,15 @@
       *        referenced DTD
       * @param systemID <code>String</code> system ID of
       *        referenced DTD
+     * @throws IllegalDataException if the given system ID is not a legal
+     *         system literal or the public ID is not a legal public ID.
+     * @throws IllegalNameException if the given root element name is not a legal
+     *         XML element name.
       */
      public DocType(String elementName, String publicID, String systemID) {
-        this.elementName = elementName;
-        this.publicID = publicID;
-        this.systemID = systemID;
+        setElementName(elementName);
+        setPublicID(publicID);
+        setSystemID(systemID);
      }

      /**
@@ -131,9 +135,13 @@
       *        element being constrained.
       * @param systemID <code>String</code> system ID of
       *        referenced DTD
+     * @throws IllegalDataException if the given system ID is not a legal
+     *         system literal.
+     * @throws IllegalNameException if the given root element name is not a legal
+     *         XML element name.
       */
      public DocType(String elementName, String systemID) {
-        this(elementName, "", systemID);
+        this(elementName, null, systemID);
      }

      /**
@@ -144,9 +152,11 @@
       *
       * @param elementName <code>String</code> name of
       *        element being constrained.
+     * @throws IllegalNameException if the given root element name is not a legal
+     *         XML element name.
       */
      public DocType(String elementName) {
-        this(elementName, "", "");
+        this(elementName, null, null);
      }

      /**
@@ -163,6 +173,29 @@

      /**
       * <p>
+     * This will set the root element name declared by this
+     *  DOCTYPE declaration.
+     * </p>
+     *
+     * @return DocType <code>DocType</code> this DocType object
+     * @param elementName <code>String</code> name of
+     *        root element being constrained.
+     * @throws IllegalNameException if the given root element name is not a legal
+     *         XML element name.
+     */
+    public DocType setElementName(String elementName) {
+        // This can contain a colon so we use checkXMLName()
+        // instead of checkElementName()
+        String reason = Verifier.checkXMLName(elementName);
+        if (reason != null) {
+            throw new IllegalNameException(elementName, "DocType", reason);
+        }
+        this.elementName = elementName;
+        return this;
+    }
+
+    /**
+     * <p>
       * This will retrieve the public ID of an externally
       *   referenced DTD, or an empty <code>String</code> if
       *   none is referenced.
@@ -180,12 +213,17 @@
       *   referenced DTD.
       * </p>
       *
-     * @return publicID <code>String</code> public ID of
-     *                  referenced DTD.
+     * @return DocType <code>DocType</code> this DocType object
+     * @throws IllegalDataException if the given public ID is not a legal
+     *         public ID.
       */
      public DocType setPublicID(String publicID) {
+        String reason = Verifier.checkPublicID(publicID);
+        if (reason != null) {
+            throw new IllegalDataException(publicID, "DocType", reason);
+        }
          this.publicID = publicID;
-
+
          return this;
      }

@@ -210,8 +248,14 @@
       *
       * @return systemID <code>String</code> system ID of
       *                  referenced DTD.
+     * @throws IllegalDataException if the given system ID is not a legal
+     *         system literal.
       */
      public DocType setSystemID(String systemID) {
+        String reason = Verifier.checkSystemLiteral(systemID);
+        if (reason != null) {
+            throw new IllegalDataException(systemID, "DocType", reason);
+        }
          this.systemID = systemID;

          return this;
@@ -235,7 +279,7 @@
       * This sets the <code>{@link Document}</code> holding this doctype.
       * </p>
       *
-     * @param document <code>Document</code> holding this doctype
+     * @param DocType <code>Document</code> holding this doctype
       * @return <code>Document</code> this <code>DocType</code> modified
       */
      protected DocType setDocument(Document document) {
Index: src/java/org/jdom/Element.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/Element.java,v
retrieving revision 1.106
diff -d -u -r1.106 Element.java
--- src/java/org/jdom/Element.java	2002/01/25 18:42:52	1.106
+++ src/java/org/jdom/Element.java	2002/02/03 11:03:43
@@ -215,7 +215,9 @@
       *         Element name.
       */
      public Element setName(String name) {
-        String reason = Verifier.checkElementName(name);
+        // Use checkXMLName() instead of checkElementName() here
+        // because we do need to allow this to contain a colon
+        String reason = Verifier.checkXMLName(name);
          if (reason != null) {
              throw new IllegalNameException(name, "element", reason);
          }
Index: src/java/org/jdom/EntityRef.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/EntityRef.java,v
retrieving revision 1.5
diff -d -u -r1.5 EntityRef.java
--- src/java/org/jdom/EntityRef.java	2002/01/08 09:17:10	1.5
+++ src/java/org/jdom/EntityRef.java	2002/02/03 11:03:43
@@ -59,7 +59,7 @@
  import java.io.Serializable;

  /**
- * <p><code>EntityRef</code> Defines an XML entity in Java.</p>
+ * <p><code>EntityRef</code> Defines an XML entity reference in Java.</p>
   *
   * @author Brett McLaughlin
   * @author Jason Hunter
@@ -101,24 +101,47 @@
       * </p>
       *
       * @param name <code>String</code> name of element.
+     * @throws IllegalNameException if the given name is not a legal
+     *         XML name.
       */
      public EntityRef(String name) {
-        this.name = name;
+        this(name, null, null);

      }

      /**
       * <p>
       * This will create a new <code>EntityRef</code>
+     *   with the supplied name and system id.
+     * </p>
+     *
+     * @param name <code>String</code> name of element.
+     * @throws IllegalNameException if the given name is not a legal
+     *         XML name.
+     * @throws IllegalDataException if the given system ID is not a legal
+     *         system literal.
+     */
+    public EntityRef(String name, String systemID) {
+        this(name, null, systemID);
+    }
+
+    /**
+     * <p>
+     * This will create a new <code>EntityRef</code>
       *   with the supplied name, public id, and system id.
       * </p>
       *
       * @param name <code>String</code> name of element.
+     * @throws IllegalDataException if the given system ID is not a legal
+     *         system literal or the given public ID is not a
+     *         legal public ID
+     * @throws IllegalNameException if the given name is not a legal
+     *         XML name.
       */
      public EntityRef(String name, String publicID, String systemID) {
-        this.name = name;
-        this.publicID = publicID;
-        this.systemID = systemID;
+        setName(name);
+        setPublicID(publicID);
+        setSystemID(systemID);
      }

      /**
@@ -232,7 +255,7 @@
       *
       * @return public ID of this <code>EntityRef</code>
       */
-    public java.lang.String getPublicID() {
+    public String getPublicID() {
          return publicID;
      }

@@ -244,7 +267,7 @@
       *
       * @return system ID of this <code>EntityRef</code>
       */
-    public java.lang.String getSystemID() {
+    public String getSystemID() {
          return systemID;
      }

@@ -279,8 +302,16 @@
       *
       * @param name new name of the entity
       * @return this <code>EntityRef</code> modified.
+     * @throws IllegalNameException if the given name is not a legal
+     *         XML name.
       */
      public EntityRef setName(String name) {
+        // This can contain a colon so we use checkXMLName()
+        // instead of checkElementName()
+        String reason = Verifier.checkXMLName(name);
+        if (reason != null) {
+            throw new IllegalNameException(name, "EntityRef", reason);
+        }
          this.name = name;
          return this;
      }
@@ -292,8 +323,14 @@
       *
       * @param newPublicID new public id
       * @return this <code>EntityRef</code> modified.
+     * @throws IllegalDataException if the given public ID is not a legal
+     *         public ID.
       */
      public EntityRef setPublicID(String newPublicID) {
+        String reason = Verifier.checkPublicID(newPublicID);
+        if (reason != null) {
+            throw new IllegalDataException(newPublicID, "EntityRef", reason);
+        }
          this.publicID = newPublicID;
          return this;
      }
@@ -304,9 +341,15 @@
       * </p>
       *
       * @param newSystemID new system id
+     * @throws IllegalDataException if the given system ID is not a legal
+     *         system literal.
       * @return this <code>EntityRef</code> modified.
       */
      public EntityRef setSystemID(String newSystemID) {
+        String reason = Verifier.checkSystemLiteral(newSystemID);
+        if (reason != null) {
+            throw new IllegalDataException(newSystemID, "EntityRef", reason);
+        }
          this.systemID = newSystemID;
          return this;
      }
Index: src/java/org/jdom/Verifier.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/Verifier.java,v
retrieving revision 1.27
diff -d -u -r1.27 Verifier.java
--- src/java/org/jdom/Verifier.java	2002/01/25 18:42:52	1.27
+++ src/java/org/jdom/Verifier.java	2002/02/03 11:03:44
@@ -485,7 +485,7 @@
       *
       * @param data <code>String</code> data to check.
       * @return <code>String</code> - reason data is invalid, or
-     *         <code>null</code> is name is OK.
+     *         <code>null</code> if data is OK.
       */
      public static final String checkCommentData(String data) {
          String reason = null;
@@ -500,7 +500,85 @@
          // If we got here, everything is OK
          return null;
      }
+
+    // [13] PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
+    private static boolean isXMLPublicIDCharacter(char c) {
+
+        if (c >= 'a' && c <= 'z') return true;
+        if (c >= '?' && c <= 'Z') return true;
+        if (c >= '\'' && c <= ';') return true;
+
+        if (c == ' ') return true;
+        if (c == '!') return true;
+        if (c == '=') return true;
+        if (c == '#') return true;
+        if (c == '$') return true;
+        if (c == '_') return true;
+        if (c == '%') return true;
+        if (c == '\n') return true;
+        if (c == '\r') return true;
+        if (c == '\t') return true;
+
+        return false;
+    }
+
+    /**
+     * <p>
+     *  This will ensure that the data for a public identifier
+     *  is appropriate.
+     * </p>
+     *
+     * @param publicID <code>String</code> public ID to check.
+     * @return <code>String</code> - reason public ID is invalid, or
+     *         <code>null</code> if public ID is OK.
+     */
+    public static final String checkPublicID(String publicID) {
+        String reason = null;
+
+        if (publicID == null) return null;
+        // This indicates there is no public ID
+
+        for (int i = 0; i < publicID.length(); i++) {
+          char c = publicID.charAt(i);
+          if (!isXMLPublicIDCharacter(c)) {
+            reason = c + " is not a legal character in public IDs";
+            break;
+          }
+        }
+
+        return reason;
+    }
+
+    /**
+     * <p>
+     *  This will ensure that the data for a system literal
+     *  is appropriate.
+     * </p>
+     *
+     * @param systemLiteral <code>String</code> system literal to check.
+     * @return <code>String</code> - reason system literal is invalid, or
+     *         <code>null</code> if system literal is OK.
+     */
+    public static final String checkSystemLiteral(String systemLiteral) {
+        String reason = null;
+
+        if (systemLiteral == null) return null;
+        // This indicates there is no system ID
+
+        if (systemLiteral.indexOf('\'') != -1
+          && systemLiteral.indexOf('"') != -1) {
+            reason =
+             "System literals cannot simultaneously contain both single and double quotes.";
+        }
+        else {
+          reason = checkCharacterData(systemLiteral);
+        }
+
+        return reason;
+    }
+

+
      /**
       * <p>
       *  This is a utility function for sharing the base process of checking
@@ -511,7 +589,7 @@
       * @return <code>String</code> - reason the name is invalid, or
       *         <code>null</code> if OK.
       */
-    private static String checkXMLName(String name) {
+    public static String checkXMLName(String name) {
          // Cannot be empty or null
          if ((name == null) || (name.length() == 0)
                             || (name.trim().equals(""))) {
Index: src/java/org/jdom/input/DOMBuilder.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/input/DOMBuilder.java,v
retrieving revision 1.39
diff -d -u -r1.39 DOMBuilder.java
--- src/java/org/jdom/input/DOMBuilder.java	2002/01/08 09:17:10	1.39
+++ src/java/org/jdom/input/DOMBuilder.java	2002/02/03 11:03:45
@@ -518,12 +518,8 @@
                  String systemID = domDocType.getSystemId();

                  DocType docType = factory.docType(domDocType.getName());
-                if ((publicID != null) && (!publicID.equals(""))) {
-                    docType.setPublicID(publicID);
-                }
-                if ((systemID != null) && (!systemID.equals(""))) {
-                    docType.setSystemID(systemID);
-                }
+                docType.setPublicID(publicID);
+                docType.setSystemID(systemID);

                  doc.setDocType(docType);
                  break;
Index: src/java/org/jdom/input/SAXHandler.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/input/SAXHandler.java,v
retrieving revision 1.32
diff -d -u -r1.32 SAXHandler.java
--- src/java/org/jdom/input/SAXHandler.java	2002/01/29 05:21:11	1.32
+++ src/java/org/jdom/input/SAXHandler.java	2002/02/03 11:03:45
@@ -788,7 +788,7 @@
       *
       * @param name <code>String</code> name of element listed in DTD
       * @param publicId <code>String</code> public ID of DTD
-     * @param systemId <code>String</code> syste ID of DTD
+     * @param systemId <code>String</code> system ID of DTD
       */
      public void startDTD(String name, String publicId, String systemId)
          throws SAXException {
Index: src/java/org/jdom/output/XMLOutputter.java
===================================================================
RCS file: /home/cvspublic/jdom/src/java/org/jdom/output/XMLOutputter.java,v
retrieving revision 1.72
diff -d -u -r1.72 XMLOutputter.java
--- src/java/org/jdom/output/XMLOutputter.java	2002/01/30 03:32:11	1.72
+++ src/java/org/jdom/output/XMLOutputter.java	2002/02/03 11:03:46
@@ -1179,13 +1179,13 @@

          out.write("<!DOCTYPE ");
          out.write(docType.getElementName());
-        if ((publicID != null) && (!publicID.equals(""))) {
+        if (publicID != null) {
              out.write(" PUBLIC \"");
              out.write(publicID);
              out.write("\"");
              hasPublic = true;
          }
-        if ((systemID != null) && (!systemID.equals(""))) {
+        if (systemID != null) {
              if (!hasPublic) {
                  out.write(" SYSTEM");
              }
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
+----------------------------------+---------------------------------+


