Possible issue: XML DOCTYPE declaration -- should the PUBLIC identifier be a URN?

From: Thompson, Bryan B. <BRYAN.B.THOMPSON@saic.com>
Date: Wed, 8 Oct 2003 10:16:39 -0400
Message-Id: <D24D16A6707B0A4B9EF084299CE99B390674D38B@mcl-its-exs02.mail.saic.com>
To: www-tag@w3.org
Cc: "Bebee, Bradley R." <bebeeb@US-McLean.mail.saic.com>, Guy.A.Lukes@frb.gov

I just noticed that the XML 1.0 Recommendation (Second edition) does not
state a requirement that the public identifier in a DOCTYPE declaration must
be a URN, e.g.:

-- snip from http://www.w3.org/TR/REC-xml#NT-ExternalID --

[Definition: In addition to a system identifier, an external identifier may
include a public identifier.] An XML processor attempting to retrieve the
entity's content may use the public identifier to try to generate an
alternative URI reference. If the processor is unable to do so, it must use
the URI reference specified in the system literal. Before a match is
attempted, all strings of white space in the public identifier must be
normalized to single space characters (#x20), and leading and trailing white
space must be removed.

-- end snip --
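
A minimal sketch of the behaviour the quoted passage describes, assuming the
processor consults a simple in-memory mapping from public identifiers to
alternative URI references. The names normalize_public_id, resolve_external_id
and CATALOG, as well as the catalog entry itself, are hypothetical and are not
defined by the Recommendation:

```python
import re

# XML 1.0 white space characters: space, tab, carriage return, line feed.
_XML_WHITE_SPACE = re.compile(r"[\x20\x09\x0D\x0A]+")


def normalize_public_id(public_id):
    """Collapse white space runs to one #x20 and trim both ends."""
    return _XML_WHITE_SPACE.sub(" ", public_id).strip(" ")


# Hypothetical catalog; the target URI is a made-up local path.
CATALOG = {
    "-//W3C//DTD SVG 1.0//EN": "file:///hypothetical/dtds/svg10.dtd",
}


def resolve_external_id(public_id, system_id, catalog=CATALOG):
    """Return an alternative URI for the public identifier if one is known,
    otherwise fall back to the system literal."""
    if public_id is not None:
        alternative = catalog.get(normalize_public_id(public_id))
        if alternative is not None:
            return alternative
    return system_id
```

Note that the lookup is a plain string match on the normalized identifier;
nothing in it treats the public identifier as a URI of any kind.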

Further, based on RFC 2396, a URN is a URI whose scheme is "urn", but the
public identifier is often written without any scheme, e.g.:

   "-//W3C//DTD SVG 1.0//EN"

So it is pretty clear that the public identifier is not a URN per RFC2396.
At the same time, it is being used "as if" it were a uniform resource name,
and the XML processor is encouraged to identify a URI that may be used to
address the XML grammar declared by the DOCTYPE declaration.
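
To make the point concrete, here is a rough check, assuming the RFC 2396
scheme grammar (a letter followed by letters, digits, "+", "-" or "."). The
helper names uri_scheme and is_urn are made up for illustration, and the check
looks only at the scheme, not at the rest of the URN syntax:

```python
import re

# RFC 2396 scheme: alpha *( alpha | digit | "+" | "-" | "." ), then ":".
_SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+.\-]*):")


def uri_scheme(identifier):
    """Return the lowercased scheme, or None if the string has no scheme."""
    match = _SCHEME.match(identifier)
    return match.group(1).lower() if match else None


def is_urn(identifier):
    """True only when the identifier carries the "urn" scheme."""
    return uri_scheme(identifier) == "urn"
```

The SVG public identifier above starts with "-", so it has no scheme at all
and is_urn rejects it.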

It seems to me that there should be some clarification here.  Apologies if
this has been covered elsewhere already.



Bryan Thompson
Systems Architect
Hicks & Associates, Inc.
3811 N. Fairfax Drive, Suite 850
Arlington, VA 22203
+1 703-469-3409 (o)
+1 202-285-5099 (c)
Received on Wednesday, 8 October 2003 10:16:50 UTC
