RE: Namespaces 1.1 Last Call -- I18N WG comments from Richard Tobin on 2002-11-29 (xml-names-editor@w3.org from November 2002)

From: Richard Tobin <richard@cogsci.ed.ac.uk>
Date: Fri, 29 Nov 2002 16:04:29 GMT
To: Misha.Wolf@reuters.com
Cc: w3c-i18n-ig@w3.org, xml-names-editor@w3.org
Message-Id: <200211291604.QAA14885@mcpherson.cogsci.ed.ac.uk>

Here are the sections of the spec (2.2, 9, appendix B) relevant to
your comments.

I hope the non-ASCII characters don't get messed up in the mail!

-- Richard

2.2 Use of IRIs as Namespace Names

[Definition: IRI references which identify namespaces are considered
identical if and only if they are exactly the same
character-for-character.] This is described in greater detail, with
examples, in B Comparing IRI References.

The empty string, though it is a legal IRI reference, cannot be used
as a namespace name.

The use of relative IRI references, including same-document
references, in namespace declarations is deprecated. Future W3C
specifications will define no interpretation for them.
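
As a rough illustration of the two restrictions above, a checker might
look like the sketch below. The helper name check_namespace_name is
hypothetical, and treating "starts with a scheme and a colon" as the
test for a non-relative reference is an assumption made here, not
something the spec text spells out.

```python
import re
import warnings

# Assumed scheme syntax: a letter, then letters, digits, "+", "-" or ".",
# followed by ":". Anything without such a prefix is treated as relative.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def check_namespace_name(name: str) -> str:
    """Hypothetical helper: reject "", warn on relative IRI references."""
    if name == "":
        raise ValueError("the empty string cannot be used as a namespace name")
    if not _SCHEME.match(name):
        warnings.warn(
            "relative IRI references are deprecated as namespace names",
            DeprecationWarning,
            stacklevel=2,
        )
    return name
```

Same-document references such as "#frag" fall into the deprecated case
here because they carry no scheme.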


9 Internationalized Resource Identifiers (IRIs)

Work is currently in progress to produce an RFC defining
Internationalized Resource Identifiers (IRIs). Since this work is not
yet complete, in this section we give a syntactic definition of IRIs
for the purposes of this specification. We expect to issue an erratum
replacing this section with a reference to the RFC when it is
published. Users defining namespaces are advised to restrict namespace
names to URIs until software supporting IRIs is in common use.

For a more general definition and discussion of IRIs see [IRI draft]
(work in progress).

URI references are restricted to a subset of the ASCII characters; IRI
references allow some of the disallowed ASCII characters as well as
most Unicode characters from #xA0 onwards.

[Definition: The additional characters allowed in IRIs are: ]

   space #x20
   the delimiters < #x3C, > #x3E and " #x22
   the unwise characters { #x7B, } #x7D, | #x7C, \ #x5C, ^ #x5E and ` #x60
   the Unicode plane 0 characters #xA0-#xD7FF, #xF900-#xFDCF, #xFDF0-#xFFEF
   the Unicode plane 1-14 characters #x10000-#x1FFFD ... #xE0000-#xEFFFD

[Definition: An IRI reference is a string that can be converted to a
URI reference by escaping all additional characters as follows: ]

   1. Each additional character is converted to UTF-8 [Unicode 3.2] as
   one or more bytes.

   2. The resulting bytes are escaped with the URI escaping mechanism
   (that is, converted to %HH, where HH is the hexadecimal notation of
   the byte value). 

   3. The original character is replaced by the resulting character
   sequence.
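
The three steps above can be sketched in a few lines. This is only an
illustration: the function names are invented for the example, the
elided plane 1-14 ranges are assumed to follow the pattern
#xN0000-#xNFFFD for N from 1 to E, and upper-case hex digits are an
arbitrary choice since the text does not fix the case.

```python
def is_additional_char(ch: str) -> bool:
    """True if ch is one of the 'additional characters' listed above."""
    cp = ord(ch)
    # Space, the delimiters, and the unwise characters.
    if ch in ' <>"{}|\\^`':
        return True
    # Unicode plane 0 ranges.
    if 0xA0 <= cp <= 0xD7FF or 0xF900 <= cp <= 0xFDCF or 0xFDF0 <= cp <= 0xFFEF:
        return True
    # Planes 1-14 (assumed pattern: the first 0xFFFE code points of each plane).
    plane = cp >> 16
    return 1 <= plane <= 14 and (cp & 0xFFFF) <= 0xFFFD


def iri_to_uri(iri: str) -> str:
    """Escape every additional character as %HH of its UTF-8 bytes."""
    out = []
    for ch in iri:
        if is_additional_char(ch):
            # Step 1: UTF-8 bytes; step 2: %HH per byte; step 3: substitute.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)
```

Characters that are already legal in a URI reference, including any
existing %HH escapes, pass through unchanged.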


Appendix B Comparing IRI References

IRI references identifying namespaces are compared when determining
whether a name belongs to a given namespace, and whether two names
belong to the same namespace. The two IRIs are treated as strings, and
they are identical if the strings are identical, that is, if they are
the same sequence of characters. The comparison is case-sensitive, and
no %-escaping is done or undone.

A consequence of this is that IRI references which are not identical
in this sense may resolve to the same resource. Examples include IRI
references which differ only in case or %-escaping, or which are in
external entities which have different base URIs (but note that
relative IRIs are deprecated as namespace names).

In a namespace declaration, the IRI reference is the normalized value
of the attribute, so replacement of XML character and entity
references has already been done before any comparison.

Examples:

The IRI references below are different for the purposes of identifying
namespaces, since they differ in case:

  http://www.example.org/wine
  http://www.example.org/Wine

The IRI references below are also all different for the purposes of
identifying namespaces:

  http://www.example.org/rosé
  http://www.example.org/ros%c3%a9
  http://www.example.org/ros%c3%A9
  http://www.example.org/ros%C3%a9
  http://www.example.org/ros%C3%A9
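
A quick way to see this is to compare the strings directly. The sketch
below assumes nothing beyond plain string equality; the name
same_namespace is made up for the example, and the call to
urllib.parse.unquote is there only to show the kind of unescaping that
the comparison does not perform.

```python
from urllib.parse import unquote


def same_namespace(a: str, b: str) -> bool:
    """Namespace names match only if identical character for character."""
    return a == b


names = [
    "http://www.example.org/ros\u00e9",   # literal e-acute, U+00E9
    "http://www.example.org/ros%c3%a9",
    "http://www.example.org/ros%c3%A9",
    "http://www.example.org/ros%C3%a9",
    "http://www.example.org/ros%C3%A9",
]

# All five are distinct namespace names.
distinct = len(set(names)) == len(names)

# Undoing the %-escaping would collapse them, which is why it is not done.
collapsed = {unquote(n) for n in names}
```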

If the entity eacute has been defined to be é, the start tags below
all contain namespace declarations binding the prefix p to the same
IRI reference, http://example.org/rosé.

  <p:foo xmlns:p="http://example.org/rosé">
  <p:foo xmlns:p="http://example.org/ros&#xe9;">
  <p:foo xmlns:p="http://example.org/ros&#xE9;">
  <p:foo xmlns:p="http://example.org/ros&#233;">
  <p:foo xmlns:p="http://example.org/ros&eacute;">
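
One way to check this is to feed the five start tags to an XML parser
and look at the namespace name it reports for each element. The sketch
below uses Python's xml.etree.ElementTree; the wrapper element root and
the internal DTD subset declaring eacute are assumptions added so that
the fragment parses as a complete document.

```python
import xml.etree.ElementTree as ET

# \u00e9 is e-acute; the DOCTYPE declares the eacute entity.
DOC = """<!DOCTYPE root [<!ENTITY eacute "&#xE9;">]>
<root>
  <p:foo xmlns:p="http://example.org/ros\u00e9"/>
  <p:foo xmlns:p="http://example.org/ros&#xe9;"/>
  <p:foo xmlns:p="http://example.org/ros&#xE9;"/>
  <p:foo xmlns:p="http://example.org/ros&#233;"/>
  <p:foo xmlns:p="http://example.org/ros&eacute;"/>
</root>"""

root = ET.fromstring(DOC)

# ElementTree reports qualified names as "{namespace-name}local-name".
namespaces = {child.tag[1:].split("}")[0] for child in root}
```

After attribute-value normalization all five declarations yield the
same string, so the set holds a single namespace name.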
Received on Friday, 29 November 2002 11:04:33 UTC