Re: XLink 1.1: Charmod conformance from Richard Tobin on 2006-01-25 (www-xml-linking-comments@w3.org from January to March 2006)

From: Richard Tobin <richard@inf.ed.ac.uk>
Date: Wed, 25 Jan 2006 16:47:00 +0000 (GMT)
To: www-xml-linking-comments@w3.org
Cc: Bjoern Hoehrmann <derhoermi@gmx.net>
Message-Id: <20060125164700.B61C8593044@macintosh.inf.ed.ac.uk>

I think that it is quite inappropriate to either perform or check
Charmod conformance at this layer.  The data has already been read in
(typically by an XML parser); if normalization is appropriate, it
should have been done before parsing and (as called for by a SHOULD in
XML 1.1) checked during parsing.

Therefore in the case of:

> 3) where applicable, requires implementations conforming to the
>    specification to conform to this document,
>
> 4) where applicable, requires content conforming to the
>    specification to conform to this document.

I think the requirements are not applicable.  Perhaps XML 1.0 should
be amended to have such requirements, but a spec that operates on
already-parsed documents should not have them.

Similarly, strings in the document that are interpreted as IRIs should
have been normalized and/or checked when the document was read in.

Or to put it another way, which variant of step 1 in section 3.1 of
RFC 3987 applies to the characters that are being interpreted as an
IRI?  It's certainly not (a), which covers the case where the
characters are not yet in a computer representation and you have to
choose a Unicode representation.  And it's not (b), which covers
characters in non-Unicode encodings.  The characters in question are
already Unicode code points, so the variant that most closely matches
is (c):

  If the IRI is in a Unicode-based character encoding (for example,
  UTF-8 or UTF-16), do not normalize (see section 5.3.2.2 for
  details).  Apply step 2 directly to the encoded Unicode character
  sequence.
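
To make variant (c) concrete, here is a rough Python sketch of going
straight to step 2 with no normalization.  The function name
iri_to_uri is made up for illustration, and treating every code point
above U+007F as needing escaping is a simplifying assumption rather
than the exact ucschar / iprivate productions; it also ignores the
optional ToASCII handling of ireg-name.

```python
def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: step 2 of RFC 3987 section 3.1, applied
    directly to a string of Unicode code points (variant (c))."""
    out = []
    for ch in iri:
        if ord(ch) > 0x7F:
            # Assumption: any non-ASCII code point is escaped.
            # Encode it as UTF-8 and percent-encode each octet.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        else:
            # ASCII characters pass through untouched.
            out.append(ch)
    return "".join(out)


# Precomposed U+00E9 versus "e" followed by combining U+0301.
precomposed = iri_to_uri("http://example.org/caf\u00e9")
decomposed = iri_to_uri("http://example.org/cafe\u0301")
```

Because nothing is normalized at this layer, the two spellings map to
different URIs (caf%C3%A9 versus cafe%CC%81); any normalization would
have had to happen when the document was read in.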

-- Richard

Received on Wednesday, 25 January 2006 16:47:10 UTC