Re: escaping % in RDF URI references from Peter F. Patel-Schneider on 2003-09-22 (www-rdf-comments@w3.org from July to September 2003)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Mon, 22 Sep 2003 11:50:36 -0400 (EDT)
To: jjc@hpl.hp.com
Cc: www-rdf-comments@w3.org
Message-Id: <20030922.115036.103013799.pfps@research.bell-labs.com>
From: Jeremy Carroll <jjc@hpl.hp.com>
Subject: Re: escaping % in RDF URI references
Date: Mon, 22 Sep 2003 16:12:04 +0300

> 
> Hi Peter,
> 
> the changes referred to in this message are visible in the latest RDF Concepts 
> editors draft:
> 
> http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/
> 
> (Which also includes the changes concerning NFC).
> 
> Thanks for your comments concerning RDF URI References.
> http://lists.w3.org/Archives/Public/www-rdf-comments/2003JulSep/0316
> 
> As always your detailed criticism is valuable and has led to changes which the 
> WG hope are improvements.
> 
> Dealing with the PS first:
> 
> [[
> PS:  It appears to me that the translation in RDF Concepts is different
> from the translation in Namespaces in XML 1.1.  In particular, RDF concepts
> allows control characters whereas Namespaces in XML 1.1 does not.
> ]]
> 
> This was a mistake and has been rectified (see below for detail).
> 
> Concerning the main thrust,
> [[
> The wording in the ``Namespaces in XML 1.1'' document is *much*
> preferable.  It lays out the intent, gives reasons why the intent cannot
> be specified with just a pointer, provides a temporary solution, and
> finally gives a way towards a permanent solution. 
> ]]
> 
> We asked the IRI editor for his opinion; he said:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Sep/0162
> [[
> As for the texts, I think both of them have advantages and
> disadvantages. The best thing is to hurry up with the IRI
> spec and remove these problems.
> ]]
> after which the WG was not inclined to make a large editorial change. The 
> current text has received a number of reviews already.
> 
> We however more positively considered your statement:
> [[
> It lays out the intent, gives reasons why the intent cannot
> be specified with just a pointer, provides a temporary solution, and
> finally gives a way towards a permanent solution.
> ]]
> and felt that we should at least refer to the IRI draft. The RDF Core WG is 
> however reluctant to try and predict the future, so our wording was less 
> assertive than that in XML Namespaces 1.1
> 
> So, we added:
> - an informative reference to IRI draft
> - the following note concerning the IRI draft
> [[
> Note: this section anticipates an RFC on Internationalized Resource 
> Identifiers. Implementations may issue warnings concerning the use
> of RDF URI References that do not conform with [IRI draft] or its 
> successors.
> ]]
> and made the following change concerning control characters:
> 
> ***
> Replace:
> [[
> A URI reference within an RDF graph (an RDF URI reference) is a Unicode string 
> [UNICODE] that would produce a valid URI ...
> ]]
> 
> with
> [[
> A URI reference within an RDF graph (an RDF URI reference) is a Unicode string 
> [UNICODE] that 
> + does not contain any control characters ( #x00 - #x1F, #x7F-#x9F)
> + and would produce a valid URI ...
> ]]
> ***
> 
> Please reply indicating whether these changes acceptably address your comment, 
> with a copy to www-rdf-comments@w3.org.

I find the relevant section of RDF Concepts continues to be almost
impossible to understand.  I just spent yet another 15 minutes trying to
understand it, and have come to the tentative conclusion that the section
is now consistent with XML Namespaces, but inconsistent with the IRI
draft.   The divergence has to do with the treatment of the space
character, which it appears to me is allowed in XML Namespaces but not in
the IRI draft.  This means that the third note in the section is not
correct.  
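
To make the point concrete, here is a rough Python sketch of the check
added by the replacement text quoted above (no characters in #x00 - #x1F
or #x7F - #x9F), alongside a separate test for the space character.  The
function names are hypothetical, and the "would produce a valid URI" half
of the definition is deliberately not modeled.

```python
def has_control_characters(s: str) -> bool:
    """True if s contains a code point in #x00-#x1F or #x7F-#x9F."""
    return any(ord(c) <= 0x1F or 0x7F <= ord(c) <= 0x9F for c in s)


def contains_space(s: str) -> bool:
    """True if s contains the space character (#x20)."""
    return " " in s


# Illustrative references only; example.org is a placeholder host.
examples = [
    "http://example.org/a b",       # space, no control character
    "http://example.org/a\u0007b",  # contains the control character #x07
    "http://example.org/ab",        # neither
]

for ref in examples:
    print(repr(ref), has_control_characters(ref), contains_space(ref))
```

Since #x20 lies outside both control ranges, a reference containing a
space passes this check.  Whether it is admitted therefore depends entirely
on the remaining condition, which is where the texts appear to differ.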

> Jeremy
> 
> PS The IRI draft 04 diverges from the XML Namespaces text, it contains the 
> following note:
> 
> [[
>    Note: Earlier drafts of this specification allowed the space
>    character and various delimiters in IRIs and IRI references.  The
>    full list of these characters was: "<", ">", '"', Space, "{", "}",
>    "|", "\", "^", and "`", i.e.  all printable characters in US-ASCII
>    that are not allowed in URIs.  For backwards compatibility,
>    implementations MAY also include these characters in step 3) above.
>    If such characters are found but are not converted, then the
>    conversion SHOULD fail.  Please note that the number sign ("#"), the
>    percent sign ("%"), and the square bracket characters ("[", "]") are
>    not part of the above list, and MUST not be converted.  Protocols and
>    formats that have used earlier definitions of IRIs including these
>    characters MAY require unescaping of these characters as a
>    preprocessing step to extract the actual IRI from a given field.
>    Such preprocessing MAY also be used by applications allowing the user
>    to enter an IRI.
> ]]
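
A minimal Python sketch of the backwards-compatibility option in the note
just quoted may help: it escapes only the ten listed characters and leaves
"#", "%", "[" and "]" untouched.  The names below are hypothetical, and
writing each escape as %HH with upper-case hex digits is an assumption of
this sketch, not something taken from the draft.

```python
# The ten printable US-ASCII characters listed in the quoted note:
# < > " Space { } | \ ^ `
LEGACY_CHARACTERS = set('<>" {}|\\^`')


def escape_legacy_characters(s: str) -> str:
    """Percent-escape only the legacy characters; leave all others as-is."""
    return "".join(
        f"%{ord(c):02X}" if c in LEGACY_CHARACTERS else c
        for c in s
    )


# "#", "%", "[" and "]" are not in the list, so they pass through unchanged.
print(escape_legacy_characters("http://example.org/a b#frag"))
print(escape_legacy_characters("#%[]"))
```

An implementation that does not take this option would, per the note,
treat such characters as a conversion failure rather than escape them.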

I continue to find the wording in XML Namespaces to be much better than the
wording in RDF Concepts.

peter
Received on Monday, 22 September 2003 11:52:25 UTC