- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Thu, 11 Sep 2003 16:29:23 +0200
- To: <w3c-rdfcore-wg@w3.org>
- Cc: <w3c-i18n-ig@w3.org>
Jeremy:
> Personally my preference would be to follow Martin Durst's advice ... [here
> at least :) ].

Brian:
> Are you suggesting soliciting further advice?

Yes - Martin, any comments? Would it be better to go with our current text

[[
6.4 RDF URI References

A URI reference within an RDF graph (an RDF URI reference) is a Unicode string [UNICODE] that would produce a valid URI character sequence (per RFC2396 [URI], sections 2.1) representing an absolute URI with optional fragment identifier when subjected to the encoding described below.

The encoding consists of:

1. encoding the Unicode string as UTF-8 [RFC-2279], giving a sequence of octet values.
2. %-escaping octets that do not correspond to permitted US-ASCII characters.

The disallowed octets that must be %-escaped include all those that do not correspond to US-ASCII characters, and the excluded characters listed in Section 2.4 of [URI], except for the number sign (#), percent sign (%), and the square bracket characters re-allowed in [RFC-2732].

Disallowed octets must be escaped with the URI escaping mechanism (that is, converted to %HH, where HH is the 2-digit hexadecimal numeral corresponding to the octet value).

Two RDF URI references are equal if and only if they compare as equal, character by character, as Unicode strings.

Note: RDF URI references are compatible with the anyURI datatype as defined by XML schema datatypes [XML-SCHEMA2], constrained to be an absolute rather than a relative URI reference.

Note: RDF URI references are compatible with International Resource Identifiers as defined by [XML Namespaces 1.1].

Note: The restriction to absolute URI references is found in this abstract syntax. When there is a well-defined base URI, concrete syntaxes, such as RDF/XML, may permit relative URIs as a shorthand for such absolute URI references.
]]

or text based on
http://www.w3.org/TR/xml-names11/#IRIs

[[
Work is currently in progress to produce an RFC defining Internationalized Resource Identifiers (IRIs). Since this work is not yet complete, in this section we give a syntactic definition of IRIs for the purposes of this specification. We expect to issue an erratum replacing this section with a reference to the RFC when it is published. Users defining namespaces are advised to restrict namespace names to URIs until software supporting IRIs is in common use.

For a more general definition and discussion of IRIs see [IRI draft] (work in progress).

URI references are restricted to a subset of the ASCII characters; IRI references allow some of the disallowed ASCII characters as well as most Unicode characters from #xA0 onwards.

[Definition: The additional characters allowed in IRIs are: ]

+ space #x20
+ the delimiters < #x3C, > #x3E and " #x22
+ the unwise characters { #x7B, } #x7D, | #x7C, \ #x5C, ^ #x5E and ` #x60
+ the Unicode plane 0 characters #xA0 - #xD7FF, #xF900-#xFDCF, #xFDF0-#xFFEF
+ the Unicode plane 1-14 characters #x10000-#x1FFFD ... #xE0000-#xEFFFD

[Definition: An IRI reference is a string that can be converted to a URI reference by escaping all additional characters as follows: ]

1. Each additional character is converted to UTF-8 [Unicode 3.2] as one or more bytes.
2. The resulting bytes are escaped with the URI escaping mechanism (that is, converted to %HH, where HH is the hexadecimal notation of the byte value). The original character is replaced by the resulting character sequence.
]]

Noting that RDF Core WG has declined a comment suggesting using the term IRI throughout, so that the definition would remain a definition of "RDF URI references".

A specific question is ctrl characters - should they be allowed or not?

Jeremy
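
For illustration only, the two-step encoding in the quoted section 6.4 text (UTF-8 encode, then %-escape disallowed octets) might be sketched in Python roughly as below. The names `is_disallowed` and `rdf_uriref_to_uri` are hypothetical and come from neither draft. The sketch assumes that control characters fall under the "excluded characters" of RFC 2396 section 2.4 and are therefore escaped; that reading is exactly the open question raised above, so treat it as one possible interpretation rather than the agreed behaviour.

```python
# Sketch of the quoted encoding: UTF-8 encode, then %-escape disallowed octets.
# Assumption: the excluded ASCII set is read from RFC 2396 section 2.4.3,
# minus '#', '%', '[' and ']' as the quoted text re-allows them.
EXCLUDED_ASCII = set(b' <>"{}|\\^`')


def is_disallowed(octet: int) -> bool:
    """Return True if the octet would be %-escaped under this reading."""
    if octet >= 0x80:
        # Octet does not correspond to a US-ASCII character.
        return True
    if octet <= 0x1F or octet == 0x7F:
        # Control characters: assumed to be in scope (the open question).
        return True
    return octet in EXCLUDED_ASCII


def rdf_uriref_to_uri(ref: str) -> str:
    """Apply the two encoding steps to a Unicode string."""
    octets = ref.encode("utf-8")  # step 1: UTF-8 octet sequence
    # Step 2: replace each disallowed octet with %HH (2-digit hex).
    return "".join(f"%{o:02X}" if is_disallowed(o) else chr(o) for o in octets)


if __name__ == "__main__":
    print(rdf_uriref_to_uri("http://example.org/a b#\u00e9"))
```

Under this reading a string containing a control character is still accepted and simply escaped, whereas the XML Namespaces 1.1 list of additional characters does not mention control characters at all, which is where the two candidate texts appear to differ.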
Received on Thursday, 11 September 2003 10:38:10 UTC