- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Tue, 11 Jan 2005 16:20:10 +0000
- To: Martin Duerst <duerst@w3.org>
- CC: "Krall, Gary" <gkrall@verisign.com>, "'Chris Lilley'" <chris@w3.org>, Reto Bachmann-Gmuer <reto@gmuer.ch>, www-international@w3.org
I suspect that formally correct treatment and good practice diverge on this.

The text of RDF Concepts and Abstract Syntax which expresses RDF's idea of an IRI is this:

http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-URIref
[[
A URI reference within an RDF graph (an RDF URI reference) is a Unicode string [UNICODE] that:

* does not contain any control characters ( #x00 - #x1F, #x7F-#x9F)
* and would produce a valid URI character sequence (per RFC2396 [URI], sections 2.1) representing an absolute URI with optional fragment identifier when subjected to the encoding described below.

The encoding consists of:

1. encoding the Unicode string as UTF-8 [RFC-2279], giving a sequence of octet values.
2. %-escaping octets that do not correspond to permitted US-ASCII characters.

The disallowed octets that must be %-escaped include all those that do not correspond to US-ASCII characters, and the excluded characters listed in Section 2.4 of [URI], except for the number sign (#), percent sign (%), and the square bracket characters re-allowed in [RFC-2732].

Disallowed octets must be escaped with the URI escaping mechanism (that is, converted to %HH, where HH is the 2-digit hexadecimal numeral corresponding to the octet value).
]]

my understanding is that the IDNs are not covered as such by the conversion stated, and any additional conversion required from them would be an extra (non-standard) feature.

I guess from a W3C side we should be revving a lot of text in light of IRI and IDN as they come to fruition.

e.g. the conversion in
http://www.w3.org/International/iri-edit/draft-duerst-iri-10.txt
[[
http://résumé.example.org may be converted to
http://xn--rsum-bpad.example.org instead of
http://r%C3%A9sum%C3%A9.example.org.
]]

and in fact, the first string is not an 'RDF URI reference' or an XLink href attribute value, or in the lexical space of xsd:anyURI, because
http://r%C3%A9sum%C3%A9.example.org.
is not a legal URI, and the provisions of those specs only allow for %-encoding.

So, I think an RSS or RDF tool would need either to:
- not check that the URIs were legal
or
- to have an extended check that knew something about IDNs

and neither is particularly conformant. Personally I would prefer the latter.

Jeremy

Martin Duerst wrote:
>
> At 03:45 05/01/08, Krall, Gary wrote:
> >
> >Chris:
> >
> >Just for clarification does your answer imply that an RSS reader would need
> >to support IDNA to make this work?
>
> If it does resolve IRIs, which I guess most RSS readers do, then yes.
> Otherwise no. RDF as such does not require resolution, and therefore
> does not require IDNA support.
>
> IDNA support is available in libraries (e.g. libidn or idnkit)
> that can easily be integrated into other software (but are a bit
> bulky because of the tables needed).
>
> Regards, Martin.
>
> >Thanks,
> >
> >Gary.
> >
> >-----Original Message-----
> >From: www-international-request@w3.org
> >[mailto:www-international-request@w3.org]On Behalf Of Chris Lilley
> >Sent: Friday, January 07, 2005 10:36 AM
> >To: Reto Bachmann-Gmuer
> >Cc: www-international@w3.org
> >Subject: Re: IRI and IDN in RDF
> >
> >
> >
> >On Friday, January 7, 2005, 7:23:22 PM, Reto wrote:
> >
> >
> >RBG> Hello
> >
> >RBG> I'm wondering how URLs based on IDN should be represented in RDF/XML:
> >RBG> - no particular encoding (= default of xml document)
> >RBG> - %... encoding
> >RBG> - punycode
> >
> >Since it is XML, the IRI can be expressed in regular characters (the
> >encoding of the document) and conversion to punycode, hex escaping etc
> >can be left to the URI resolver.
> >
> >
> >--
> > Chris Lilley mailto:chris@w3.org
> > Chair, W3C SVG Working Group
> > Member, W3C Technical Architecture Group
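
To make the difference between the two conversions discussed above concrete, here is a rough Python sketch. It is not taken from any of the specs or libraries mentioned in the thread: the function names `rdf_percent_escape` and `idn_host_to_ascii` are made up for illustration, the first follows my reading of the two-step encoding quoted from RDF Concepts, and the second relies on Python's built-in `idna` codec (IDNA 2003) and assumes an authority with no userinfo part.

```python
from urllib.parse import urlsplit, urlunsplit

# Excluded US-ASCII characters (RFC 2396 section 2.4.3: space, delims, unwise),
# minus '#', '%', '[' and ']', which the quoted RDF text re-allows.
_EXCLUDED_ASCII = set(b' <>"{}|\\^`')


def rdf_percent_escape(uriref: str) -> str:
    """Hypothetical helper: UTF-8 encode, then %-escape disallowed octets."""
    out = []
    for octet in uriref.encode("utf-8"):
        disallowed = (
            octet >= 0x80          # not a US-ASCII character
            or octet <= 0x1F       # control characters
            or octet == 0x7F
            or octet in _EXCLUDED_ASCII
        )
        out.append("%{:02X}".format(octet) if disallowed else chr(octet))
    return "".join(out)


def idn_host_to_ascii(iri: str) -> str:
    """Hypothetical helper: convert only the host to its IDNA (punycode) form.

    Assumption: the authority has no userinfo; a port, if present, is kept.
    """
    parts = urlsplit(iri)
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += ":{}".format(parts.port)
    return urlunsplit(parts._replace(netloc=netloc))


iri = "http://r\u00e9sum\u00e9.example.org"
print(rdf_percent_escape(iri))  # the %-escaped form, which is not a legal URI
print(idn_host_to_ascii(iri))   # the IDN-aware form
```

A tool doing the "extended check" would presumably apply something like the second function to the host and the first to the remaining components, which goes beyond what the quoted RDF text provides for.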
Received on Tuesday, 11 January 2005 16:20:38 UTC