- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 2 Oct 2012 01:04:25 -0400
- To: RDF-WG WG <public-rdf-wg@w3.org>, Internationalization Core Working Group <www-international@w3.org>
- Cc: Gavin Carothers <gavin@carothers.name>
* Martin J. Dürst <duerst@it.aoyama.ac.jp> [2012-09-08 13:30+0900]
> On 2012/09/08 0:49, Internationalization Core Working Group Issue
> Tracker wrote:
> > I18N-ISSUE-188: special handling of % in IRI [TURTLE]
> >
> > http://www.w3.org/International/track/issues/188
> >
> > Raised by: Addison Phillips
> > On product: TURTLE
> >
> > http://www.w3.org/2012/08/22-i18n-minutes.html#item05
> >
> > Section 6.4 contains this Note:
> >
> > --
> > %-encoded sequences are in the character range for IRIs and are explicitly allowed in local names. These appear as a '%' followed by two hex characters and represent that same sequence of three characters. These sequences are not decoded during processing. A term written as <http://a.example/%66oo-bar> in Turtle designates the IRI http://a.example/%66oo-bar and not IRI http://a.example/foo-bar. A term written as ex:%66oo-bar with a prefix @prefix ex:<http://a.example/> also designates the IRI http://a.example/%66oo-bar.
> >
> We don't understand why you do this. Can you clarify?
>
> I'm not speaking for the RDF/TURTLE WG, but RDF (and therefore TURTLE)
> are doing IRI comparisons strictly character-by-character (see e.g.
> http://tools.ietf.org/html/rfc3987#section-5.3.1), the same as it is
> done in XML Namespaces.

The RDF model uses IRIs as identifiers; Turtle merely provides a serialization of that model. IRIs include %dd sequences, e.g. <http://伝言.example/?user=أكرم&channel=R%26D>. (It is quite reasonable that such an IRI would include a '%' as otherwise the corresponding URL <http://xn--9oqp94l.example/?user=%D8%A3%D9%83%D8%B1%D9%85&channel=R%26D> would have an extra form-url-encoded parameter from "R&D".)

Turtle could de-escape one level of %s, but that would be pretty arbitrary behavior, having the unfortunate effect of requiring anyone composing Turtle to first %-escape RDF's IRIs, including % sequences for any characters in reserved | unreserved | escaped.
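
To make that concrete, here is a minimal Python sketch of the behavior described above. It is only an illustration under stated assumptions, not code from any Turtle parser: `expand_pname` is a hypothetical helper that models prefix expansion as plain concatenation (it ignores Turtle's other local-name escapes), and Python's standard `urllib.parse` stands in for whatever would do the %-decoding and form parsing.

```python
from urllib.parse import parse_qs, quote, unquote


def expand_pname(namespace_iri: str, local_name: str) -> str:
    """Hypothetical helper: expand a prefixed name by plain concatenation.

    The '%66' in the local name is copied through untouched, which is the
    "don't do anything with %dd" behavior discussed in this thread.
    """
    return namespace_iri + local_name


written = "http://a.example/%66oo-bar"   # the IRI as written in Turtle
decoded = "http://a.example/foo-bar"     # what one level of decoding would give

# Character-by-character comparison: these are two different identifiers.
same_iri = written == decoded

# ex:%66oo-bar with @prefix ex:<http://a.example/> designates the written IRI.
via_prefix = expand_pname("http://a.example/", "%66oo-bar")

# What a parser that de-escaped one level of %s would produce instead.
one_level_decoded = unquote(written)

# Why the IRI keeps '%26': decoded too early, "R&D" splits into an extra
# form-url-encoded parameter.
query_kept = parse_qs("channel=R%26D", keep_blank_values=True)
query_split = parse_qs("channel=R&D", keep_blank_values=True)

# Mapping the non-ASCII user name to URL octets (UTF-8, then %-encoding).
user_octets = quote("أكرم")

if __name__ == "__main__":
    print(same_iri)
    print(via_prefix)
    print(one_level_decoded)
    print(query_kept, query_split)
    print(user_octets)
```

The only decoding in the sketch happens where it is requested explicitly (`unquote`, `parse_qs`); the Turtle-side steps are string equality and concatenation.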
> It would probably help if this was pointed out more explicitly in the
> above text.

I think the explanation would be long and would teach people general rules about designing languages with appropriate escaping. I think we're best off saying "Turtle parsers don't do anything with '%dd'."

Please indicate whether this response addresses the issue.
-- 
-ericP
Received on Tuesday, 2 October 2012 05:04:57 UTC