Re: unicode escapes in prefix names from Richard Cyganiak on 2011-11-23 (public-rdf-wg@w3.org from November 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 23 Nov 2011 15:15:13 +0000
To: Eric Prud'hommeaux <eric@w3.org>
Cc: Gavin Carothers <gavin@carothers.name>, Andy Seaborne <andy.seaborne@epimorphics.com>, RDF-WG <public-rdf-wg@w3.org>
Message-Id: <EFA4C836-7D3C-4D52-88F9-61E87F7D556D@cyganiak.de>

On 23 Nov 2011, at 14:50, Eric Prud'hommeaux wrote:
> Of course, URL minters can mint whatever they want, but the mapping to URI (for that popular GET protocol) *loses* '%'s. So a reason to avoid excessive %-ification is that, when you push it through the standard processing at the far end, say, Apache's mapping to a filename, those lost '%'s don't come back. As an example, <http://example.com/R&D> and <http://example.com/R%26D> map to the same URL (Apache will look for <server root>/R&D).

+1
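
A minimal sketch of the collapse Eric describes, assuming the server simply percent-decodes the request path before looking for a file (real Apache path handling has more steps; `decoded_path` is a hypothetical helper name):

```python
from urllib.parse import unquote, urlsplit


def decoded_path(url: str) -> str:
    """Return the percent-decoded path of `url`.

    This only approximates what a file-serving web server might do
    before mapping the request to a filename.
    """
    return unquote(urlsplit(url).path)


a = decoded_path("http://example.com/R&D")
b = decoded_path("http://example.com/R%26D")

# Both spellings end up as the same path, so the '%' is lost for good.
print(a)
print(b)
print(a == b)
```

Once both forms have been decoded, nothing at the far end can tell which one the client originally sent.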

> I've seen short exemplars bandied about, but the ones I deal with realistically are IRIs mapped from protein identifiers which have ':'s in them. I have a nice syntax for writing most of my queries and most of my data, nicely categorized by namespace prefixes which helps me visually distinguish proteins from mechanisms from drugs. But if I'm unlucky enough to need to reference one with a ':' in it, I'm not allowed to use the obvious escaping syntax? Instead I have to throw all that away and have a big opaque IRI in the middle of some otherwise organized data or query?

Yup, you need to use a full IRI. On the plus side, you don't need to look up Unicode code points or do hexadecimal arithmetic.

I think the average user would rather use a full IRI than figure out how to turn the dot at the end of the IRI into a Unicode escape sequence.

In many ways, expanding the prefix and wrapping everything into <…> is a friendlier escaping mechanism than looking up Unicode code points.
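
To make the comparison concrete, here is a rough sketch of the two kinds of work involved. The prefix `prot`, its namespace and the identifier `P12345:2` are made up for illustration, and the `\uXXXX` form is just the usual Unicode escape notation, not a claim about what any grammar accepts in a prefixed name:

```python
# Hypothetical prefix table; the namespace IRI is invented for this example.
PREFIXES = {"prot": "http://example.org/protein/"}


def unicode_escape(ch: str) -> str:
    """Spell a single character as a \\uXXXX (or \\UXXXXXXXX) escape."""
    cp = ord(ch)
    return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"


def expand(prefix: str, local: str) -> str:
    """Expand a prefixed name into a full IRI wrapped in angle brackets."""
    return f"<{PREFIXES[prefix]}{local}>"


# Escaping route: the user has to know (or look up) the code points.
print(unicode_escape(":"))
print(unicode_escape("."))

# Full-IRI route: plain string concatenation, no lookup needed.
print(expand("prot", "P12345:2"))
```

The second route needs nothing beyond the prefix declaration that is already sitting at the top of the file.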

Not everyone is a Unicode geek with an obsession for orderly query layout ;-)

Richard

Received on Wednesday, 23 November 2011 15:15:48 UTC