Re: Colon symbol in URI?

Hi,

A name of the form prefix:local is a QName, and the relevant spec is at [1]. It forbids a digit as the first character of the local part, i.e. right after the colon.
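
To make the constraint concrete, here is a minimal sketch of such a check in Python. It is only an ASCII approximation of the NCName production in [1] (the real production also admits many non-ASCII characters), and the function names are hypothetical, not taken from any library.

```python
import re

# ASCII-only approximation of the XML NCName production (assumption:
# the real production in the Namespaces spec allows many more characters).
# An NCName may not start with a digit, "." or "-", and may not contain ":".
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_ascii_ncname(part: str) -> bool:
    """Return True if `part` is an NCName under the ASCII approximation."""
    return _ASCII_NCNAME.fullmatch(part) is not None


def is_ascii_qname(name: str) -> bool:
    """Return True if `name` is `local` or `prefix:local` with NCName parts."""
    prefix, sep, local = name.partition(":")
    if not sep:
        # Unprefixed name: the whole string must be an NCName.
        return is_ascii_ncname(name)
    # Prefixed name: both sides of the first colon must be NCNames,
    # so a second colon or a leading digit in the local part is rejected.
    return is_ascii_ncname(prefix) and is_ascii_ncname(local)
```

Under this approximation, "obo:SMTH_0000353" is accepted, while "obo:0000353" (local part starts with a digit) and "obo:SMTH:0000353" (colon inside the local part) are rejected.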

While we were working on the OBO ID policy [2], Jonathan Rees (cc) mentioned there was a proposal to relax those constraints by using CURIEs [3] instead of QNames, but I didn't check it and don't know its status; he may be able to add more information.

Cheers,
Melanie

[1] http://www.w3.org/TR/REC-xml-names/#NT-QName
[2] http://www.obofoundry.org/id-policy.shtml
[3] http://www.w3.org/2001/sw/BestPractices/HTML/2005-10-27-CURIE


On 2011-08-12, at 1:53 AM, Markus Krötzsch wrote:

> On 10/08/11 23:04, zhutchok@ebi.ac.uk wrote:
>> Hello all,
>> 
>> I am trying to find out whether it is possible to use the colon symbol in the URI
>> of RDF/OWL classes, for example:
>> http://purl.obofoundry.org/obo/SMTH:0000353. On the one hand, I could not find
>> any mention of it being restricted on the W3C web pages, but on the other hand,
>> I could not find any example ontology with a colon in a class URI
>> (except the colon after http) either. Moreover, OBO to OWL converters
>> usually replace colon symbols used in OBO identifiers with an underscore,
>> e.g. Protege will convert smth:0000353 to
>> http://purl.obofoundry.org/obo/SMTH_0000353 (not to
>> http://purl.obofoundry.org/obo/SMTH:0000353).
>> 
>> Does this mean the colon is prohibited?
> 
> I think it is, but I have not checked the specs. The problem might lie in the widely used RDF/XML serialization, which imposes its own restrictions on the syntax. In particular, RDF/XML is not able to express all valid RDF graphs. One reason for this is that property names must be XML element names in this syntax, and many symbols are not allowed in local names in this context (including the colon, which is interpreted as a separator between namespace prefix and local name). This might be one reason why it is advisable to escape the colon.
> 
> Since XML does not provide a suitable escaping mechanism, one is left with ad hoc escaping strategies. For example, Semantic MediaWiki uses the URL encoding of ":" as "%3A" and replaces the "%" with the less problematic character "-". The character "-" is escaped similarly, so the escaping can be inverted. What you describe for Protege seems to be a rather unfortunate approach that leads to an escaping that is not invertible.
> 
> Cheers,
> 
> Markus
> 
> 
> -- 
> Dr. Markus Krötzsch
> Department of Computer Science, University of Oxford
> Room 306, Parks Road, OX1 3QD Oxford, United Kingdom
> +44 (0)1865 283529               http://korrekt.org/
> 
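
The escaping strategy Markus describes above (percent-encode, then swap "%" for "-", with "-" itself escaped) can be sketched in Python as follows. This is an illustration of the idea only, not Semantic MediaWiki's actual code; the function names are hypothetical, and the assumption that "-" becomes "-2D" simply follows from treating it like any other percent-encoded character.

```python
from urllib.parse import quote, unquote


def escape_local_name(name: str) -> str:
    """Invertible escaping: percent-encode, then use "-" instead of "%"."""
    # quote() encodes ":" as "%3A" and "%" as "%25", but leaves "-" alone.
    encoded = quote(name, safe="")
    # Escape literal "-" as well, so every "-" in the output is an escape marker.
    encoded = encoded.replace("-", "%2D")
    return encoded.replace("%", "-")


def unescape_local_name(escaped: str) -> str:
    """Inverse of escape_local_name."""
    return unquote(escaped.replace("-", "%"))


def underscore_escape(name: str) -> str:
    """The colon-to-underscore replacement described for the OBO converters."""
    return name.replace(":", "_")
```

With these definitions, "SMTH:0000353" escapes to "SMTH-3A0000353" and unescapes back to the original, whereas underscore_escape maps both "SMTH:0000353" and "SMTH_0000353" to the same string, which is why that replacement cannot be inverted.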

---
Mélanie Courtot
MSFHR/PCIRN trainee, TFL- BCCRC
675 West 10th Avenue
Vancouver, BC
V5Z 1L3, Canada

Received on Friday, 12 August 2011 16:07:51 UTC