
Re: CURIEorURI Value Space Collisions

From: Ivan Mikhailov <imikhailov@openlinksw.com>
Date: Wed, 13 Apr 2011 16:16:16 +0700
To: Niklas Lindström <lindstream@gmail.com>
Cc: Ivan Herman <ivan@w3.org>, public-rdfa-wg <public-rdfa-wg@w3.org>
Message-ID: <1302686176.7024.957.camel@octo.iv.dev.null>

I implemented RDFa 1.1 a few days ago in OpenLink Virtuoso, and I feel
uneasy for the same reason. I'm a bit pessimistic about the
convenience and safety of the syntax: it is really hard to automatically
distinguish between a typo and a weird-looking URI. SafeCURIEs really
were safer.
If square brackets are annoying, then one possible way to tighten things
up is to declare the URI schemes in use explicitly and signal an error if
the part of a URI before the colon is neither a declared namespace prefix
nor one of the declared schemes. This is technically possible, but it
means additional paragraphs of spec to read and clutter in the header.
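
To make the idea concrete, here is a minimal sketch of such a check in
Python. Everything named in it is an assumption for illustration only:
the function resolve_curie_or_uri, the two tables, and the sample
prefixes and schemes are hypothetical and come from neither the RDFa 1.1
spec nor Virtuoso. The sketch also ignores blank nodes, the empty prefix
and relative references.

```python
# Hypothetical declarations: prefixes declared in the document and the URI
# schemes the author has explicitly listed. Neither table is defined by RDFa.
DECLARED_PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
}
DECLARED_SCHEMES = {"http", "https", "mailto"}


def resolve_curie_or_uri(value, prefixes=DECLARED_PREFIXES, schemes=DECLARED_SCHEMES):
    """Expand a CURIE, pass a URI through, or fail on an unknown head.

    The "head" is whatever precedes the first colon. A declared prefix
    is checked first, so a prefix that shadows a scheme is read as a CURIE.
    """
    head, sep, tail = value.partition(":")
    if not sep:
        # No colon at all: out of scope for this sketch.
        raise ValueError(f"{value!r} has no prefix or scheme")
    if head in prefixes:
        # Declared namespace prefix: expand the CURIE.
        return prefixes[head] + tail
    if head.lower() in schemes:
        # Declared URI scheme: keep the value as a full URI.
        return value
    # Neither a declared prefix nor a declared scheme: probably a typo.
    raise ValueError(f"{head!r} is neither a declared prefix nor a declared scheme")
```

With these declarations, foaf:name expands to a full URI,
http://example.org/a passes through unchanged, and the typo faof:name
raises an error instead of silently becoming a URI with scheme "faof".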

I'd propose the following:

1. The spec could mention SafeCURIEs and UnsafeCURIEs
(and maybe AnyCURIE for the sum of them).

2. The phrase "The concept of a safe_curie is retained for backward
compatibility" might be understood as "safe_curie is a legacy scrap
retained for backward compatibility for some period of time". Maybe it's
better to warn that "Only safe_curie provides backward compatibility
with RDFa 1.0 parsers, unsafe_curie does not".

3. We may encourage developers to add a "strict validation" mode to their
RDFa 1.1 parsers by providing extra properties in the manifest of the
RDFa 1.1 test suite: "the resource is passed solely because the spec says
that the attribute MUST be ignored". An external profile resource that
does not exist is the most obvious example; weird CURIEs are candidates too.

This is my private opinion, not an official answer from any WG.

Best Regards,

Ivan Mikhailov
OpenLink Software

On Apr 12, 2011, at 24:02 , Niklas Lindström wrote:

> While I understand that it is confusing to use it as a prefix, I am
> not convinced that it is safe to combine the CURIE and URI value spaces
> like this. At least not without a limit on the CURIEs allowed in the
> joint CURIEorURI space. For instance, not allowing CURIEs in that
> space to use anything after the prefix+':' other than, say, an
> isegment-nz-nc from RFC 3987, or something to that effect (like a
> "[A-Za-z0-9_.-]+" regexp).

> If there was such a restriction on the format of CURIEs that are allowed
> in the CURIEorURI mix (and that anything not matching it would be
> considered a full URI), I would definitely sleep better. :)

> Am I missing something crucial, or overly worried about the risk of collisions?

> Best regards,
> Niklas

> [1]: http://www.w3.org/TR/HTTP-in-RDF10/
> [2]: http://dig.csail.mit.edu/hg/tabulator/file/9a135feff10f/chrome/content/js/rdf/rdflib.js#l5644
> [3]: http://en.wikipedia.org/wiki/URI_scheme
Received on Wednesday, 13 April 2011 09:16:46 UTC
