Re: Some further thoughts on the default profile issue ( ISSUE-78 and ISSUE-73 )

On Mon, 2011-01-31 at 10:44 +0100, Ivan Herman wrote:
> However, from a spec writing and definition point of view, it may also
> simplify one thing. The current rule is that if we have a CURIE, but
> the prefix does not resolve to a defined URI, then the whole CURIE is
> considered to be a full URI. This is how we could accommodate having
> @rel="http://www.example.org/bla". But... we could also remove this
> thing altogether (thereby making things simpler and closer to RDFa
> 1.0). Instead, we could agree that the default profile would include
> the prefix mapping 'http' -> 'http:' (and the same for a bunch of
> other http schemes).

That would work for "http:", but not for, say, "urn:". Why? 

Take for example the mapping "urn"=>"urn:". We want to then represent
the IRI <urn:isbn:0123456789> as a CURIE.

The suffix part of a CURIE is defined as an irelative-ref (from the IRI
spec). This in turn is defined as an irelative-part optionally followed
by a query string and fragment. Focusing on the irelative-part, it can
be one of:

 1. something that has an authority
    (which will always start "//", so not relevant here)
 2. an absolute path
    (which will always start "/", so ditto)
 3. an ipath-noscheme
 4. empty

OK, so clearly if we're representing that ISBN IRI as a CURIE, we need
to use ipath-noscheme. But drilling down into its definition, an
ipath-noscheme cannot contain any colons before its first slash (though
it's not required to contain a slash at all), and the suffix we'd need
here, "isbn:0123456789", has a colon and no slash.

So the mapping "urn"=>"urn:" cannot be used to create a CURIE for
<urn:isbn:0123456789>. (Which is not to say that this IRI can't be
represented as a CURIE - it can - you just need to create a different
mapping, such as "urn-isbn"=>"urn:isbn:".)
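
As a rough sketch of the argument above: the check below is a simplified
stand-in for the irelative-ref rule, not the full RFC 3987 grammar (it
ignores percent-encoding and allowed character ranges). The function
names are made up for illustration.

```python
import re


def suffix_is_irelative_ref(suffix: str) -> bool:
    """Simplified test of whether a CURIE suffix fits irelative-ref."""
    # Drop the optional query string and fragment; only the
    # irelative-part is of interest here.
    path = re.split(r"[?#]", suffix, maxsplit=1)[0]
    if path == "" or path.startswith("/"):
        # Cases 1, 2 and 4: authority ("//..."), absolute path ("/...")
        # or empty. This sketch applies no colon restriction to these.
        return True
    # Case 3, ipath-noscheme: no colon before the first slash.
    first_segment = path.split("/", 1)[0]
    return ":" not in first_segment


def curie_suffix(iri: str, prefix_iri: str):
    """Return the suffix that would express iri under prefix_iri, or None."""
    if not iri.startswith(prefix_iri):
        return None
    suffix = iri[len(prefix_iri):]
    return suffix if suffix_is_irelative_ref(suffix) else None


# "http" => "http:" works, because the suffix starts with "//".
print(curie_suffix("http://www.example.org/bla", "http:"))
# "urn" => "urn:" fails, because "isbn:0123456789" has a colon up front.
print(curie_suffix("urn:isbn:0123456789", "urn:"))
# "urn-isbn" => "urn:isbn:" works, because the suffix is just "0123456789".
print(curie_suffix("urn:isbn:0123456789", "urn:isbn:"))
```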

A possible fix would be to broaden the allowed syntax for CURIEs. The
suffix part of a CURIE (a.k.a. reference) would be defined as any string
containing no whitespace characters. (And we do already define
whitespace.) It would not surprise me if we discovered that many RDFa
implementations already use that definition.
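
To illustrate, a minimal sketch of CURIE expansion under that broader
definition might look like the following. The whitespace set (space,
tab, CR, LF) is my assumption about what our whitespace definition
covers, and the function name is hypothetical.

```python
# Assumed whitespace characters: space, tab, carriage return, line feed.
WHITESPACE = " \t\r\n"


def expand_curie_broad(curie: str, mappings: dict):
    """Expand a CURIE whose reference may be any whitespace-free string."""
    prefix, colon, reference = curie.partition(":")
    if not colon or prefix not in mappings:
        # No colon, or a prefix with no mapping: not handled in this sketch.
        return None
    if any(ch in WHITESPACE for ch in reference):
        return None
    return mappings[prefix] + reference


# With the broader syntax, "urn" => "urn:" is enough for the ISBN IRI.
print(expand_curie_broad("urn:isbn:0123456789", {"urn": "urn:"}))
```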

[ Aside: the current definition of CURIE needs fixing anyway. The empty
string (i.e. no prefix, no colon and no suffix) is allowed as a CURIE,
which will map to the "no prefix" mapping with nothing appended.
However, it's impossible to detect an empty string CURIE in a
whitespace-delimited list; and we do not interpret datatype="" as being
the empty string CURIE. We should almost certainly explicitly forbid the
use of the empty string as a CURIE. (Though we should still allow it as
a safe CURIE.) ]
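
The detection problem in that aside can be seen with a plain whitespace
split, which is how I'd assume most processors tokenise attributes like
@rel and @property. The prefixes in the sample values are only
illustrative.

```python
def tokens(attribute_value: str) -> list:
    """Split a whitespace-delimited attribute value into its items."""
    # str.split() with no arguments splits on runs of whitespace and
    # never yields an empty string.
    return attribute_value.split()


# Two spaces in a row do not produce an empty-string CURIE in between.
print(tokens("foaf:name  dc:title"))
# An empty attribute value yields no items at all.
print(tokens(""))
```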

In general I'm in favour of this suggestion, but I do think it needs to
be combined with this broader syntax definition for CURIEs.

We'd need to discuss which URI schemes get included in the default
profile.

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

Received on Monday, 31 January 2011 15:05:04 UTC