Re: CURIEorURI Value Space Collisions from Shane McCarron on 2011-05-01 (public-rdfa-wg@w3.org from May 2011)

From: Shane McCarron <shane@aptest.com>
Date: Sun, 01 May 2011 13:20:52 -0500
To: public-rdfa-wg@w3.org
Message-ID: <4DBDA484.4010301@aptest.com>

Ivan,

Part of your reply confused the heck out of me:

On 5/1/2011 11:46 AM, Ivan Herman wrote:
> ....
>
> Having an implementation store the list of registered prefixes (and _not_ shrink, because no scheme goes out of definition), issue warning when a prefix collides with a prefix works for me. Because a prefix definition takes precedence over the uri scheme (say, your wxg prefix), an RDFa content produced today using that prefix remains valid and produces the same RDF graph even if at some point wxg becomes a URI scheme. If at that point somebody wants to use the wxg scheme, then in newly produced RDFa content another prefix will have to be chosen, but that does not seem to create huge problems...
>

I *think* what you mean is that it needs to store a list of registered
SCHEMES, and issue a warning when a declared CURIE PREFIX collides with
one of those known, registered SCHEMES.

Assuming that is what you mean.... I think it is a waste of processing
and a bad idea. But, if the working group decides to require it, I
won't object.

Why do I think it is a waste? Because a collision does no harm. CURIEs
are only interpreted in RDFa attributes (and @rel and @rev, but those
are linked data attributes anyway). If I define a prefix that collides
with a scheme.... let's say 'sip' for argument, then I might have
@resource='sip:lala'. But @resource is ONLY interpreted by an RDFa
processor. If there is also an @href='sip:shane@aptest.com' somewhere
in the document, that won't be interpreted as a CURIE at all. It is a
URI. I admit that there is the potential for CONFUSION here - in that
an author might THINK that this is supposed to work, and that
@href='sip:lala' is supposed to be translated into
http://some.example.com/resources/lala - but it's not, it never has
been, and the changes we made to RDFa Core 1.1 don't make this any worse
(IMHO). RDFa 1.0 worked this way too.
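
To make the 'sip' case concrete, here is a minimal sketch of the distinction. It is not the RDFa Core processing algorithm: the attribute list is a simplified assumption, and resolve() and the prefix mapping are hypothetical names used only for illustration.

```python
# Hypothetical sketch, not the RDFa Core processing rules.
# Assumed prefix mapping declared by the author of the document.
PREFIXES = {"sip": "http://some.example.com/resources/"}

# Simplified assumption: attributes in which a CURIE may appear.
CURIE_ATTRS = {"about", "resource", "property", "typeof", "datatype", "rel", "rev"}


def resolve(attr, value, prefixes=PREFIXES):
    """Expand value as a CURIE only when attr is CURIE-aware."""
    if attr in CURIE_ATTRS and ":" in value:
        prefix, reference = value.split(":", 1)
        if prefix in prefixes:
            # A declared prefix wins over a URI scheme of the same name.
            return prefixes[prefix] + reference
    # Anything else, including every @href value, stays a plain URI.
    return value


print(resolve("resource", "sip:lala"))
print(resolve("href", "sip:shane@aptest.com"))
print(resolve("href", "sip:lala"))
```

Under these assumptions only the @resource value is expanded; both @href values come back untouched, which is the behavior described above.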

Why do I think it is a bad idea? Because I think it has the potential
for future needless confusion. Let's say I author RDFa content today,
and in that content I use the prefix 'v' (because Google tells me I have
to). That's great - it collides with nothing, everything works. Yay!
Fast forward 10 years. Some bright spark has decided to save people
keystrokes in authoring by minting a new scheme 'v' that means 'Verified
Web Site' - so you can be confident that URIs that start with v: only
connect to sites that are repeatedly verified by some collection of
authorities as being virus free.  (Great idea - I think I will start
writing that RFC now!) Anyway - suddenly my document that generated no
warnings starts generating warnings as RDFa implementations are
updated. And why? Not because there is any actual risk that a URI will
be misinterpreted. But because an embedded linked data pointer could
now be interpreted *BY A HUMAN* as being a legitimate URI. It would
never be misinterpreted by an RDFa Processor - *only by a human*. If
that's the problem we are trying to solve, then I am sorry, but I think
it's just silly.  Humans misinterpret things all the time - we can't
solve that problem.
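
The 'v' scenario can be sketched roughly as follows. Everything here is an assumption made for illustration: the namespace URI is a placeholder, the scheme sets are made-up snapshots rather than the real registry, and collision_warnings() and expand() are hypothetical helpers, not part of any RDFa implementation.

```python
# Hypothetical sketch of the proposed warning; all names are illustrative.

def collision_warnings(prefixes, known_schemes):
    """Return the declared prefixes that match a known URI scheme."""
    return sorted(p for p in prefixes if p.lower() in known_schemes)


def expand(curie, prefixes):
    """Expand a CURIE using the declared prefixes; the prefix always wins."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference if prefix in prefixes else curie


# Placeholder namespace for the 'v' prefix (assumed, not from the source).
doc_prefixes = {"v": "http://vocab.example.org/#"}

# Made-up snapshots of the scheme list at two points in time.
schemes_today = {"http", "https", "mailto", "sip"}
schemes_later = schemes_today | {"v"}

print(collision_warnings(doc_prefixes, schemes_today))  # no warnings
print(collision_warnings(doc_prefixes, schemes_later))  # now warns about 'v'
print(expand("v:Person", doc_prefixes))                 # same URI either way
```

The document and its expansion are identical in both cases; only the warning output changes once the scheme list grows.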

Moreover, the CURIE prefix definitions and mappings don't survive
interpretation. They are ONLY present in the source document. Once it
is interpreted, they are expanded to full URIs. So they are ephemeral
AND cannot be misinterpreted by an RDFa processor. The semantic web is
safe. User agents are safe in that they continue to work as they always
have. Authors are safe in that if they use schemes in URIs there is no
way they can be misinterpreted. Again, what am I missing here?

--
Shane P. McCarron Phone: +1 763 786-8160 x120
Managing Director Fax: +1 763 786-8180
ApTest Minnesota Inet: shane@aptest.com

Received on Sunday, 1 May 2011 18:21:18 UTC