Re: CURIEorURI Value Space Collisions from Ivan Herman on 2011-05-02 (public-rdfa-wg@w3.org from May 2011)

From: Ivan Herman <ivan@w3.org>
Date: Mon, 2 May 2011 11:53:23 +0200
To: Shane McCarron <shane@aptest.com>
Cc: public-rdfa-wg@w3.org
Message-Id: <0C7B65B2-4B9E-4303-96B0-FAA8F3F054C3@w3.org>
Shane,

first of all, yes, that is what I meant...

I agree you have a point in that specific case. Although... I do think there is another situation in which a warning could be generated based on the list of schemes, but that one is different. Indeed, if the RDFa content uses a pr:lala, and there is no prefix definition for 'pr', the result is a URI. However, if 'pr' does _not_ appear in the list of registered schemes, there is a very high probability that this is either a misspelling of a prefix or that the author has forgotten to define that particular prefix. I believe issuing a warning in that case is useful.
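
For illustration only, the check described above could be sketched roughly as follows in Python. The function name, the shape of the prefix map, and the tiny scheme list are all assumptions made for this sketch; a real processor would consult the full IANA registry and follow the RDFa Core 1.1 processing rules, which this does not attempt to reproduce.

```python
# Illustrative subset only (assumption): a real processor would carry
# the complete list of registered URI schemes.
KNOWN_SCHEMES = {"http", "https", "mailto", "ftp", "sip", "urn", "tel"}


def resolve_curie_or_uri(value, prefixes):
    """Return (uri, warnings) for a CURIEorURI-style value.

    `prefixes` is a hypothetical dict mapping declared prefix -> namespace URI.
    """
    warnings = []
    head, sep, tail = value.partition(":")

    # No colon, empty prefix, or blank node ("_:x"): out of scope for this sketch.
    if not sep or head in ("", "_"):
        return value, warnings

    # A declared prefix takes precedence: expand the CURIE.
    if head in prefixes:
        return prefixes[head] + tail, warnings

    # Not a declared prefix, so the value is treated as a URI.
    # Scheme names are case-insensitive, hence the lower().
    if head.lower() not in KNOWN_SCHEMES:
        warnings.append(
            f"'{head}' is neither a declared prefix nor a known URI scheme; "
            "possibly a misspelled or undeclared prefix"
        )
    return value, warnings
```

With no declaration for 'pr', `resolve_curie_or_uri("pr:lala", {})` returns the value unchanged as a URI together with one warning, while `resolve_curie_or_uri("sip:shane@aptest.com", {})` returns it with no warning at all.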

But, as I said, this is not the case I was referring to, where a prefix is defined for an existing scheme. You have a point for that one, so I am less sure this is necessary or useful... 
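
To make the difference between the two cases concrete, here is a rough Python sketch of the collision case, where a declared prefix happens to match a known scheme. The helper names and the short scheme list are assumptions for illustration, not taken from any implementation; the example namespace is the one used in the quoted message below.

```python
# Illustrative subset only (assumption), standing in for the registered schemes.
KNOWN_SCHEMES = {"http", "https", "mailto", "ftp", "sip", "urn", "tel"}


def collision_warnings(prefixes, known_schemes=KNOWN_SCHEMES):
    """List a warning for every declared prefix that matches a known scheme."""
    return [
        f"prefix '{p}' collides with the URI scheme '{p.lower()}'"
        for p in sorted(prefixes)
        if p.lower() in known_schemes
    ]


def expand(value, prefixes):
    """Expand a CURIE; a declared prefix always wins over a scheme."""
    head, sep, tail = value.partition(":")
    if sep and head in prefixes:
        return prefixes[head] + tail
    return value


prefixes = {
    "sip": "http://some.example.com/resources/",
    "v": "http://example.org/v#",  # hypothetical namespace for the sketch
}

before = collision_warnings(prefixes)
# Pretend that 'v' is registered as a scheme at some later date.
after = collision_warnings(prefixes, KNOWN_SCHEMES | {"v"})
```

Here `before` holds one warning (for 'sip') and `after` holds two, yet `expand("sip:lala", prefixes)` yields http://some.example.com/resources/lala in both situations: the set of warnings grows as schemes are registered, but the generated graph does not change.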

That being said, these are all warnings and not errors. In that sense, they are not really harmful, so to speak.

Ivan

On May 1, 2011, at 20:20 , Shane McCarron wrote:

> Ivan,
> 
> Part of your reply confused the heck out of me:
> 
> On 5/1/2011 11:46 AM, Ivan Herman wrote:
>> ....
>> 
>> Having an implementation store the list of registered prefixes (and _not_ shrink, because no scheme goes out of definition), issue a warning when a prefix collides with a prefix works for me. Because a prefix definition takes precedence over the uri scheme (say, your wxg prefix), an RDFa content produced today using that prefix remains valid and produces the same RDF graph even if at some point wxg becomes a URI scheme. If at that point somebody wants to use the wxg scheme, then in newly produced RDFa content another prefix will have to be chosen, but that does not seem to create huge problems...
>> 
> 
> I *think* what you mean is that it needs to store a list of registered SCHEMES, and issue a warning when a declared CURIE PREFIX collides with one of those known, registered SCHEMES.
> 
> Assuming that is what you mean....  I think it is a waste of processing and a bad idea.  But, if the working group decides to require it, I won't object.
> 
> Why do I think it is a waste?  Because a collision does no harm.  CURIEs are only interpreted in RDFa attributes (and @rel and @rev, but those are linked data attributes anyway).  If I define a prefix that collides with a scheme.... let's say 'sip' for argument, then I might have @resource='sip:lala'.  But @resource is ONLY interpreted by an RDFa processor.  If there is also an @href='sip:shane@aptest.com' somewhere in the document, that won't be interpreted as a CURIE at all.  It is a URI.  I admit that there is the potential for CONFUSION here - in that an author might THINK that this is supposed to work, and that @href='sip:lala' is supposed to be translated into http://some.example.com/resources/lala - but it's not, it never has been, and the changes we made to RDFa Core 1.1 don't make this any worse (IMHO).  RDFa 1.0 worked this way too.
> 
> Why do I think it is a bad idea?  Because I think it has the potential for future needless confusion.  Let's say I author RDFa content today, and in that content I use the prefix 'v' (because Google tells me I have to).  That's great - it collides with nothing, everything works. Yay!  Fast forward 10 years.  Some bright spark has decided to save people keystrokes in authoring by minting a new scheme 'v' that means 'Verified Web Site' - so you can be confident that URIs that start with v: only connect to sites that are repeatedly verified by some collection of authorities as being virus free. (Great idea - I think I will start writing that RFC now!)  Anyway - suddenly my document that generated no warnings starts generating warnings as RDFa implementations are updated.  And why?  Not because there is any actual risk that a URI will be misinterpreted.  But because an embedded linked data pointer could now be interpreted *BY A HUMAN* as being a legitimate URI.  It would never be misinterpreted by an RDFa Processor - *only by a human*.  If that's the problem we are trying to solve, then I am sorry but I think it's just silly.    Humans misinterpret things all the time - we can't solve that problem.
> 
> Moreover, the CURIE prefix definitions and mappings don't survive interpretation.  They are ONLY present in the source document.  Once it is interpreted, they are expanded to full URIs.  So they are ephemeral AND cannot be misinterpreted by an RDFa processor.  The semantic web is safe.  User agents are safe in that they continue to work as they always have.  Authors are safe in that if they use schemes in URIs there is no way they can be misinterpreted.  Again, what am I missing here?
> 
> 
> -- 
> Shane P. McCarron                          Phone: +1 763 786-8160 x120
> Managing Director                            Fax: +1 763 786-8180
> ApTest Minnesota                            Inet: shane@aptest.com
> 
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Monday, 2 May 2011 09:52:09 UTC