A rose by any other name is just as thorny...

We have spent a lot of time recently discussing the ways in which we 
could define terms, dereference terms, follow-your-nose to understand 
terms, etc.  In the midst of this discussion I started asking myself a 
question - why do we care what a term means?

Now, don't get your hackles up!  I know that in the fullness of time an 
inference engine that is looking at semantic markup needs to understand 
the meaning of the term.  I still don't believe in the semantic web, but 
I know you guys do and that this is one of its basic tenets.  No.  My 
question is "Why do *WE* care?"

I believe we do not.  And that we never need to.  RDFa is a collection 
of syntactic rules and processing rules that enable a conforming 
processor to extract triples from a conforming document.  That's it.  A 
Conforming RDFa Processor has no need to understand ANYTHING AT ALL 
about those triples!

"So what?"  I hear you saying.  Well, I think this goes to the heart of 
the discussion about 'default vocabulary' specifically, and 
'vocabularies' in general.  There has been a lengthy discussion about 
how to define a vocabulary in such a way that the 'terms' of that 
vocabulary are machine extractable, and there are people who are 
(rightly or wrongly) concerned about the additional overhead that a 
Conforming RDFa Processor might incur in dereferencing and interpreting 
those external vocabularies.  I believe strongly that there is NEVER a 
need to do this.  NEVER.

Here's why.  In RDFa I think we would all agree on the following:

<div xmlns:dbp='http://dbpedia.org/property/'
     about="http://dbpedia.org/resource/Albert_Einstein" rel="dbp:citizenship">
  <span about="http://dbpedia.org/resource/Germany"></span>
  <span about="http://dbpedia.org/resource/United_States"></span>
</div>

This is a valid RDFa snippet that makes definitive statements about 
Albert Einstein and his citizenship.  The relation between those is 
defined as dbp:citizenship.  This is expanded by the conforming RDFa 
Processor to be http://dbpedia.org/property/citizenship .  The processor 
trusts that the term 'citizenship' is defined in the vocabulary 
'http://dbpedia.org/property/'.  But we don't dereference that URI - we 
don't go looking to ensure the term is there.
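
To make that concrete, here is a rough Python sketch of what the expansion amounts to for the snippet above.  The helper name expand_curie and the hard-coded prefix table are my own illustrative assumptions, not anything taken from the RDFa spec, and this is nowhere near a conforming processor.  The point is only that the expansion is plain string concatenation, with no network access anywhere.

```python
# Illustrative sketch only: hypothetical helper, not a real RDFa processor.
# The prefix table mirrors the xmlns:dbp declaration in the snippet above.
PREFIXES = {"dbp": "http://dbpedia.org/property/"}


def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' by concatenation; nothing is dereferenced."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"undeclared prefix in {curie!r}")
    return prefixes[prefix] + reference


subject = "http://dbpedia.org/resource/Albert_Einstein"
predicate = expand_curie("dbp:citizenship", PREFIXES)
objects = [
    "http://dbpedia.org/resource/Germany",
    "http://dbpedia.org/resource/United_States",
]

# One triple per nested 'about' value; the vocabulary document is never fetched.
triples = [(subject, predicate, obj) for obj in objects]
for triple in triples:
    print(triple)
```

Nothing in there ever asks whether 'citizenship' exists at the other end of that URI.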

By the same token, there IS NO NEED for us to examine either our current 
(http://www.w3.org/1999/xhtml/vocab) or any future referenced 
vocabulary to ensure that a term is indeed defined therein - whether 
that vocabulary is being used as the 'default' one or not.  We don't 
care!  In the exact same way that we trust the author to say 
'dbp:citizenship' is a term, we can also trust the author to say <div 
vocab='http://dbpedia.org/property/' rel='citizenship'...>
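
As a sketch of why the two forms are equivalent from the processor's point of view, assuming a 'vocab' attribute behaves the way I am suggesting (it is only a proposal at this point, and both function names below are hypothetical):

```python
# Illustrative sketch only. 'vocab' is the proposed attribute under
# discussion; expand_term and expand_curie are hypothetical helpers.
DBP = "http://dbpedia.org/property/"


def expand_curie(curie, prefixes):
    """Prefixed form: look up the prefix and concatenate the reference."""
    prefix, reference = curie.split(":", 1)
    return prefixes[prefix] + reference


def expand_term(term, vocab):
    """Unprefixed form: concatenate the term onto the declared vocabulary URI."""
    return vocab + term


via_prefix = expand_curie("dbp:citizenship", {"dbp": DBP})
via_vocab = expand_term("citizenship", DBP)

# Both routes produce the same predicate URI, and neither one checks
# whether the vocabulary actually defines 'citizenship'.
print(via_prefix == via_vocab)
```

Either way the processor is taking the author's word for it.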

I do recognize that in RDFa Syntax 1.0 we have rules about unprefixed 
terms and their expansion into triples.  We built some terms into the 
RDFa Recommendation, and those terms constitute the 1.0 'default 
vocabulary'.  However, if you actually look at the vocab document there 
are lots of other terms in there as well!  It is only in the unprefixed 
case where we added some restrictions about the generation of triples.  
In retrospect, I think that was a mistake.  Why?  Because if I were to 
explicitly declare the prefix for that vocabulary and use the prefix 
when referencing terms, there would be no checking of those references 
done by an RDFa Syntax 1.0 conforming processor.  I could say 
rel='xv:foo' and I would get a triple.  Of course I would.
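
Here is a rough sketch of that asymmetry in Python.  It assumes 'xv' has been declared as a prefix for the XHTML vocabulary URI (with the trailing '#'), and the reserved-term set is just a small illustrative subset, not the full list from the Recommendation.  The function name resolve_rel is made up for this example.

```python
# Illustrative sketch only: simplified model of the RDFa 1.0 behaviour
# described above, not a conforming implementation.
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"

# Small illustrative subset of the terms built into RDFa 1.0.
RESERVED_TERMS = {"license", "next", "prev"}


def resolve_rel(value, prefixes):
    """Return the predicate URI for a rel value, or None if no triple results."""
    if ":" in value:
        # Prefixed: expanded blindly, with no check against any term list.
        prefix, reference = value.split(":", 1)
        base = prefixes.get(prefix)
        return None if base is None else base + reference
    # Unprefixed: only the built-in terms generate a triple.
    return XHTML_VOCAB + value if value in RESERVED_TERMS else None


prefixes = {"xv": XHTML_VOCAB}
print(resolve_rel("xv:foo", prefixes))  # prefixed, so a URI comes back
print(resolve_rel("foo", prefixes))     # unprefixed and not built in
print(resolve_rel("next", prefixes))    # unprefixed and built in
```

The same term gets a triple or not depending purely on whether I typed the prefix, which is the inconsistency I am complaining about.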

What's my point?  My point is this.  In a world where we permit the 
declaration of new default vocabulary prefixes, we have no need to ever 
determine the collection of terms that is defined by that vocabulary.  
We should just trust the document.  We are already doing that in every 
other instance anyway.

Thoughts?

-- 
Shane P. McCarron                          Phone: +1 763 786-8160 x120
Managing Director                            Fax: +1 763 786-8180
ApTest Minnesota                            Inet: shane@aptest.com

Received on Thursday, 25 March 2010 19:12:11 UTC