- From: Dan Brickley <danbri@danbri.org>
- Date: Fri, 22 May 2009 20:49:51 +0200
- To: Pat Hayes <phayes@ihmc.us>
- CC: David Booth <david@dbooth.org>, Hugh Glaser <hg@ecs.soton.ac.uk>, semantic-web <semantic-web@w3.org>, Linked Data community <public-lod@w3.org>
On 22/5/09 19:47, Pat Hayes wrote:
>> Yes, that's a great topic for discussion. It is clear that semantic
>> drift is a natural part of natural language: a word that meant one thing
>> years ago may mean something quite different now.
>
> And the same is happening with URIs. My favorite example is dc:author,
> which when coined was intended to refer to the relation of authorship
> between people and things like books, things that would be found in a
> library catalog.

May I spoil your example? dc:author doesn't exist. Never has. Well, very early in Dublin Core history we had "dc:author". Since 1996 or so it has been "creator". This was because of the early workshop

http://dublincore.org/workshops/dc3/
Workshop on Metadata for Networked Images (Sept 1996), where it was realised that DC was useful for images with only modest changes.

"The CNI/OCLC Image Metadata Workshop focused on the use of the Dublin Core (DC) to describe images. Consensus formed around the assertion that, with some modifications of element names and definition, the Dublin Core would serve quite adequately for description of a large class of image resources, particularly those that share characteristics with the document-like objects that were the original focus of DC."
http://www.dlib.org/dlib/june97/metadata/06weibel.html

Cultural heritage, museums etc., applications of DC showed up around the same time. It has been clear for a very long time that dc:creator applies to anything that can be created. DC isn't very discriminating.

> But by now, thanks to FOAF, the overwhelmingly largest
> usage of dc:author is to state the relationship between a person and
> their FOAF home page.

Even reading "dc:creator" there, I'm doubtful. Well, it depends on your measure. Perhaps every user on livejournal.com has this markup. Which makes for millions of documents and triples, but ... this could also be changed with a single line of Perl code being updated.
A couple of online library catalogues could probably balance all this, or chuck in a museum or two. It's hard to know what kinds of thing to count in these comparisons: triples, documents, consuming apps, producing apps, projects, etc.

But anyway, having spoiled your example, may I offer a new one in its place? foaf:schoolHomepage. This is a property originally created by Brits, for whom School is where you go until you're at most 18. After which it's off to University, College, Tescos, or whatever YTS schemes are called these days.

*However* ... shortly after deploying foaf:schoolHomepage, it became clear that it meant something quite different to USAmericans and presumably others. We started seeing instance data where people were asserting foaf:schoolHomepage between themselves and the homepage of their University. This was unexpected, but not really surprising. Being a pragmatist, I updated http://xmlns.com/foaf/spec/#term_schoolHomepage ... It now mentions this drift explicitly:

"The original application area for foaf:schoolHomepage was for 'schools' in the British-English sense; however American-English usage has dominated, and it is now perfectly reasonable to describe Universities, Colleges and post-graduate study using foaf:schoolHomepage."

> This is a real social meaning shift, and it
> happened without anyone really noticing and without anything breaking or
> failing to work.

For the DC case, (a) I think the FOAF usage is within the broad and naturally scruffy meaning of dc:creator. Some specific issues and problems were very much noticed, mostly to do with the confusing range of dc:creator (string or thing or Seq, etc), but this wasn't FOAF-specific.

For the schoolHomepage case, yes the shift was natural and normal, although it was noticed and the documentation eventually caught up with the world. Just as it works with dictionaries.
Relatedly, the abstract for the FOAF spec calls out the dictionary analogy: "This specification describes the FOAF language, defined as a dictionary of named properties and classes using W3C's RDF technology."

> If the original DC specs had posted a detailed
> 'authoritative' ontology, the change would still have happened and it
> would still have worked, but there would have been interminable debates
> about whether a home page was really a "work" (or whatever the term that
> was used), suggestions that FOAF use a different URI, etc., etc.,, all
> to absolutely no purpose.

Yeah, same with schoolHomepage. We could have had a School, University or Educational Institution class in there from the start, but defining exactly what counts as a University is somewhat fiddly. Do Polytechnics count as Universities? What about the schools and organizations run by Scientology, etc? (They don't call it the pedantic Web for nothing...)

> Just look at the interminable and utterly
> pointless debate now raging about exactly what an 'information resource'
> *really is*, none of which has any bearing whatsoever on how the actual
> Web works, even though the latter is actually constructed almost
> entirely out of the former.
>
>> As humans we can
>> usually deal with this semantic drift by knowing the context in which a
>> word is used, though it can cause real life misunderstandings sometimes.
>>
>> However, I think our use of URIs in RDF is different from our use of
>> words in natural language, in two important ways:
>>
>> - RDF is designed for machine processing -- not just human
>> communication -- and machines are not so good at understanding context
>> and resolving ambiguity; and
>>
>> - with URI declarations there is a simple, feasible, low-cost mechanism
>> available that can be used to anchor the semantics of a URI.
>
> But that begs the question of whether you want them to be anchored.
> I suggest that we often don't: that letting them 'drift' in meaning to fit
> their usage is exactly what we want to be happening.
>
>>
>> In short, although semantic web architecture could be designed to permit
>> unrestricted semantic drift, I think it is a better design -- better
>> serving the semantic web community as a whole -- to adopt an
>> architecture that permits the semantics of each URI to be anchored, by
>> use of a URI declaration.
>
> And I disagree.

Seconded. But perhaps for different reasons. We need to leave some flexibility in the system so that the most useful uses of classes and properties can emerge from experimentation and deployment.

> I think this whole idea is based on the insistence of
> various authoritative sources upon the naive idea that URIs have to
> "identify" things. This has never been the case, in fact, even in the
> pre-Semantic web, and its even less the case now. Its a chimera: forget
> about it, rather than try to enforce it. What URIs do is fetch chunks of
> information. Hardly anyone using the normal Web in the normal way gives
> a damn what "thing" their URIs "identify": they only care about what
> they are looking at, which is whatever that "thing" sent back to them in
> the body of the 200 response, and what that means or what it can do. The
> very design of html is all about *hiding* the URIs from users, not about
> telling them what it is that URIs identify.

The "URIs are identifiers" story is a convenient enough fiction but one for engineers not end users. Trying to nail down what exactly it means for some symbol to name some thing (or identify that thing) is equally doomed online and off. I'm not going to hold my breath waiting for a realist theory of reference to succeed in the cognitive sciences (by which I mean an account for how words or mental gubbins truly come to "refer"). And if I'm not going to wait there, I'm not going to wait w.r.t. Web specs either.
People get by assuming that there is some fact of the matter about how it all works, even if there isn't.

But all that said, the rough idea that URIs are names for things is useful enough and is all we need. We just shouldn't poke into the details too much 'cos the whole story will unravel...

cheers,

Dan
Received on Friday, 22 May 2009 18:50:35 UTC