- From: Dan Brickley <danbri@danbri.org>
- Date: Sat, 29 Sep 2007 01:04:13 +0200
- To: Pat Hayes <phayes@ihmc.us>
- CC: Dan Connolly <connolly@w3.org>, Tim Berners-Lee <timbl@w3.org>, Technical Architecture Group WG <www-tag@w3.org>, Susie Stephens <susie.stephens@gmail.com>
Hi folks,

Short version: I didn't (and don't) consider classes and properties "non-information" resources. I don't believe there's a useful and consensual definition of "information resource" that classes and properties won't squeeze into. "/" vs "#" was a choice between two bad options; each had scary issues. Other problems were more important, so I decided and moved on.

Pat Hayes wrote:
>
>> On Thu, 2007-09-27 at 15:13 -0500, Pat Hayes wrote:
>> [...]
>>> >Can a city be an HTTP endpoint?
>>> >How about a physical book?
>>> >a robot?
>>> >an integer?
>>> >a set of integers?
>>> >an RDF Class?
>>> >an RDF property?
>>> >an XML namespace?
>>> >a language (such as XML Schema)?
>>> >
>>> >Those are the practical questions that I see the community
>>> >working on.
>>>
>>> Surely most of these answers are blindingly obvious.
>>
>> The long history of the httpRange-14 issue suggests the
>> answers are anything _but_ obvious.

For me, the interesting question is: are these important distinctions to be trying to make? What breaks if we sweep this debate under the rug?

>>> Integer, set (of anything), class, property, namespace, language:
>>> all obviously, necessarily, not, as these aren't physical entities.
>>
>> That wasn't obvious to the dublin core nor FOAF designers;
>> they chose hash-less http URIs for dc:title and foaf:name and such,
>> and they used to give 200 responses to GET requests there for years,
>> despite TimBL's protests.
>
> Well, but wait. Did they do this because they thought these WERE
> information resources or HTTP endpoints, or because (contra or
> pre-http-range-14) they thought it was fine to give a 200 code back
> from a URI denoting a non-information resource? I suspect the latter.
> I felt this way myself until quite recently (cf
> http://www.ihmc.us/users/phayes/PatHayes.html) and I still get the
> occasional referential quiver.

Glad you asked. Here's what happened.
In early 2000 I had to pick a URI to put in those xmlns:foaf="http://blahblah" places. And it had to end in some character or other, obviously. This was pre-TAG. It was in the era of drift when W3C had blessed the evocative but vague RDF'99 Model and Syntax spec as a standard, then left things resting for a while as the dot-com world went crazy for XML. RDFCore didn't exist. Many things were uncertain. And RDF was more or less unused outside of Netscape/Mozilla, Dublin Core and the RSS 1.0 drafts.

At the time I was bored and worried by long rambling threads on the RDF lists about reification, anonymous nodes, debates about whether people were resources, how many angels would fit on the head of an http server, what the "real" URI of an anonymous node should be, what "reification" really meant, etc. (Some things change, some stay the same :)

So I thought, okay, let's see how this stuff bears up outside the lab. Can we put descriptions of real existing people in the public Web, link them together, and crawl them back into a usable database? Can we figure out which people they're describing, despite the lack of globally adopted well-known identifier mechanisms for humans (no urn:person:blah blah)? Can we use PGP etc to get some more assurance of which people were behind which RDF statements? Can we make a linked Web of machine-readable pages just like we have one for humans...

And it turned out we could pretty much do that. And by doing so, certain weaknesses in the RDF toolset and specs became clear: RDF databases at the time tended to simply store triples, and threw away information about where those triples had come from. People like Edd Dumbill who were crawling RDF/FOAF data at the time had to hack provenance/source mechanisms themselves - e.g. see his writeup at http://www.ibm.com/developerworks/xml/library/x-rdfprov.html ... changes which eventually got reflected into Redland's core, and which provide use cases for things like SPARQL's 'GRAPH' construct.
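The provenance gap described above is easy to see in miniature: a plain triple store forgets which document each triple came from, while a quad store keeps it. A toy sketch in plain Python (all names and URIs here are hypothetical; real systems like Redland or SPARQL's GRAPH do the same thing with far more machinery):

```python
# Toy contrast of a triple store (source lost) with a quad store
# (source kept), illustrating the provenance problem described above.

class TripleStore:
    """Stores bare (subject, predicate, object) triples; origin is lost."""
    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))


class QuadStore:
    """Stores (source, s, p, o) quads, so every fact keeps its origin."""
    def __init__(self):
        self.quads = set()

    def add(self, source, s, p, o):
        self.quads.add((source, s, p, o))

    def triples_from(self, source):
        """All triples asserted by one crawled document."""
        return {(s, p, o) for (src, s, p, o) in self.quads if src == source}


q = QuadStore()
q.add("http://example.org/edd.rdf", "_:edd", "foaf:name", "Edd Dumbill")
q.add("http://example.org/dan.rdf", "_:dan", "foaf:name", "Dan Brickley")
print(q.triples_from("http://example.org/edd.rdf"))
```

With a plain TripleStore the two documents' statements would be indistinguishable after crawling; the quad layout is what makes "who said this?" answerable later.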
These strike me as useful areas to explore. In that context, choosing the final character of the URIs that named our classes and properties was the least of our worries. It was my choice, and I chose what at the time seemed the lesser of two evils. Or rather, of two uncertainties.

The leading contenders for last-character-in-uri were "/" and "#". My reading of the relevant URI specs at the time made me worried about using "#", because its meaning/interpretation was relative to a media type, and the Web architecture encourages content negotiation of document formats. I wanted to make both human and machine documentation available at the namespace URI, so that seemed a rather unfortunate interaction (particularly because RDFa didn't exist yet).

So I went for "/"-based names for classes and properties, i.e. as names for things like foaf:Person, foaf:mbox_sha1sum. To me, such classes and properties are not abstract mathematical sets (although they are closely related to the maths). Someone else might define another Person class with the exact same instances. But that's someone else's work, ... i.e. a separate thing. It might even have a different rdfs:label and rdfs:comment.

To me, classes and properties in RDF are more or less "works", analogous to a book, poem or song. Some people join orchestras or theatres; others write computer code and ontologies. And the Web architecture - it seemed to me then as now - allows works (such as my homepage, another of my works) to be named or identified with URIs that begin "http://" and end in "/". Before 2000, the thing called http://xmlns.com/foaf/Person hadn't been created. Just the same with homepages etc. So at the time it didn't seem particularly contentious to treat classes and properties as just more kinds of resource that might have Web accessible representations (like Hamlet and the Bible). So I used "/" and life went on.
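One mechanical consequence of the "#" option, worth spelling out: a fragment is stripped client-side before the HTTP request is made, and its interpretation then depends on the media type of whatever document comes back. Python's standard library shows the split (the example.org URI is illustrative):

```python
from urllib.parse import urldefrag

# Hash-style term URI: the term itself lives in the fragment.
uri, frag = urldefrag("http://example.org/foaf#Person")
print(uri)   # the only part the server ever sees
print(frag)  # resolved client-side, relative to the returned media type

# Slash-style term URI: the whole term URI reaches the server,
# which can then respond (or content-negotiate) per term.
print(urldefrag("http://xmlns.com/foaf/0.1/Person")[0])
```

So with "#" the server never sees which term was asked for, and what the fragment denotes shifts with the negotiated format - exactly the interaction described above.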
I find it incredibly embarrassing that we are still discussing, all these years later, whether (in effect) there is an important distinction between the Bible and dc:description such that one work can have URIs beginning "http://" and ending "/" while the other can't. There are so many different ways of carving up the world into categories, each with merits and flaws. Why on earth should we hard-code one such arbitrary distinction right into the core of the Web architecture?

So I welcomed the resolution of httpRange-14 as a way of putting this behind us, and adding a few lines of sysadmin voodoo into an Apache HTTP config file was a small price to pay. But I really don't think it should be the business of W3C to try to divide the world into two crisply defined categories and police rules for the spelling of their respective URIs. There are a great many important practical problems out there to address, problems which no organisation but W3C could solve. And this ain't one of them...

cheers,

Dan

--
http://danbri.org/

ps. re DOLCE, we weren't oblivious to such richer models, but the underlying languages of the Semantic Web from W3C simply didn't provide the primitives I wanted for treating time and change properly. Hence the brief prose attempt in http://xmlns.com/foaf/spec/#term_mbox to define "static inverse functional property" as one in which "there is (across time and change) at most one individual that ever has any particular value for foaf:mbox.". The weakness of DAML and later OWL in this regard means that the semantics of OWL alone don't really justify all the inferences we need when merging information using reference-by-description techniques. Trying to model time and change very formally on top of vanilla RDFS/OWL isn't something I'm yet convinced is useful or practical. Would be happy to be proved wrong :)
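For concreteness, the "sysadmin voodoo" the httpRange-14 resolution asks for amounts to answering GETs on class/property URIs with a 303 See Other redirect to documentation (an information resource). A sketch of the kind of Apache configuration involved - the paths and target are illustrative, not FOAF's actual setup; mod_alias's `seeother` keyword issues the 303:

```apache
# Redirect term URIs (the "things") to their documentation with
# 303 See Other, per the httpRange-14 resolution. Paths illustrative.
Redirect seeother /foaf/0.1/ http://xmlns.com/foaf/spec/
```

A couple of lines like this per vocabulary is the whole cost being weighed in the paragraph above.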
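The "reference-by-description" merging the ps alludes to can be sketched concretely: if foaf:mbox_sha1sum is read as a (static) inverse functional property, two descriptions sharing a value for it describe one individual and can be merged ("smushed"). A toy Python sketch, with hypothetical data and none of the OWL reasoning or time/change subtleties discussed above:

```python
# Toy "smushing": merge partial descriptions that share a value for an
# inverse functional property (here foaf:mbox_sha1sum). Hypothetical
# data; a real system needs OWL-style reasoning plus the "static"
# (across time and change) reading discussed in the ps.

def smush(records, ifp="foaf:mbox_sha1sum"):
    """Merge property dicts that agree on the inverse functional property."""
    merged = {}  # ifp value -> combined description
    for rec in records:
        key = rec.get(ifp)
        merged.setdefault(key, {}).update(rec)
    return list(merged.values())

crawled = [
    {"foaf:mbox_sha1sum": "ab12", "foaf:name": "Dan Brickley"},
    {"foaf:mbox_sha1sum": "ab12", "foaf:homepage": "http://danbri.org/"},
]
people = smush(crawled)
print(people)  # one combined record for the shared mbox_sha1sum
```

The inference step ("same mbox_sha1sum, therefore same person") is exactly what plain OWL inverse-functional semantics licenses only at a single moment, and what the "static" qualifier tries to extend across time.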
Received on Friday, 28 September 2007 23:04:25 UTC