- From: Bernard Vatant <bernard.vatant@mondeca.com>
- Date: Mon, 07 Aug 2006 12:15:52 +0200
- To: Dan Connolly <connolly@w3.org>
- CC: semantic-web@w3.org, Eric van der Vlist <vdv@dyomedea.com>, Franck Cotton <franck.cotton@insee.fr>
Dan,

Out of your exchanges, Eric will certainly come out with a smart and elegant solution for serving something other than 404 pages for the various URIs in the namespace http://rdf.insee.fr/geo/, and (almost) everybody and the TAG will be happy with it. Nevertheless, I would like to go further in this discussion.

Dan Connolly wrote:
> On Fri, 2006-08-04 at 23:53 +0200, Bernard Vatant wrote:
> [...]
>
>> And actually, this should be the general situation in SW publication :
>> there is no authoritative, definitive, complete, description of a
>> resource, packaged in one file, with a single access point.
>
> Yes, there are authoritative descriptions in the Semantic Web.

Yes, there are. But if you look at them, they are exceptions rather than the general rule. When I write the *general situation*, it means what it means: most resources can't have a *single* authoritative description, for various reasons. More below.

> Perhaps not complete definitions, but for a URI such
> as http://www.w3.org/2000/01/rdf-schema#subClassOf
> any document you get back by doing an http GET
> of http://www.w3.org/2000/01/rdf-schema
> is authoritative.

Indeed. That looks like an exceptional case, because the semantics of rdfs:subClassOf are cast in stone in the W3C specification, and everyone doing a GET of http://www.w3.org/2000/01/rdf-schema will get the same authoritative description in saecula saeculorum. Or so you think. Suppose W3C is disbanded some time in the future (no human organisation is forever), and the Web is still there anyway, and the domain w3.org is for sale, and some entity buys it, does not care about whatever has been published before, and puts any silly document at http://www.w3.org/2000/01/rdf-schema. Or some hacker takes control of the W3C servers and publishes Bin Laden statements on this page. Or, more simply, the servers are down, and you get a 404. How will applications know, in any of those cases, that the content is not authoritative any more?
I guess all applications using RDFS will not stop running in any of those cases, because they have cached or built in the semantics of http://www.w3.org/2000/01/rdf-schema#subClassOf. When I use Protégé off-line, it does not stop running just because it can't check the semantics of rdfs:subClassOf at runtime, right?

> That's how the web works: we all agree
> that if you lease/buy a domain name, you get to say what
> the URIs starting with http:// and that domain mean, and
> we agree that if you run a web server and serve up
> documents there, they are authoritative w.r.t. the meanings
> of those URIs.

As long as you have control, yes, but anything can happen. Euclid's Elements were not made obsolete when they burnt with the Library of Alexandria, thanks to many cached copies.

> Anybody else is free to say things about rdfs:subClassOf,
> but the document that W3C serves up at
> http://www.w3.org/2000/01/rdf-schema says that rdfs:subClassOf
> is an rdf:Property, and if some other document
> says that it's not an rdf:Property, that other document should
> be considered in error.

As I will show with the INSEE example, the same publisher, under the same namespace, can produce different descriptions, with different and even conflicting semantics, of the same entity (identified by the same URI). The RDFS specification is a bad counter-example because it defines entities akin to logical and mathematical ones, which are defined by time-independent axioms; it does not pretend to represent real-world entities like cities, products or people, whose main characteristic is to be both permanent and changing, like the Ship of Theseus.
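The point above about applications relying on cached or built-in semantics, rather than dereferencing the namespace at runtime, can be sketched as a local-first lookup. This is only an illustration: the names `lookup_semantics`, `local_copies` and `fetch` are hypothetical and do not come from Protégé or any RDF library.

```python
from typing import Callable, Dict, Optional


def lookup_semantics(
    uri: str,
    local_copies: Dict[str, str],
    fetch: Optional[Callable[[str], str]] = None,
) -> str:
    """Return a description for `uri`, preferring the copy already held locally."""
    # A built-in or cached copy wins: the application keeps working even if
    # the server is down or now serves something unrelated.
    if uri in local_copies:
        return local_copies[uri]
    if fetch is None:
        raise LookupError(f"no local copy of {uri} and no way to fetch one")
    # Only dereference the URI when nothing is known locally, then keep the result.
    text = fetch(uri)
    local_copies[uri] = text
    return text
```

With a local copy of the RDFS namespace document in `local_copies`, the lookup never touches the network, which is the off-line behaviour described above.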
>> So, the best a URI can do, when its referent is not an accessible
>> thing, and its main purpose is identifying the resource in
>> distributed descriptions, if one wants to make sense of it through
>> the http protocol - since it's an http URI after all - is to get access
>> to some information like : "Sorry, what you try to access by this URI is
>> not an accessible resource. But its description can be found in RDF
>> files X, Y, Z, ...".
>
> That's not the best we can do.
> If you use URIs of the form DOC#TERM for non-information
> resources, then the information resource DOC can
> say things like { <#TERM> rdf:type geo:City }.

I think you miss the most important point here. It's not yet another hash-vs-slash discussion. Be it one document or a fragment, the point is that I can't have a *single consistent* description of the resource.

>> And the more I think about it, the more I think that the 404 page
>> that you get through http://rdf.insee.fr/geo/COM_80078 is close to
>> that. Agreed, the current message displayed on the page is suboptimal,
>> independently of the fact that it is in French, but replace it by the
>> quote I suggest above, and it makes much more sense than any fragment
>> identifier.
>
> Really? It doesn't appeal to me at all.

Well, the point is not to be sexy here, but to conform to what happens in the real world. Inaccessible means inaccessible. Full stop.

>> Maybe in contradiction with what I wrote in a previous message, where
>> I suggested that maybe we could have kept the # namespace for the
>> ontology, I think now that this argument holds for ontology elements
>> as well. Granted, we have now published a single ontology file
>> containing a description of e.g., http://rdf.insee.fr/geo/Commune. But
>> next year we can have another version, or another ontology defining
>> the same entity, with the same URI, at another level of detail, and
>> which the publisher would not like to see merged with the previous
>> one.
>
> Hmm... I can't imagine why not. Care to elaborate?

You know, we have something in Europe called History. It means that things change over time. I heard you have something of the like on your side of the ocean, is that correct?

Take for example the class http://rdf.insee.fr/geo/Region. See http://fr.wikipedia.org/wiki/R%C3%A9gion_fran%C3%A7aise to see that the concept of "Région" in its current definition, as an administrative territory, was defined through a long and difficult process (including the resignation of Charles de Gaulle from the Presidency in 1969), whose formal completion is quite recent (1982). If INSEE had published a geo ontology in the 60's, this class would not have been included, nor the subdivision of regions into departments. But departments were there: most of them were defined by Napoleon about 200 years ago, some of them have changed names over time (Seine-Inférieure became Seine-Maritime), some have been split, like Corsica, which was split into two departments in 1976, etc.

So, a description which was authoritative in 1960 may no longer be so in 2006. That's why INSEE will publish RDF files with a time stamp; had it published an ontology and instances back in 1960, some entities would have different - and when I write different I mean not mutually consistent - descriptions. So which description is authoritative, 1960's or 2006's? Both, because I can be interested in today's information, or be dealing with 1960 documents. And, as Eric pointed out, I don't want to have new URIs in each new publication for the entities that are permanent.

BTW, what would you recommend to capture the information that a triple which was valid in 1960 is no longer so in 2006? Is there a way to put a validity time span on an RDF description, apart from reification?

>> There again, packaging considerations naturally lead to defining
>> several files containing partial descriptions of the same resource.
>
> It seems very unnatural to me to use anything other than a single
> static file for the case of an ontology with just a few dozen terms.
> Maybe a handful of content-negotiated static files. But not more than
> that.

Just a few dozen terms, yes. But with semantics not so "static" as you would like them to be. geo:Region is not rdfs:subClassOf.

The real world apologizes for being so messy, changing and unstable :-)

Bernard.
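On the question raised above of attaching a validity time span to a description without reifying each triple, one possible approach is to keep every time-stamped publication as its own bundle of triples and attach the dates to the bundle. The sketch below is plain Python, not an RDF library, and is only one way to model it. The class `DatedDescription`, the function `describe`, the URI for Corsica, the `ex:` predicates, the `geo:Departement` class name and the exact boundary dates are all hypothetical; the message itself only says that Corsica was split in 1976.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class DatedDescription:
    """A bundle of triples plus the period for which the publisher asserts them."""
    valid_from: date
    valid_until: Optional[date]  # None means "no end date known"
    triples: FrozenSet[Triple]

    def covers(self, day: date) -> bool:
        # Half-open interval: valid_from <= day < valid_until.
        return self.valid_from <= day and (
            self.valid_until is None or day < self.valid_until
        )


def describe(
    descriptions: Iterable[DatedDescription], subject: str, day: date
) -> Set[Triple]:
    """Collect the triples about `subject` from every description valid on `day`."""
    return {
        triple
        for description in descriptions
        if description.covers(day)
        for triple in description.triples
        if triple[0] == subject
    }


# Hypothetical data: URI, predicates and boundary dates are illustrative only.
CORSE = "http://rdf.insee.fr/geo/DEP_CORSE"
DESCRIPTIONS = [
    DatedDescription(
        date(1960, 1, 1), date(1976, 1, 1),
        frozenset({(CORSE, "rdf:type", "geo:Departement")}),
    ),
    DatedDescription(
        date(1976, 1, 1), None,
        frozenset({
            (CORSE, "ex:splitInto", "ex:Corse-du-Sud"),
            (CORSE, "ex:splitInto", "ex:Haute-Corse"),
        }),
    ),
]
```

Calling `describe(DESCRIPTIONS, CORSE, date(1960, 6, 1))` and then the same call with `date(2006, 8, 7)` returns two different, mutually inconsistent answers for the same URI, and neither bundle has to be merged with the other. The dates sit on the bundle rather than on individual triples, which is why no per-triple reification is needed in this model.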
Received on Monday, 7 August 2006 10:16:20 UTC