Dan

Just a few additional remarks to drive the point home. (I am not cc'ing xg-geo because I think we are dealing with a more general SW issue here.)
Not only do I fully agree with everything Eric has answered below, but I think it is a very good illustration of the distinction between reference and access, as Pat Hayes clearly expressed in his "In Defence of Ambiguity" paper [1]. I don't know if Pat is lurking on this thread; I hope our use case illustrates his viewpoint correctly.
An instance of geo:Territoire is supposed to reference some "real world object". Call it a "non-information resource" if you like. Every description (RDF or otherwise) of its referent (resource, if you like) is partial; each one represents some viewpoint, some facet of it. The fact that we have packaged the RDF so that parts of a city's description are distributed across different files is a deliberate feature, designed from the publisher's knowledge of customers' needs. Customers and applications need descriptions at various levels of detail and at various moments in time; hence the versioning issue mentioned by Eric.
And actually, this should be the general situation in SW publication: there is no authoritative, definitive, complete description of a resource, packaged in one file, with a single access point. So when a URI's referent is not an accessible thing, and the URI's main purpose is to identify the resource in distributed descriptions, the best it can do over the http protocol (it is an http URI, after all) is to give access to some information like: "Sorry, what you are trying to access through this URI is not an accessible resource. But its description can be found in RDF files X, Y, Z, ...". And the more I think about it, the more I think that the 404 page you get at http://rdf.insee.fr/geo/COM_80078 is close to that. Agreed, the message currently displayed on the page is suboptimal, independently of the fact that it is in French, but replace it with the quote I suggest above, and it makes much more sense than any fragment identifier.

Maybe in contradiction with what I wrote in a previous message, where I suggested that we could have kept the # namespace for the ontology, I now think that this argument holds for ontology elements as well. Granted, we have now published a single ontology file containing a description of, e.g., http://rdf.insee.fr/geo/Commune. But next year we may have another version, or another ontology defining the same entity, with the same URI, at another level of detail, which the publisher would not want to see merged with the previous one. There again, packaging considerations naturally lead to defining several files containing partial descriptions of the same resource.

I'm well aware that this is highly controversial and certainly not in tune with the TAG recommendation on the httpRange-14 issue.

Bernard

[1] http://www.ibiblio.org/hhalpin/irw2006/presentations/HayesSlides.pdf

Eric van der Vlist wrote:

Dan,

On Friday 4 August 2006 at 09:32 -0500, Dan Connolly wrote:

On Fri, 2006-08-04 at 09:28 +0200, Eric van der Vlist wrote:

Hi,

Hi Eric,

On Thursday 3 August 2006 at 23:26 +0200, Bernard Vatant wrote:

Dan

did you consider using # rather than /? i.e.
  http://rdf.insee.fr/geo#code_commune
rather than
  http://rdf.insee.fr/geo/code_commune
especially for ontologies, it's a lot easier to manage.

We did consider it. Actually my first version of the ontology used a #
namespace. Eric (in cc) was the one who suggested a / namespace,
especially for the data, and somehow convinced the rest of us. That was
six months ago, but if I remember correctly, the idea was that at some
point, each instance URI would be (should be, hopefully will be)
associated with, and give access to, a separate resource, which is not
the case now.

Yes, that was the first comment I made on your first proposal at the end
of January.

The idea was that to identify a city, http://rdf.insee.fr/geo/COM_80078
is better than http://rdf.insee.fr/geo#COM_80078.

You might also consider http://rdf.insee.fr/geo/COM_80078#city for
the city itself and http://rdf.insee.fr/geo/COM_80078 for a document
about the city.

If the cities come in natural chunks, perhaps
http://rdf.insee.fr/geo/COM_800#city78
for the city and http://rdf.insee.fr/geo/COM_800 for a document about
the cities in some region.


You mean that we should use the same URI to identify geographical
entities and locate the fragment where they are defined?

We have rejected this idea for a number of reasons. I think that the
most important of these reasons is that it would assume that the entity
is described at only one location in only one RDF document and that's
not true in our case.

If you take an entity such as a city, this entity can be located over
two higher level entities and its description is then split between the
different higher level entities to which it belongs.

Even when a city belongs to only one higher level entity, important
pieces of its description can be found in the descriptions of the
different layers of higher level entities, and the description of an
entity such as a department is spread over four different documents.

We also think that splitting entities into RDF documents is a packaging
issue that may evolve over time and shouldn't impact the URIs
identifying the entities.

Furthermore, we believe that hard coding the links between entities
identifiers and RDF documents would make the version management of these
documents more complex. We have included a year in the URIs for the RDF
documents so that we can easily publish new versions and keep the
previous one (an "old" version carries valid information about the
ontology for a specific date and we think that it should remain online).
And of course, we wouldn't want the URIs identifying the entities to
change over time.

Of course, these URIs
are only identifiers, but who knows, we might want some day to publish
some kind of documentation (like we do in RDDL to document namespaces)
at these URIs.

"only identifiers"? sigh. I got the impression you wanted to publish
information about them in the Semantic Web.


This is semantic information that conforms to the W3C recommendations
and is published on the World Wide Web. Isn't that sufficient to be part
of the Semantic Web?

If we do so, the first URI makes each city a standalone entity while the
second one means that they need to be fragments in a huge document which
can cause a lot of issues (we don't know which media types we might want
to publish and the definition of fragments is inconsistent between media
types

It's within your control to choose media types where the definition
of fragments is consistent. The easiest way is to just use one
media type: application/rdf+xml .


What we have in mind for these URIs isn't necessarily limited to RDF but
could include XHTML documentation or other kinds of resources. Both RDF
and XHTML can be published at the same location using content
negotiation... What I meant by being inconsistent between media types is
that if you use content negotiation, you need to make sure that each
representation has the same fragments, which is a further complication.

BTW, if we ever serve RDF at these addresses, I guess that it would be
a kind of placeholder with seeAlso attributes pointing to the different
documents in which an entity is described, rather than the actual
definition of the entity.
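
A minimal sketch of what such a placeholder could contain, assuming it only lists rdfs:seeAlso links from the entity URI to the documents describing it. The two document URIs under example.org and the helper name placeholder_triples are hypothetical, made up for illustration; they are not the real INSEE file names.

```python
# rdfs:seeAlso is the standard RDF Schema property for "more information here".
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def placeholder_triples(entity_uri, document_uris):
    """Return N-Triples lines linking an entity to each document describing it."""
    return [
        f"<{entity_uri}> <{RDFS_SEE_ALSO}> <{doc}> ."
        for doc in document_uris
    ]


# Hypothetical document locations; the year in the path stands in for the
# versioned document URIs mentioned earlier in the thread.
docs = [
    "http://example.org/docs/2006/communes.rdf",
    "http://example.org/docs/2006/cantons.rdf",
]

lines = placeholder_triples("http://rdf.insee.fr/geo/COM_80078", docs)
print("\n".join(lines))
```

The entity URI stays the same when the list of documents changes, which is the point of keeping identification separate from packaging.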

 (some of them don't even support fragments), the document might
grow very large, ...). 

Now, the thing that we've not considered is to have a namespace URI
different from the RDF base.

Agreed, we could have kept the # namespace for the ontology at least.

Dan, can you elaborate why that makes ontologies a lot easier to manage?

Because with a # namespace, publishing the ontology just involves
sticking one static file on a web server. (the URI looks nicer
if the web server can handle leaving the .rdf or .owl off, but
that's not completely essential).

And then to look up http://rdf.insee.fr/geo#code_commune , a consumer
just GETs http://rdf.insee.fr/geo as usual; then when they want
to look up another term such as http://rdf.insee.fr/geo#subdivision,
they can save a round trip because they already have it.

Using a / namespace has a higher cost for the producer (redirects)
and for the consumer (one GET per term rather than one GET
for the ontology).
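
The round-trip argument above can be sketched as follows, assuming a client that dereferences a term by stripping the fragment and GETting the remaining document URI once. The function name documents_to_fetch is hypothetical, and http://rdf.insee.fr/geo/subdivision is only the assumed / counterpart of the # term quoted above. No network access is involved; the code only counts distinct documents.

```python
from urllib.parse import urldefrag


def documents_to_fetch(term_uris):
    """Return the distinct document URIs a client would GET, in first-seen order."""
    seen = []
    for uri in term_uris:
        # urldefrag splits "doc#frag" into ("doc", "frag"); the fragment is
        # never sent to the server, so only the document part costs a GET.
        doc, _fragment = urldefrag(uri)
        if doc not in seen:
            seen.append(doc)
    return seen


hash_terms = [
    "http://rdf.insee.fr/geo#code_commune",
    "http://rdf.insee.fr/geo#subdivision",
]
slash_terms = [
    "http://rdf.insee.fr/geo/code_commune",
    "http://rdf.insee.fr/geo/subdivision",  # assumed / counterpart
]

hash_docs = documents_to_fetch(hash_terms)
slash_docs = documents_to_fetch(slash_terms)
print(len(hash_docs), "GET for the # terms;", len(slash_docs), "GETs for the / terms")
```

This counts requests only under the assumption that identifiers are dereferenced at all, and it leaves out the redirects mentioned for the / case.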


That's true only if you assume that these identifiers are also used as
locations...

I know that this is a highly controversial debate, but I have always
thought that the big advantage of RDF over XML vocabularies such as
XLink is that it differentiates the two notions and I wouldn't want to
lose this benefit!

Thanks for your clarifications!

Eric