
Re: resources and URIs

From: Tim Berners-Lee <timbl@w3.org>
Date: Thu, 31 Jul 2003 17:38:30 -0400
Cc: www-tag@w3.org
To: Norman Walsh <Norman.Walsh@Sun.COM>
Message-Id: <53B6124E-C39F-11D7-AB0E-000393914268@w3.org>


On Thursday, Jul 31, 2003, at 15:46 US/Eastern, Norman Walsh wrote:
>
> / Tim Berners-Lee <timbl@w3.org> was heard to say:
> | As a web user, when I quote you
> | http://www.galegroup.com/free_resources/poets/poems/kublakhan.htm
> | What do I expect of this URI when I dereference it?
> |
> | 1. - That it is always something *about* stately pleasure domes?
> | 2. - That it is always the poem "Khubla Khan"?
> | 3. - That it is always a poem "Khubla Khan", as published in the
> | Thompson/Gale collection?
> | 4. - That it is always a particular set of bits?
> |
> | The answer is 2 or 3.  That is what we expect to be invariant for the
> | URI.
> | That is the basis on which I make hypertext links.
> | That is the basis upon which URIs are used.
> |
> | That is why we say the URI identifies or denotes the poem.
>
> Right, so I might say that it is the URI for the poem Khubla Khan, by
> which I mean the poem as an abstraction, not the document that
> contains the poem.

Well, think: which do you actually expect to get representations of?  
(3).
3.  "Khubla Khan"  as published on that website.  People will bicker 
over words, but the "document" may be the word you are looking for 
here. "Contains"?  This particular one is a free, advertising sponsored 
publication by a particular group.   The URI does not identify "Khubla 
Khan" the poem in the total abstract, as in fact there is more 
consistency here than it being that poem. And more (extraneous to you,
important to them) information.  There are other differences - the web
site owner has some IPR in that page, but Coleridge is the only one who
owns IPR in the first.

> Does that square with your thinking on httpRange-14?

Yes, if it is (3).

> | I think that the word "representation"might have been historically
> | unfortunate as it suggests "portrayal" (or a subject) rather than
> | expression (of a message).
>
>   Expression:
>
>   1 a : an act, process, or instance of representing in a medium (as words)
>      : UTTERANCE <freedom of expression> b (1) : something that manifests,
>      embodies, or symbolizes something else
>
> I'm not sure "expression" is going to help much if we're looking for
> English language definitions that capture the distinction you're
> looking for because in discourse we are quite able to use the symbols
> without confusion.

Indeed, English does not come with words for the whole web architecture.
We will have to do that spec-writing thing and nail down some local meanings
for the purposes of the arch doc.

> I don't see why I can't say that http://norman.walsh.name/ symbolizes
> me (I grant that it may not be the best choice of identifiers given
> the ambiguity it introduces, but I can still do it.)

This is called Indirect identification. It is normal, everyone does it.
One does not have to be ashamed of it.

You go on to describe how human beings tend to use terms loosely.
We can't let that guide us in nailing down some crisp terms for the
arch doc.

[...]
> No matter what we say, some people will want to use identifiers that
> way. The longer we talk about this, the more convinced I become that
> the most we can possibly do is to explain why this might be a problem
> and encourage people to avoid it unless they have good reason.

We are not trying to control how people talk to each other
at the bus stop.  We are trying to get engineers to write programs
which interoperate. These are technical specs.


> On the topic of good reasons, I was talking to Dan Brickley this
> morning and he pointed out that WordNet[1] uses "/" instead of "#" in
> identifiers. That's a case where I think the use of "/" is entirely
> justified. It means that I can assign the WordNet URI to the "wn" prefix
> and refer to wn:City, wn:State, wn:Person, etc.

You could do that if it used the hash, too.

> If I used "#" instead of "/" it would be practically impossible to get
> the RDF that described each word (it would mean downloading the
> *entire* Word Net database as a single document and extracting the
> relevant bit). Technically better, perhaps, but practically impossible
> given the size of Word Net.
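
A minimal Python sketch of the two points above, not taken from the
WordNet deployment itself. Prefix expansion is plain concatenation, so a
"wn" prefix works with either separator; a client strips the fragment
before retrieval, so every hash name leads to the same single document.
HASH_NS and the helper names expand and document_to_fetch are
hypothetical; only the slash form of the namespace appears in this thread.

```python
from urllib.parse import urldefrag

# The slash namespace is the one quoted in this thread.
SLASH_NS = "http://xmlns.com/wordnet/1.6/"
# Hypothetical hash variant, invented here purely for comparison.
HASH_NS = "http://xmlns.com/wordnet/1.6#"


def expand(namespace, local_name):
    """Expand a prefixed name such as wn:City by plain concatenation."""
    return namespace + local_name


def document_to_fetch(uri):
    """Return the URI a client would request: the fragment is stripped."""
    return urldefrag(uri).url


slash_city = expand(SLASH_NS, "City")
hash_city = expand(HASH_NS, "City")
hash_state = expand(HASH_NS, "State")

# Both separators give usable wn:City names.
print(slash_city)
print(hash_city)

# With "/", each word has its own document to GET.
print(document_to_fetch(slash_city))

# With "#", City and State both point into one shared document.
print(document_to_fetch(hash_city) == document_to_fetch(hash_state))
```

Under these assumptions the prefix is equally convenient in both forms;
the difference is only in how much has to be retrieved per name.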

I would suggest for this corner case downloading a file which
gives instructions in some rule language as to where to find
the word in question.

I wonder what Cyc does.

Other good things to do include responding with something which says
that there is a service available for querying for things of this form.

I would note that WordNet is a set of words but does not seem to be an
RDF ontology like Cyc.  It is more a set of English words.
I agree it is nice to have URIs for them, though.

It is not a normal case at all.

The document is a bit weird:

    <Aarhus>     a :Class;
          :description "port city of Denmark in eastern Jutland";
          :label "Aarhus";
          :subClassOf <> .

Aarhus is a class, says the RDF, and a city, says the description.
Apart from their weird use of class and subClass, though....

Aarhus is a subclass of the document.
The empty URIref <> always refers to the current document.
It is actually defined that way in the URI spec.
So the problem of doing it this way is apparent immediately.
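
A small Python sketch of how the references in that snippet resolve,
using the standard library's urljoin. BASE is an assumption: it stands
for whatever URI the RDF was retrieved from, taken here to follow the
same pattern as the /City URI quoted in this thread.

```python
from urllib.parse import urljoin

# Assumed retrieval URI of the document containing the snippet.
BASE = "http://xmlns.com/wordnet/1.6/Aarhus"

# <Aarhus> is a relative reference, resolved against the base.
subject = urljoin(BASE, "Aarhus")

# <> is the empty reference; it resolves to the base, i.e. the
# document currently being read.
superclass = urljoin(BASE, "")

print(subject)
print(superclass)
print(superclass == BASE)
```

Whatever the actual base is, the empty reference comes back as the
document's own URI, which is why the :subClassOf object names the
document. Under the base assumed here, <Aarhus> happens to resolve to
that same URI as well.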


> The fact that I can GET http://xmlns.com/wordnet/1.6/City and retrieve
> RDF that defines "City" is so valuable, I'm quite willing to live with
> the contradiction that if you choose to assert axiomatically that
> http://xmlns.com/wordnet/1.6/City must be a document, you're saying
> that the URI is both a city and a document.

We have different priorities.  You are happy to make the whole system
inconsistent because something is convenient in a corner case.
I am not.  I'd prefer to go through more hoops in the WordNet case.


>                                         Be seeing you,
>                                           norm

Tim
Received on Thursday, 31 July 2003 17:38:32 GMT
