Re: Should tags be URNs? (was Re: Proposal: 'tag' URIs) from Tim Kindberg on 2001-05-04 (uri@w3.org from May 2001)

From: Tim Kindberg <timothy@hpl.hp.com>
Date: Fri, 04 May 2001 10:29:10 -0700
To: michaelm@netsol.com
Cc: michaelm@netsol.com, Sandro Hawke <sandro@w3.org>, uri@w3.org
Message-Id: <5.0.2.1.1.20010504071335.03231870@hplex1.hpl.hp.com>

At 10:06 AM 5/1/01 -0400, Michael Mealling wrote:
>Ok, I finally have time to get back to this...

Ditto!

Rather than give ad hoc answers to each point that has arisen in responses 
so far, I'll try to piece together what Michael and I have said and also 
bring in some things that Al Gilman said subsequently. I'm thereby missing, 
no doubt, some important things that others have said.

The topic is "resources and Resources, bindings and Bindings". But my goal, 
still, is answering the question "Are tags URNs?".

Michael:
>I'd actually say that the etching of the number on the frame wasn't
>the actual binding. The binding is abstract.

>'identify a resource' is something that the URI does inherently.

>A URI identifiers one and only one abstract Resource (note the 
>capitalization).

>It is up to the Resource to be defined as being plural or not.

This view of naming becomes more mysterious to me rather than less: 
bindings ('Bindings') are 'abstract' and Resources are 'abstract'; URIs 
just do identify.

I would say that, like any 'name' or 'identifier', URIs aren't 
intrinsically identifiers: they're intrinsically strings that are unique in 
some context; it's how they are used that makes them identifiers.

I also want to say that anything that can be usefully expressed in our 
domain can be specified in terms of sets and functions (alright, alright: 
categories). I would like to rid us of 'abstract' and 'conceptual' and 
replace them with simple mathematical objects that we can all understand, 
reason about and implement inside machines. I'll try and do that, below.

Al Gilman says:
>For
>search URLs there is no persistent identity, no "capital-R Resource posessing
>persistent identity" which is "referred to."

In general, as I argue below, URLs map to time-varying functions or calls 
to functions. In other words, I agree.

Then Al says:
>.....In other words, names are
>descriptions.  But not all descriptions are names.

I cannot understand the first part of that. If I refer to 'John Fandago' 
(of whom you've never heard), you have a description of .... what? Anyway, 
you're no doubt aware that we could consult e.g. Bertrand Russell for a 
philosophical treatment of names vs. descriptions. And then we could read 
all the people who repudiate Russell. The question I want to answer is not 
the philosophical one but: What are we going to do about these issues as 
they affect our system design?

Al also says:
>'Identifiers' is a good mnemonic handle for URIs but not, if literal
>construed,
>a satisfactory definition.

Agreed.

Let me try and specify the system(s) we're talking about. Then you can all 
tell me how wrong I am. But I hope we'll all be more precise.

First, there's a fiction that helps keep the specification simple without 
losing, I claim, anything essential. The fiction is that no URLs have a 
suffix "?...". I'll assume that any use of a URL such as 
"http://champignon.net/cgi-bin/foo?doc=myDoc" in HTTP can be replaced by 
the URL "http://champignon.net/cgi-bin/foo", to which the "content" 
"doc=myDoc" is posted. I could get rid of this fiction but the spec would 
be messier.
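
A minimal Python sketch of this fiction, assuming the query string is
simply carried over unchanged as the posted content. The helper name
strip_query is hypothetical and is not part of any specification.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Split a URL into (URL without '?...' suffix, content to post).

    Hypothetical helper: under the fiction, the query string becomes
    the content posted to the bare URL. Returns None as the content
    when the URL has no query part.
    """
    parts = urlsplit(url)
    # Rebuild the URL with an empty query component.
    bare = urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))
    content = parts.query if parts.query else None
    return bare, content


bare, content = strip_query("http://champignon.net/cgi-bin/foo?doc=myDoc")
print(bare)     # http://champignon.net/cgi-bin/foo
print(content)  # doc=myDoc
```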

Now I need to define some sets:
URI = set of all URIs (syntactically defined, not necessarily bound to 
anything)
URN = set of all URNs (syntactically defined, not necessarily bound)
URL = set of all absolute URLs (syntactically defined, in my restricted 
sense, and not necessarily bound)
CONTENT = set of all (MIME-typed) content.
rESOURCE (note lower case 'r') = set of all web resources, i.e. whatever 
corresponds to a URL (with my restricted definition) and thus can be 
accessed by HTTP.
RESOURCE = set of all Michael's Resources.

To avoid defining subsets of URIs later on, I'll take rESOURCE and 
RESOURCE to include the special value "No r/Resource". In what follows, any 
identifier that's unbound maps to that special value.

(1) locate(t)
This is the function that takes a URL and produces a resource:
locate(t): URL --> rESOURCE
'locate' is just the normal function of the Web. resource(url, t) is the web 
resource that we get when we apply locate(t) to url, i.e. at time t. Since 
the owners may substitute some other resource at a given URL, this really 
is a function of time.

(2) resource(url, t)
This is also a function: it takes elements of some subset of CONTENT 
(supplied by HTTP) and its image is a subset of CONTENT (returned by HTTP).
E.g. the web resource at http://www.google.com/search is a function that 
takes content of the form "q=URI" and produces HTML content.
The web resource at http://champignon.net/TimsDoc.html is a (trivial) 
function that takes "null" as a content value and produces some HTML content.
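
Definitions (1) and (2) can be sketched in Python as a toy model. This is
an illustration under stated assumptions, not an implementation of the
Web: the class Web, its install method, the integer timestamps and the two
document functions are all hypothetical. A web resource is modelled as a
function from content to content, and None stands in for the special
value "No r/Resource".

```python
from bisect import bisect_right

NO_RESOURCE = None  # stands in for the special value "No r/Resource"


class Web:
    """Toy model: each URL has a time-ordered history of installed resources."""

    def __init__(self):
        self._history = {}  # url -> (sorted list of times, list of resources)

    def install(self, url, t, resource):
        """Record that `resource` is served at `url` from time `t` onwards."""
        times, resources = self._history.setdefault(url, ([], []))
        i = bisect_right(times, t)
        times.insert(i, t)
        resources.insert(i, resource)

    def locate(self, t):
        """Return the function locate(t): URL --> rESOURCE."""
        def locate_t(url):
            times, resources = self._history.get(url, ([], []))
            i = bisect_right(times, t)
            # The resource in force at time t is the latest one installed at or before t.
            return resources[i - 1] if i > 0 else NO_RESOURCE
        return locate_t


# Web resources are themselves functions CONTENT --> CONTENT.
def tims_doc_v1(content):
    return "<html>version 1</html>"


def tims_doc_v2(content):
    return "<html>version 2</html>"


web = Web()
url = "http://champignon.net/TimsDoc.html"
web.install(url, 10, tims_doc_v1)
web.install(url, 20, tims_doc_v2)  # the owner substitutes another resource

print(web.locate(15)(url)(None))  # <html>version 1</html>
print(web.locate(25)(url)(None))  # <html>version 2</html>
```

The same URL yields different resources at different times, which is why
locate has to be indexed by t.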

(3) The family of resolution functions, resolve-i (i elementOf I)
resolve-i(t) : URI --> URL
These correspond to all the ways we can devise of taking a URI and 
producing a URL from it. We can build functions from URI to rESOURCE, via 
resolve-i(t) and locate(t).
In my view of the world, this completes the specification of relevant 
functions. Extensionally, each function resolve-i is a set of Xbindings -- 
<uri, url, stuff about this binding, e.g. who asserts it>.
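
A resolution function given extensionally as a set of Xbindings, and its
composition with locate(t), might look like the following Python sketch.
The field names asserted_by and valid_from, the example tag URI, the stub
locate function and the string "resource-A" are assumptions made for
illustration only.

```python
from collections import namedtuple

# An Xbinding: <uri, url, stuff about this binding>.
# Fields other than uri and url are illustrative assumptions.
Xbinding = namedtuple("Xbinding", ["uri", "url", "asserted_by", "valid_from"])

NO_RESOURCE = None  # stands in for the special value "No r/Resource"


def make_resolver(xbindings):
    """Build resolve_i from its extension, a collection of Xbindings."""
    def resolve_i(t):
        def resolve_t(uri):
            # Bindings for this URI that have been asserted by time t.
            live = [b for b in xbindings if b.uri == uri and b.valid_from <= t]
            if not live:
                return None  # unbound
            return max(live, key=lambda b: b.valid_from).url
        return resolve_t
    return resolve_i


def compose(resolve_i, locate):
    """URI --> rESOURCE at time t, via resolve_i(t) followed by locate(t)."""
    def at(t):
        def go(uri):
            url = resolve_i(t)(uri)
            return NO_RESOURCE if url is None else locate(t)(url)
        return go
    return at


def locate(t):
    """Stub locate(t) over a fixed table; a real one would vary with t."""
    table = {"http://champignon.net/TimsDoc.html": "resource-A"}
    return lambda url: table.get(url, NO_RESOURCE)


bindings = [
    Xbinding("tag:champignon.net,2001:TimsDoc",
             "http://champignon.net/TimsDoc.html", "tim", 10),
]
uri_to_resource = compose(make_resolver(bindings), locate)
print(uri_to_resource(15)("tag:champignon.net,2001:TimsDoc"))
```

Different communities can hold different sets of Xbindings for the same
URI, which is what the index i on resolve-i expresses.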

(4) The family of functions Resolve-i (i elementOf I')
Resolve-i(t) : URI --> RESOURCE
This seems to me to be Michael's view of the world. Myself, I've no idea 
what's in RESOURCE. But, whatever those Resources are, Michael says (a) 
that this function is 1-1 and (b) that each Resource provides a family of 
what we might call 'manifestation' mappings, manifest-j : Resource |--> 
resource, which associate the Resource with various resources that manifest it.

(5) The function ResolveURN (one of the functions in (4))
ResolveURN : URI (really, URN) --> RESOURCE
The only thing I know about this function from Michael's description is 
that it is constant (not a function of time), since URNs are designed to be 
persistently bound.
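
The difference between a time-varying Resolve-i(t) and a constant
ResolveURN can be illustrated with a small Python check over sampled
times. This is only a sketch: the URN "urn:example:a", the Resource
labels and both resolver functions are hypothetical, and sampling can
refute constancy but never prove it.

```python
def is_constant_over(resolve, uri, times):
    """True if resolve(t)(uri) yields the same value at every sampled time."""
    results = [resolve(t)(uri) for t in times]
    return all(r == results[0] for r in results)


def resolve_urn(t):
    """A persistent binding: the time argument is ignored."""
    table = {"urn:example:a": "Resource-A"}
    return lambda uri: table.get(uri)


def resolve_moving(t):
    """A binding that is changed at time 20."""
    return lambda uri: "Resource-A" if t < 20 else "Resource-B"


print(is_constant_over(resolve_urn, "urn:example:a", [0, 10, 30]))     # True
print(is_constant_over(resolve_moving, "urn:example:a", [0, 10, 30]))  # False
```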

Perhaps Michael can comment on (4) and (5) as expressions of his system 
model. I don't see what we've gained by interposing Resources. It seems to 
be an attempt to incorporate 'conceptual harmony'. I believe that that is 
an artifice that clouds the issues. (2) and (3) give us more flexibility 
and they are concrete.

Further to (3), there's also what I earlier called 'original' bindings -- 
and probably should call, following a remark that Sean made, 'proprietary' 
bindings. I gave the example of a bicycle with a serial number etched onto 
it. Proprietary bindings are quite different to Xbindings. They occur where 
the identifier is part of some content and is, by convention, used as an 
identifier of that content. E.g. the identifier appears in the header of my 
document, or it's attached to a physical object such as a bicycle or a 
museum exhibit. Physical objects, I now realise, are analogous to content, 
not (web) resources. That's because they're not functions in themselves, 
they're stuff.

I said, & Michael commented:
> > Now, *you* may bind that same identifier to whatever you like. It's your
> > (or your community's) business. But *you* may not 'originally bind' your
> > tag to *my* resource.
>
>Ok. That just confused me. What identifier is 'the same identifier'?

I should have said: you may Xbind the proprietarily-bound identifier that 
appears in my content to any resource you like. That's your (or your 
community's) business. But you may not proprietarily bind an identifier 
into my content (i.e. overwrite or maybe even ambiguate it). Nor may you 
proprietarily bind an identifier minted by me into your content.

That reflects, I believe, existing laws of property and copyright, except, 
perhaps, the last sentence. If you steal my document's proprietary 
identifier and make it the identifier of your document, you insert entropy 
into the resolution system that I am proposing. You may also libel me.

Does any of that help?

Tim.

Tim Kindberg

internet & mobile systems lab  hewlett-packard laboratories
1501 page mill road, ms 1u-17
palo alto
ca 94304-1126
usa

www.champignon.net/TimKindberg/
timothy@hpl.hp.com
voice +1 650 857 5609
fax +1 650 857 2358
Received on Friday, 4 May 2001 13:29:59 UTC