Re: I-D ACTION:draft-daigle-uri-std-00.txt from Dan Connolly on 2000-09-07 (xml-uri@w3.org from September 2000)

From: Dan Connolly <connolly@w3.org>
Date: Thu, 07 Sep 2000 12:07:49 -0500
To: "Simon St.Laurent" <simonstl@simonstl.com>
CC: XML-uri@w3.org
Message-ID: <39B7CB65.AA428D54@w3.org>
"Simon St.Laurent" wrote:
> 
> At 08:35 AM 9/7/00 -0700, Henrik Frystyk Nielsen wrote:
> >Resources are first class objects - you identify them using URIs. When
> >describing or talking about a resource, you use the URI to refer to that
> >resource.
> 
> Do you really?  What resource exactly does the URI
> http://www.w3.org/1999/xhtml identify?

The one identified by that URI. It's not clear to me
that you need to know more about this resource other
than that it can be identified by that URI.

If you do, can you explain why?

If you want to know more about it, one of the
things you can try is to access it. In the
case of this resource, you're likely
to learn a little bit about its current state:

	"This is an XML namespace defined in the XHTML[tm]
	1.0 specification. [...]
	This namespace may change without notice.
	[...]"


> >You can never get to the resource - you can get a manifestation of the
> >resource - for example by performing an HTTP GET request on it. A resource
> >can have any number of manifestations - think of each manifestation as a
> >snapshot of a living thing: you can take as many snapshots you like - some
> >may be the same and some may not.
> 
> It seems, however, that we may have different manifestations based on the
> context within which a URI is used

Yes... if you do your GET request differently, you
might get a description in French or in XML Schema syntax.
W3C (i.e. the HTML WG) hasn't guaranteed that its state
doesn't change over time either, so you might
discover that this namespace is specified
by XHTML version 62 if you ask at some later date.

But it's the same resource, regardless of what
you learn about it over time.

> - and no clear picture at all of what
> the resource might actually be.

Clear enough, no? Just like you have a clear
enough picture of the number 42. If two
numerals are the same (digit for digit,
with an agreement about what radix we're using)
then the numbers they denote are the same.
But I don't know much more about the numbers --
I don't know what color they are, if they are
colored at all. Or what part of the planet they're
on, if they're on this planet at all.
But I can happily add and subtract them and
go about my business.

> I'm tired of koans.

I suggest that you deal happily with relationships
much like the relationship between resources
and URIs all the time. Numerals and numbers,
code points and characters, noun
phrases and the concepts they denote, etc.

Numbers are simpler than resources in that they
don't have state. Resources, in general, are
more like people or Scheme procedures. They
have state that may change over time and may be
hard to observe completely. But we have
a cultural agreement that "Daniel W. Connolly,
born 9 Dec 1967 in Kansas City, MO to
John and Marilyn Connolly"
refers to the same person
over time. If he commits a crime one day,
you can hold him accountable for it several
days later, even though all of the molecules
that committed the crime have been flushed
down various toilets, flaked off into the
wind, etc.




> >I don't believe it mentions anywhere that you are describing URIs because
> >that wouldn't make sense.
> 
> Then I suppose Namespaces in XML is foolish for using URI references in a
> fashion that ignores the resource (or fails to define the relationship
> between the namespace and the resource) entirely...

Why? The functionality that "Namespaces in XML" gives
us is to associate a URI with some of the elements
and attributes in an XML document. That's all,
but it turns out to be handy for all sorts of stuff.

Why is it foolish?

> >A baseline comparison is exactly what RFC 2396 defines - you keep saying
> >this - what is it that you don't see defined?
> 
> if (uriOne==uriTwo) {
>    processing
> }
> I'd like a simple baseline definition for what exactly that == is supposed
> to be, and what != would be, without requiring reference to every document
> describing a scheme.

If uriOne and uriTwo are Java strings that
conform to the syntax of an absolute URI,
then Java's String.equals will work just fine.

If uriOne and uriTwo are ANSI C char*s
that point to strings encoded using US-ASCII,
then strcmp() will work just fine.
(other encodings will work too, as long
as they're handled consistently.)

This code snippet doesn't motivate any need
to compare resources; just a need to compare
URIs.
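
A minimal sketch in Java of the comparison described above, assuming
both values are already absolute URIs held as strings. The class name
UriIdentity and the method name sameUri are hypothetical, chosen only
for illustration. Note that the == in the quoted snippet, applied to
Java strings, compares object references rather than characters,
which is why String.equals is the call to use.

```java
// Sketch only: UriIdentity and sameUri are hypothetical names.
final class UriIdentity {
    // Generic comparison: two URIs are the same if and only if their
    // character sequences are the same. No scheme-specific knowledge is used.
    static boolean sameUri(String uriOne, String uriTwo) {
        return uriOne.equals(uriTwo);
    }

    public static void main(String[] args) {
        String a = "http://www.w3.org/1999/xhtml";
        // Force a distinct String object holding the same characters.
        String b = new String("http://www.w3.org/1999/xhtml");

        System.out.println(sameUri(a, b)); // true: same characters
        System.out.println(a == b);        // false: == compares references
    }
}
```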

There are some odd cases like
	"http://www.w3.org/1999/xhtml"
		=?=
	"http://www.w3.org:80/1999/xhtml"

where if you read the spec for the relevant URI
scheme, you'll discover that those two URIs are
guaranteed to denote the same resource. If
you're writing a caching proxy, you'll probably
want to take advantage of that knowledge to
increase the hit-rate of your cache. But if
you're developing generic URI processing
stuff, like XML namespaces, you don't bother
with such arcana; you just conclude that
"no, those are distinct character sequences,
and hence different URIs."
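
To make the contrast concrete, here is a rough sketch in Java of the
two kinds of comparison. The class HttpUriCompare and its methods are
hypothetical names, and the scheme-aware check applies only one rule
of the http scheme (an absent port means port 80); a real cache would
apply more of that scheme's rules than this.

```java
import java.net.URI;
import java.util.Objects;

// Sketch only: HttpUriCompare and its methods are hypothetical names.
final class HttpUriCompare {
    // What generic URI processing (e.g. XML namespaces) does:
    // compare the character sequences and nothing else.
    static boolean sameCharacters(String a, String b) {
        return a.equals(b);
    }

    // What a caching proxy might do for http URIs: treat an absent
    // port as the scheme's default, 80. Fragments are ignored here.
    static boolean sameHttpResource(String a, String b) {
        URI x = URI.create(a);
        URI y = URI.create(b);
        boolean bothHttp = "http".equalsIgnoreCase(x.getScheme())
                && "http".equalsIgnoreCase(y.getScheme());
        if (!bothHttp || x.getHost() == null || y.getHost() == null) {
            // No scheme knowledge available: fall back to characters.
            return a.equals(b);
        }
        return x.getHost().equalsIgnoreCase(y.getHost())
                && effectivePort(x) == effectivePort(y)
                && Objects.equals(x.getRawPath(), y.getRawPath())
                && Objects.equals(x.getRawQuery(), y.getRawQuery());
    }

    // java.net.URI reports -1 when no port is written in the URI.
    private static int effectivePort(URI u) {
        return u.getPort() == -1 ? 80 : u.getPort();
    }

    public static void main(String[] args) {
        String a = "http://www.w3.org/1999/xhtml";
        String b = "http://www.w3.org:80/1999/xhtml";
        System.out.println(sameCharacters(a, b));   // false
        System.out.println(sameHttpResource(a, b)); // true
    }
}
```

Only the first method is needed for namespace processing; the second
is the sort of thing that belongs in scheme-specific code.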


> Section 6 of RFC 2396 is inadequate in circumstances where applications
> must deal with URIs of more than one scheme - especially if those URIs
> don't "use elements of the common syntax".  I'll take byte-by-byte or
> case-insensitive as a foundation happily, but it has to apply across the
> board.  It clearly doesn't, at present.

strcmp()/String.equals doesn't? That's news to me.

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Thursday, 7 September 2000 13:09:10 UTC