RE: Does a URI identify a "web page"? from Bernard Vatant on 2003-01-25 (www-tag@w3.org from January 2003)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Sat, 25 Jan 2003 23:13:41 +0100
To: <www-tag@w3.org>
Message-ID: <3E26DA70005788B6@mel-rta8.wanadoo.fr> (added by postmaster@wanadoo.fr)
Hello

I've been attracted to this debate because the noise it makes has been
heard as far away as my native topic maps land :)

Maybe these things have already been said here - anyway. This is more or
less a copy of a message I sent to the OASIS XRI list.

To understand what follows, first recall that the notion of subject
is the most generic in topic maps. It's "whatever anyone cares to speak
about", be it abstract, concrete, generic or individual, real or
imaginary, network-retrievable or not...

The topic maps standard clearly provides distinct ways to deal with
so-called "addressable" subjects (network-retrievable resources) and
"non-addressable" subjects (abstract entities or physical individuals). 

Both can be identified by URIs. Let's see how it works using the example
of David Booth at
http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm

The URL "http://x.org/love" can be used in topic maps land to identify
two subjects among the four that David quotes: subject 2. and subject 3.

Subject 2. is not addressable. It is a certain "concept of love", that
is supposed to be expressed, defined or at least "indicated" to human
users when they de-reference the URL. 

Subject 3. is exactly "whatever the hell you get when you de-reference
the URL". Of course the actual content of that can evolve with time,
because the conception of love expressed by the editor of that resource
can evolve when (s)he advances in age :)) So it's not a document, and
the issue of how to identify Subject 4 (the document which is actually
there when I get to the address) is not really addressed in that scheme.

Think about an example like http://meteo.org/myplace/today
I guess the document is changing, unless I live on the Moon, but the
subject indicated is well identified: "today's weather in my place".

How do you express that in topic maps land?

Let's start with subject 3. Let's define a topic to represent this
subject.
This is basic in topic maps. When you want to speak about a subject, you
represent it by a topic. Cool :)

How do I express that my subject *is* a resource? 
In XTM syntax, I will write the following:

<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="subject3">
    <subjectIdentity>
      <resourceRef xlink:href="http://x.org/love"/>
    </subjectIdentity>
  </topic>
</topicMap>

Now for subject 2. I define a topic whose identity is "indicated" by
the same URI:

<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="subject2">
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://x.org/love"/>
    </subjectIdentity>
  </topic>
</topicMap>

In that case, TM terminology is that http://x.org/love is for subject 2
a "subject identifier", and the resource itself is a "subject
indicator".

So the same URL is used to identify two distinct subjects, but with no
ambiguity, due to the syntactic context in which it is used.
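
To make the "syntactic context" point concrete, here is a minimal sketch
of how a consumer could tell the two subjects apart. It is only an
illustration, not how any particular topic map engine works: it assumes
the XTM 1.0 and XLink namespaces are declared on the topicMap element,
puts both topics in one map, and uses a hypothetical helper named
subject_identities.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for XTM 1.0 and XLink.
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Both topics from the message, placed in a single topic map.
TOPIC_MAP = f"""
<topicMap xmlns="{XTM_NS}" xmlns:xlink="{XLINK_NS}">
  <topic id="subject3">
    <subjectIdentity>
      <resourceRef xlink:href="http://x.org/love"/>
    </subjectIdentity>
  </topic>
  <topic id="subject2">
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://x.org/love"/>
    </subjectIdentity>
  </topic>
</topicMap>
"""


def subject_identities(xtm_text):
    """Map each topic id to a (kind, uri) pair read from its subjectIdentity."""
    root = ET.fromstring(xtm_text)
    identities = {}
    for topic in root.iter(f"{{{XTM_NS}}}topic"):
        identity = topic.find(f"{{{XTM_NS}}}subjectIdentity")
        if identity is None:
            continue
        for ref in identity:
            kind = ref.tag.split("}", 1)[-1]  # drop the namespace prefix
            uri = ref.get(f"{{{XLINK_NS}}}href")
            identities[topic.get("id")] = (kind, uri)
    return identities


identities = subject_identities(TOPIC_MAP)
for topic_id, (kind, uri) in identities.items():
    print(topic_id, kind, uri)
```

The URI string is the same in both pairs; only the element name that
wraps it differs, and that is enough to keep the two subjects distinct.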

Bottom line: a URL, like any other kind of character string, does not
identify any subject by itself, but can be used as an identifier if you
provide a clear and unambiguous identification process. And since
processes can differ, the same string can be used in different ways to
identify different things. This is not specific to URLs, it is the same
with whatever naming or identification system. "2003-01-24" is only a
character string, and it does not identify the current day, out of
context. 
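
A small sketch of the same point with the date string. The two
"identification processes" here (as_iso_date and as_subtraction) are
hypothetical names chosen for illustration; the only claim is that one
string yields different things depending on the process applied to it.

```python
from datetime import date

TEXT = "2003-01-24"


def as_iso_date(text):
    """Process 1: read the string as an ISO 8601 calendar date."""
    return date.fromisoformat(text)


def as_subtraction(text):
    """Process 2: read the same string as the arithmetic 2003 - 1 - 24."""
    first, *rest = (int(part) for part in text.split("-"))
    return first - sum(rest)


print(as_iso_date(TEXT))     # a calendar day
print(as_subtraction(TEXT))  # an integer
```

Neither reading is wrong; the string alone just does not say which
process is intended.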

Hope that helps

Bernard

============================================
Bernard Vatant
Knowledge Engineering
Mondeca - www.mondeca.com
OASIS Published Subjects Technical Committee
www.oasis-open.org/committees/tm-pubsubj
============================================

|-----Original Message-----
|From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of
|Tim Bray
|Sent: Thursday, 1 January 1970 09:24
|To: Tim Berners-Lee
|Cc: Roy T. Fielding; Sandro Hawke; www-tag@w3.org
|
|
|Tim Berners-Lee wrote:
|
|> A model in which URIs identify web pages is a REST model,
|> and a rather better one than one where they don't.  I would
|> really like you to work through the model and see that it doesn't
|> break anywhere. You may have to introduce another term.
|> But you'll end up with something much more useful IMHO.
|
|Anybody who isn't, by this point, bleeding from the brain over this
|thread is a much stronger person than I.  I have made a resolution that
|I will go back and read all the messages once or twice more and think
|about it some more, but in the interim here are some more data points.
|
|I've always been really sympathetic to what you might call the Fielding
|position ("web architecture doesn't know/care what a resource *is*, it
|just compares URIs and interchanges representations").  The corollary is
|that since people obviously care what a resource is, we need to
|establish some policy to keep things manageable ("Cool URLs don't
|change") and some mechanisms to talk about what resources are (RDF & the
|rest of the semweb stuff).
|
|While TimBL's world-view seems consistent, I just have real trouble with
|the notion that http: URIs necessarily identify web pages, because it
|seems to me that there are lots of them that just don't.  Let me give a
|couple of examples.
|
|1. Antarctica's Visual Net
|
|This is the application that my company sells, of which I wrote a large
|part.  It is implemented as an Apache module, and presents maps of
|information spaces.  For a large information space with millions of
|objects, clearly an effectively infinite number of useful maps can be
|drawn.
|
|Each of those maps is URI-addressable (with a certain amount of
|"?arg=value&arg=value" in the URI, but that's fair), and each
|dereference request provokes a really complex flurry of computation
|against a bunch of volatile in-memory data structures, some really
|aggressive user-agent sniffing, and the emission of  pure HTML with
|bitmaps, pure HTML with a bunch of vector graphics code, or pure XML
|with no graphics code at all, in two different possible XML dialects,
|and in the future likely something completely different.  When we
|generate XML, the representation is of almost no direct use and needs
|further processing on the client side (in XSL or the Flash MX engine or
|a 3d renderer) to be useful.
|
|We violate REST in that we use cookies, but we try really hard to pack
|as much of the map identification into the URI as possible.  We *hope*
|that the Web's caching machinery will keep clients from stupidly
|re-dereferencing a map in the interests of keeping our server loads
|manageable.  In some deployments, when you drill way down into the maps
|at a high level of detail, the next drill-down URI into the map space
|might well decide to branch into the underlying data store (ERP system,
|library catalog, whatever) and use its output as the representation.  We
|reserve the right in future to invent new kinds of representations that
|we can't begin to imagine now.
|
|Anyhow, no matter how far I turn my head sideways and squint, it just
|doesn't feel correct to say that the URIs pointing into one of our map
|deployments represent, in any meaningful sense, a "web page".  That is
|to say, the representation returned by any one dereference is not
|fundamental; it is ephemeral and neither the users nor the programmers
|would for a second consider it to "be" the resource.  It feels perfectly
|comfortable to say that each of these URIs identifies a resource and
|that our software emits representations.  It feels perfectly natural to
|make RDF assertions about particular URIs in the space without worrying
|about what representation you might see next.  I'm sorry, I don't think
|these URIs identify web pages; they identify resources.
|
|2. XML Namespace Names
|
|Namespace names are URIs, and they were chosen this way back in 1999
|largely (in the XML community) because of their useful syntactic
|uniqueness properties and (in the nascent RDF community) because of the
|emerging grander ambitions for URIs.
|
|For some years, I steadfastly argued that these URIs were just names and
|don't you worry your pretty little head about what they point at.  This
|position turned out to be untenable; the user population really wanted
|to dereference these and get something back.
|
|So now we're arguing about what representations to return and the
|various flavors of RDDL.  Well, if you consider that an XML Namespace is
|a Resource, there's no inconsistency or angst here.  The resource
|previously was typically without representations and still worked OK;
|and now it turns out that a RDDL document will likely be a very useful
|representation of that resource.  Dan argues hotly that an XML Schema is
|a useful representation of a namespace-name resource and despite the
|fact that <snicker> he's clearly wrong about it being useful, it is
|undeniably some kind of a representation.
|
|Once again, no matter how hard I try, it's easy to believe that XML
|Namespaces are resources, but really hard to believe that they're web
|pages.
|
|Concluding notes:
|(a) In both of my examples, the resources identified by the URI map
|fairly nicely onto the actual meaning of the English word "resource" -
|one of Antarctica's maps is a resource in human-speak (that's why
|people pay for the software), and if an XML Namespace (typically a
|pre-cooked XML vocabulary with pre-cooked semantics) isn't a resource
|as the word is normally used, I don't know what is.  My point is not
|only is the Fielding formalism useful to programmers and
|self-consistent, the terminology is useful to ordinary people.
|
|(b) In my vision of the semantic web, it makes all sorts of sense to
|package up RDF assertions about Antarctica's maps or XML namespaces and
|these could be really useful without pretending, against the evidence,
|that either kind of URI actually points at a "web page".
|
|-Tim
Received on Saturday, 25 January 2003 17:18:17 UTC