RE: working around the identity crisis

> -----Original Message-----
> From: ext Dirk-Willem van Gulik []
> Sent: 23 November, 2004 00:48
> To: Stickler Patrick (Nokia-TP-MSW/Tampere)
> Cc:;; 
> Subject: RE: working around the identity crisis
> On Fri, 19 Nov 2004 wrote:
> > If you're going to restrict each term to being
> > defined by a single document, where that document
> > only describes one term, then why not just
> > have a URI for the document and URI for the term
> > and use conneg to relate them. E.g.
> >
> >      the concept
> > HTML representation
> >  RDF representation
> >  text representation
> >  JPG representation
> Of course if you are -this- http inclined (and you must have 
> a protocol
> like ftp or http which at the very least have some assumptions on the
> hierarchy - which may not be warrented, say, for a z39.50 
> URI) then note
> that you can also use:
> 	Accept: text/plain
> ...
> and/or use a multiple choise response as a server.

Sure. I was merely trying to reflect a set of URIs, not
how one might obtain the same representation via different
HTTP requests.

Certainly you can ask for particular variant representations
of e.g. the concept resource via the URI identifying the concept
resource -- but you still need distinct URIs for each
representation if you want to talk about the representation
distinctly from the concept resource.

Thus, both of the requests

   GET /knowlegebase/chemistry/water
   Accept: text/plain


   GET /knowlegebase/chemistry/water.txt

would presumably return the same representation (in the
latter case, since the request is for the actual representation,
the representation returned in the response would constitute a 
bit-equal copy of the representation resource).

The response to the first request, which names the concept and uses
conneg to indicate the preferred variant representation, would
presumably include a Content-Location: header giving the URI of the
text representation (the resource named directly in the second
request). That header tells you which representation of the concept
you have been provided in the response.
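A minimal sketch of the two requests above, assuming a toy in-memory server. The function name `get`, the two tables, and the body text are hypothetical; only the two request paths come from the examples. Real conneg handles Accept lists and q-values, which this sketch ignores by treating Accept as a single media type.

```python
# Hypothetical data: the body text is made up for illustration.
# Representation resources, keyed by their own distinct URIs.
REPRESENTATIONS = {
    "/knowlegebase/chemistry/water.txt": ("text/plain", b"water: H2O\n"),
}

# Concept resources, each mapping a media type to a representation URI.
VARIANTS = {
    "/knowlegebase/chemistry/water": {
        "text/plain": "/knowlegebase/chemistry/water.txt",
    },
}


def get(path, accept=None):
    """Return (status, headers, body) for a GET on path."""
    # Case 1: the request names the representation resource itself.
    if path in REPRESENTATIONS:
        media_type, body = REPRESENTATIONS[path]
        return 200, {"Content-Type": media_type}, body

    # Case 2: the request names the concept; pick a variant via conneg.
    variants = VARIANTS.get(path)
    if variants is None:
        return 404, {}, b""
    if accept is None:
        accept = next(iter(variants))  # assumption: first variant is the default
    rep_uri = variants.get(accept)
    if rep_uri is None:
        return 406, {}, b""

    media_type, body = REPRESENTATIONS[rep_uri]
    # Content-Location says which representation was actually returned.
    headers = {"Content-Type": media_type, "Content-Location": rep_uri}
    return 200, headers, body


via_concept = get("/knowlegebase/chemistry/water", accept="text/plain")
via_representation = get("/knowlegebase/chemistry/water.txt")
```

Both calls return the same bytes; only the conneg response carries Content-Location, since in the direct request the client already knows which representation it asked for.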

(it's unfortunate, albeit historically understandable, that the header 
which identifies the representation returned is called 'Content-Location:'
rather than e.g. 'Representation-Identifier:' or similar, but I digress...)

Thus, there is no ambiguity, as all resources are provided distinct URIs,
and clients are free to refer to resources at whatever resolution they like. 

E.g. [1]

   the entire knowledgebase [2][3]
   HTML rep. of knowledgebase
   RDF rep. of knowledgebase
   ...
   the concept 'chemistry'
   HTML rep. of concept 'chemistry'
   ...
   the concept 'water'
   HTML rep. of concept 'water'
   RDF rep. of concept 'water'
   text rep. of concept 'water'
   JPG rep. of concept 'water'


[1] Note that the above example set of URIs need not correspond to an
actual filesystem, rather all requests beginning with the URI prefix
could be handled by a specialized, automated web service portal which
maps request URIs to representations based on an RDF-savvy database,
and autogenerates all representations on the fly (that's how I'd do it).

[2] Note that the representations of the knowledgebase and higher level
concepts could be huge, if representations are "exhaustive" in conveying the
state/substance/definition of the resource. Though, it may also be the case 
that the representation at each 'level' in the conceptual hierarchy would
merely refer to its immediate super- and subconcepts, and thus be fairly
concise -- the HTML representation providing links, the RDF representation 
simply referring by URI, etc. and clients being able to traverse the
knowledgebase via subsequent requests about related/referenced resources.
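A rough sketch of the "concise" option in [2], combined with the on-the-fly generation mentioned in [1]. The record layout, the `broader`/`narrower` field names, and the `render` function are all hypothetical stand-ins for whatever the database actually holds; the point is only that each representation names its immediate neighbours by URI and nothing more.

```python
from html import escape

# Hypothetical record standing in for one entry in the database.
CONCEPTS = {
    "/knowlegebase/chemistry": {
        "label": "chemistry",
        "broader": ["/knowlegebase"],
        "narrower": ["/knowlegebase/chemistry/water"],
    },
}


def render(uri, media_type):
    """Generate a concise representation of one concept on the fly."""
    record = CONCEPTS[uri]
    # Only the immediate super- and subconcepts are mentioned.
    neighbours = record["broader"] + record["narrower"]

    if media_type == "text/plain":
        # Plain text simply refers to the neighbours by URI.
        return "\n".join([record["label"]] + neighbours) + "\n"

    if media_type == "text/html":
        # HTML provides links the client can follow in later requests.
        items = "".join(
            f'<li><a href="{escape(n)}">{escape(n)}</a></li>' for n in neighbours
        )
        return f"<h1>{escape(record['label'])}</h1><ul>{items}</ul>"

    raise ValueError(f"no generator for {media_type}")


html_rep = render("/knowlegebase/chemistry", "text/html")
text_rep = render("/knowlegebase/chemistry", "text/plain")
```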

[3] Note that it is useful to make a distinction between a "traditional"
representation, which likely would convey an information resource
in its entirety, versus a description, which will provide more concise
information about the resource itself, regardless of the type of resource
and for information resources, regardless of the substance of that
resource.

Thus, the response provided to the request

   GET /knowlegebase

is likely to be much larger than for

   MGET /knowlegebase

Also, the description provided in response to the latter
request may very well include a lot of information which is not
included in the representation provided in response to the
first request, as the purposes of the two responses are
functionally distinct.
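To make the GET versus MGET contrast concrete, here is a small sketch assuming a server that keeps a representation and a description side by side for each resource. The `handle` function, the field names, and all of the stored values are hypothetical; only the two methods and the request path come from the examples above.

```python
# Hypothetical store: one large representation, one concise description.
RESOURCES = {
    "/knowlegebase": {
        # Stand-in for an exhaustive representation of every concept.
        "representation": "\n".join(f"concept {i}: ..." for i in range(1000)),
        # Statements about the resource itself; values are made up.
        "description": {
            "type": "knowledgebase",
            "modified": "2004-11-23",
        },
    },
}


def handle(method, uri):
    """Return (status, payload) for a GET or MGET on uri."""
    resource = RESOURCES.get(uri)
    if resource is None:
        return 404, None
    if method == "GET":
        # GET: the representation, conveying the resource's substance.
        return 200, resource["representation"]
    if method == "MGET":
        # MGET: a description of the resource, not its substance.
        return 200, resource["description"]
    return 405, None


_, representation = handle("GET", "/knowlegebase")
_, description = handle("MGET", "/knowlegebase")
```

In this toy data the description is far smaller than the representation, yet it carries a "modified" statement that appears nowhere in the representation.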



That said, it's also, I think, important to point out that the following
naming schemes would be just as architecturally valid, and in practice
would provide just as workable a solution (apart from human users probably
feeling more comfortable with mnemonic URIs -- though I think a lot of
users simply won't care one way or another):

   the entire knowledgebase
   HTML rep. of knowledgebase
   RDF rep. of knowledgebase
   the concept 'chemistry'
   HTML rep. of concept 'chemistry'
   the concept 'water'
   HTML rep. of concept 'water'
   RDF rep. of concept 'water'
   text rep. of concept 'water'
   JPEG rep. of concept 'water'

or

   the entire knowledgebase
   HTML rep. of knowledgebase
   RDF rep. of knowledgebase
   the concept 'chemistry'
   HTML rep. of concept 'chemistry'
   the concept 'water'
   HTML rep. of concept 'water'
   RDF rep. of concept 'water'
   text rep. of concept 'water'
   JPEG rep. of concept 'water'

I.e. it is *entirely* up to the publisher to decide which URIs will be
used to identify which resources, and whether or not relationships between
those resources should be reflected lexically in the URIs. 

The key is that there is no ambiguity/overloading of URIs, and when known,
the distinct URI of each representation should be indicated in the response.

Web clients should not presume anything about the identity of the resources
or their relationships based on the lexical nature of the URIs, and even
if the client is aware of the proprietary URI management methodologies used
to construct URIs, it is unwise and IMO poor engineering to base client
behavior on the lexical structure of those URIs. Any client that does so
will be fragile and inherently non-portable/scalable to other applications.
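As an illustration of a client that treats URIs as opaque, the sketch below walks a knowledgebase purely by following the links it is given. Everything here is hypothetical: the two stores, the `narrower` field, and in particular the "/r/..." URIs, which are invented to stand in for a non-mnemonic naming scheme.

```python
# Two hypothetical publishers describing the same three concepts.
# MNEMONIC reflects the hierarchy lexically; OPAQUE does not.
MNEMONIC = {
    "/knowlegebase": {
        "label": "knowledgebase",
        "narrower": ["/knowlegebase/chemistry"],
    },
    "/knowlegebase/chemistry": {
        "label": "chemistry",
        "narrower": ["/knowlegebase/chemistry/water"],
    },
    "/knowlegebase/chemistry/water": {"label": "water", "narrower": []},
}

OPAQUE = {
    "/r/1": {"label": "knowledgebase", "narrower": ["/r/7"]},
    "/r/7": {"label": "chemistry", "narrower": ["/r/3"]},
    "/r/3": {"label": "water", "narrower": []},
}


def collect_labels(store, root):
    """Follow 'narrower' links from root and return all labels, sorted.

    The URIs are only ever used as lookup keys; the client never splits,
    parses, or pattern-matches them.
    """
    seen, pending, labels = set(), [root], []
    while pending:
        uri = pending.pop()
        if uri in seen:
            continue
        seen.add(uri)
        description = store[uri]  # stands in for a request to the server
        labels.append(description["label"])
        pending.extend(description["narrower"])
    return sorted(labels)


from_mnemonic = collect_labels(MNEMONIC, "/knowlegebase")
from_opaque = collect_labels(OPAQUE, "/r/1")
```

The same client code yields the same result against both stores, because the relationships come from what the server states, not from the shape of the URIs.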

A web client that is properly designed should be able to interact with
concept representations and descriptions per all of the above examples
regardless of what proprietary naming methodology is employed by the
publisher.



Received on Tuesday, 23 November 2004 09:04:10 UTC