Re: discovering site metadata

From: Al Gilman <asgilman@iamdigex.net>
Date: Fri, 08 Feb 2002 15:36:01 -0500
Message-Id: <200202082035.PAA982756@smtp1.mail.iamworld.net>
To: Mark Nottingham <mnot@mnot.net>
Cc: Dan Connolly <connolly@w3.org>, www-talk@w3.org
At 12:32 PM 2002-02-08 , Mark Nottingham wrote:
>On Fri, Feb 08, 2002 at 11:59:20AM -0500, Al Gilman wrote:
>> >I think it's much more than that; it's a resource that is the
>> >hierarchical root of all other resources made available by that
>> >authority.
>> This is commercial custom but not part of the technical specification. 
>Sure it is; RFC1808. (and why just 'commercial' custom?)

I guess I assume that site ego is a commercial hazard.  Then again it is Government sites that are determined to warn you when you leave their site; so I should probably re-think that slur.  Ego is endemic.  My bad.

The semantics I find in RFC-1808 deal with BASE and where you find the BASE and how you derive the equivalent absolute URL from the relative URL and the BASE.  Else, not.

Nothing there says anything strict about relationships between the senses of resources that appear higher and lower in the hierarchical structure of a particular URI-space that uses the path part of the URI syntax.
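To make that distinction concrete, here is a minimal sketch of the one thing the relative-URL rules do define: deriving an absolute URL from a BASE and a relative reference. It uses `urljoin` from Python's standard library, which follows the later RFC 3986 revision of the same rules; for simple cases like these the results agree. The `example.org` URLs and the `resolve` helper are assumptions made up for illustration.

```python
from urllib.parse import urljoin

# Hypothetical BASE: the absolute URL of the document that holds the links.
BASE = "http://example.org/products/widgets/spec.html"

def resolve(relative, base=BASE):
    """Return the absolute URL equivalent to `relative` in the context of `base`."""
    return urljoin(base, relative)

print(resolve("photo.html"))     # a sibling in the same path "directory"
print(resolve("../index.html"))  # one level up the path
print(resolve("/"))              # the root of the authority
```

The resolution is purely syntactic. Nothing in it says what the resource at `/products/` means in relation to the one at `/products/widgets/spec.html`.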

Now, actually, I agree with you 200% that the fact that users expect the stuff in a tree-shaped URI space to be a hierarchical, organized arrangement of information _is an asset we should exploit_.  We should pump the heuristic appeal of "seek up the tree for help and orientation."  For this reason, one of the practices I recommend is that when an HTTP request _fails_ and the requested URI has the form of a directory request, that is to say it ends in a trailing '/' character, the 404 or 403 response should bear a body containing a link to the site index, and specifically a context-specific link into the site index at the point where it covers the most immediate or local actual site-organization grouping containing the location that failed.
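A rough sketch of that practice, under stated assumptions: the `SITE_INDEX` table, its paths and anchors, and both function names are hypothetical, and a real server would hook this into its own error handling. The idea is only to climb the failed path toward the root until a grouping that the site index actually covers turns up.

```python
import posixpath
from html import escape

# Hypothetical table: directory-style paths that the site index covers,
# mapped to the matching anchor inside the index page.
SITE_INDEX = {
    "/": "/siteindex.html#top",
    "/products/": "/siteindex.html#products",
    "/products/widgets/": "/siteindex.html#widgets",
}

def nearest_index_anchor(path, index=SITE_INDEX):
    """Climb from the failed path toward the root; return the first index entry found."""
    current = path
    while True:
        if current in index:
            return index[current]
        if current == "/":
            return None  # reached the root without finding any entry
        # Drop the last path segment and keep the trailing slash.
        current = posixpath.dirname(current.rstrip("/"))
        if not current.endswith("/"):
            current += "/"

def error_body(path, status=404):
    """Build an HTML error body for a failed directory-style request, else None."""
    if not path.endswith("/"):
        return None  # the practice applies only to URIs ending in '/'
    anchor = nearest_index_anchor(path)
    lines = [f"<h1>{status}</h1>", f"<p>Nothing is available at {escape(path)}.</p>"]
    if anchor is not None:
        lines.append(f'<p>See the <a href="{escape(anchor)}">site index</a> for this area.</p>')
    return "\n".join(lines)

print(error_body("/products/widgets/discontinued/"))
```

The link is a heuristic aid for the human reader. The climb assumes nothing about what the parent paths mean beyond what the index itself declares.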

But we have to provide the same kind of information for resource-representations whenever such representations are encapsulated and communicated, not just when they are located in hierarchical namespaces where it is convenient to associate common properties with the least common ancestor node in the tree.  Site metadata are simply assertions about resources within a [conveniently shaped] scoped region of URI space.  Not every resource has a 'site' and the 'site' is not necessarily the actual enclosing scope over which the context-climate of knowledge is uniform.

>> Hierarchy in the namespace is a syntactic convenience as far as
>> URIs are concerned, the sense as nested contexts is a good guess
>> based on practice but not a requirement to use this form of URL. 
>> Compare with URL-encoded script parameters that use path segments
>> rather than the searchpart syntax.  The sense of the sequence of
>> path segments is at the discretion of the service offeror.
>Of course you can stuff anything you want in URIs; My site can be
>composed of
>if I really want to. However, the Web is architected to take
>advantage of the relationships between resources and their

The web is architected to capitalize on a fine-grain blending of human and machine understanding of relationships, alternating in the browse cycle.

 HCI Fundamentals and PWD Failure Modes

This particular relationship is approximate, defined in the human interpretation domain, and may be applied by machines so long as it is understood to be heuristic, not definitive.

>> A collection of data which is excised from the assets on hand at a
>> server and dispatched to a recipient as a representation of "a
>> resource" needs more "packaging for delivery" than just the
>> Location reference.  Anything from the context of that Location
>> that matters should be pulled into the packaging (SOAP envelope,
>> HTTP headers) as an explicit reference.
>Is the SOAP envelope becoming the de facto XML packaging mechanism?
>I've been wondering about this since XMLP started...

for XML transmittal.  For retention, look to things like MCAT.

 MCAT - A Meta Information Catalog (Version 1.1)

Think about it.  The _combination of rich and precise_ that you can achieve using message metadata that are in XML is just _so vastly improved_ over the RFC-822 context (inherited by HTTP), that this is going to become the overwhelmingly dominant mode of communication.  First between webservers and the back-office tier, but then the HTTP under it will be replaced, and then it will be the new standard platform of the Internet after TCP/IP and HTTP in their turn.

Don't tell anyone, though.  Let it be a surprise.


>Mark Nottingham
Received on Friday, 8 February 2002 15:35:42 UTC