W3C home > Mailing lists > Public > www-tag@w3.org > February 2003

Re: Proposed issue: site metadata hook

From: Seairth Jacobs <seairth@seairth.com>
Date: Tue, 11 Feb 2003 10:44:31 -0500
Message-ID: <003d01c2d1e4$782ba4b0$a800a8c0@SeairthA31>
To: "www-tag" <www-tag@w3.org>

I agree.  To repeat a point from an earlier post of mine [1], you could use
OPTIONS to accomplish such a thing.  No need for additional verbs.  In my
original post, I suggested one could return a URI to the appropriate RDF (or
whatever) document.  You could also just return an RDF document directly that
contains URIs to the appropriate resources.
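To make the second variant concrete, here is a minimal sketch of a client
parsing such an RDF document.  The "sm:" vocabulary and the example URIs are
invented for illustration; no such vocabulary is defined in this thread.

```python
# Sketch: parsing a hypothetical RDF/XML site-metadata document of the kind
# an OPTIONS response might return directly.  The sm: vocabulary is assumed,
# not standardized.
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

doc = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sm="http://example.org/sitemeta#">
  <rdf:Description rdf:about="http://example.org/">
    <sm:robots rdf:resource="http://example.org/meta/robots"/>
    <sm:privacy rdf:resource="http://example.org/meta/p3p"/>
  </rdf:Description>
</rdf:RDF>"""

def metadata_uris(rdf_xml):
    """Return a dict of {property-name: URI} from the metadata document."""
    root = ET.fromstring(rdf_xml)
    uris = {}
    for desc in root.iter(RDF + "Description"):
        for prop in desc:
            uri = prop.get(RDF + "resource")
            if uri:
                # strip the namespace from the property tag for readability
                uris[prop.tag.split("}")[1]] = uri
    return uris

print(metadata_uris(doc))
```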

However, these approaches require at least two hits on the server.  While
this may be fine for favicon or P3P (from the client's perspective), I wonder
whether you could convince crawlers, bots, etc. to give up the well-known
robots.txt file.  From their perspective, any of these solutions would
double the amount of time it takes to do their job.  It could be even
worse if they had to process something like RDF just to find the URI of a
robots.txt-like file.  I think any effort requiring this additional work
from those programs would fail before it even started.

Would it be possible to use OPTIONS along with a new series of
content-types?  For instance, suppose there were a "metadata/favico" and a
"metadata/robots".  Then, use of OPTIONS with conneg would return the
requested resource directly.  Using the "x-" convention, people could still
implement proprietary metadata extensions.  The advantage of this approach
is that the effect is the same as with current practice, but without
violating the server's URI space.  The disadvantage is the potential for a
similar invasion of the MIME-type space...
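As a rough sketch of the server side of this idea, a handler could select
among the proposed "metadata/*" types from the Accept header.  The type
names and documents below are illustrative only, and real conneg would also
honour q-values:

```python
# Sketch of an OPTIONS handler choosing a metadata representation via the
# Accept header.  The "metadata/*" media types are hypothetical.

SITE_METADATA = {
    "metadata/robots": "User-agent: *\nDisallow: /private/",
    "metadata/p3p": "<POLICY ...>",  # placeholder privacy policy
}

def negotiate_metadata(accept_header):
    """Return (media_type, body) for the first acceptable metadata type,
    or None if nothing matches.  Ignores q-values for simplicity."""
    for wanted in (t.strip().split(";")[0] for t in accept_header.split(",")):
        if wanted in ("*/*", "metadata/*"):
            # any available type will do; pick the first
            mtype = next(iter(SITE_METADATA))
            return mtype, SITE_METADATA[mtype]
        if wanted in SITE_METADATA:
            return wanted, SITE_METADATA[wanted]
    return None

print(negotiate_metadata("metadata/robots"))
```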

---
Seairth Jacobs
seairth@seairth.com

[1] http://lists.w3.org/Archives/Public/www-tag/2003Feb/0071.html

----- Original Message -----
From: "Jeffrey Winter" <JeffreyWinter@crd.com>
To: "Tim Berners-Lee" <timbl@w3.org>; <www-tag@w3.org>
Cc: <tag@w3.org>
Sent: Tuesday, February 11, 2003 8:39 AM
Subject: RE: Proposed issue: site metadata hook


>
>
> Why limit this approach to just site-level
> metadata?  Shouldn't a similar approach be
> adopted to bind metadata to any resource
> under the control of the "publisher"?
>
> I can see how this would benefit an RPC-style
> gateway as a means of (for example) standardizing
> how to obtain a WSDL document, but what about
> REST-style applications where each resource
> may (and probably will) have its own specific
> metadata?
>
>
>
> > -----Original Message-----
> > From: Tim Berners-Lee [mailto:timbl@w3.org]
> > Sent: Monday, February 10, 2003 11:02 AM
> > To: www-tag@w3.org
> > Cc: tag@w3.org
> > Subject: Proposed issue: site metadata hook
> >
> >
> >
> > In the face-face meeting I took an action to write up a proposal for
> > the following potential issue:
> >
> >
> > Proposed Short name:  SiteMetadata-nn
> >
> > Title:   Web site metadata improving on robots.txt, w3c/p3p
> > and favicon
> > etc
> >
> > The architecture of the web is that the space of identifiers
> > on an http web site is owned by the owner of the domain name.
> > The owner, "publisher",  is free to allocate identifiers
> > and define how they are served.
> >
> > Any variation from this breaks the web.  The problem
> > is that there are some conventions for the identifiers on websites,
> > namely that
> >
> >     /robots.txt  is a file controlling robot access
> >     /w3c/p3p is where you put a privacy policy
> >     /favicon.ico   is an icon representative of the web site
> >
> > and who knows what others.  There is of course no
> > list available of the assumptions different groups and manufacturers
> > have used.
> >
> > These break the rule.  If you put a file which happens to be
> > called robots.txt but has something else in it, then weird
> > things happen.
> > One might think that this is unlikely now, but the situation could
> > get a lot worse.  It is disturbing that a
> > precedent has been set, and the number of these may increase.
> >
> > There are other problems as well: because sites are catalogued
> > by a number of different agents, there tend to be all kinds
> > of requests for things like the above, while one would like to
> > be able to pick such things up as quickly as possible.
> >
> > If, when these features were designed, there had been a
> > general way of attaching metadata to a web site, it would
> > not have been necessary.
> >
> > The TAG should address this issue and find a solution,
> > or put in place steps for a solution to be found,
> > which allows the metadata about a site, including that for
> > later applications, to be found with the minimum overhead
> > and no use of reserved URIs within the server space.
> >
> > Example solution for feasibility
> >
> > A new HTTP header such as "Metadata:" is introduced into HTTP.
> > It takes one parameter, which is the URI of the
> > metadata document.  The header is supplied in response to
> > any GET or HEAD of the root document  ("/"). It may also
> > be supplied on any other request, including error
> > responses.
> >
> > The Metadata document is conventionally written in RDF/XML.
> > It contains pointers to all kinds of standard and/or proprietary
> > metadata about the site, including for example
> >
> > - privacy policy
> > - robot control
> > - icon for representing the site
> > - site maps
> > - syndicated (RSS) feeds
> > - IPR information
> > - site policy
> > - site owners
> >
> > The solution only needs to document the hook and the
> > vocabulary to point to metadata resources in current
> > use.  Vocabulary for new applications can be defined
> > by those applications.
> >
> > timbl
> >
> >
>
>
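As a rough illustration of the "Metadata:" header sketched in the quoted
proposal, a crawler's client side might look like the following.  The header
name comes from the proposal itself; the server and URI are assumptions, and
no deployed server emits this header today:

```python
# Sketch: a client following the proposed "Metadata:" header from a HEAD
# on the root document.  Header name per the quoted proposal; everything
# else is hypothetical.
import http.client

def find_site_metadata(headers):
    """Given a response-header mapping, return the site metadata URI, if any."""
    for name, value in headers.items():
        if name.lower() == "metadata":
            return value.strip()
    return None

# A crawler would do roughly:
#   conn = http.client.HTTPConnection("example.org")
#   conn.request("HEAD", "/")
#   uri = find_site_metadata(dict(conn.getresponse().getheaders()))
#   ... then GET uri once and learn where robots/P3P/favicon data lives.

print(find_site_metadata({"Metadata": "http://example.org/meta.rdf"}))
```

This keeps the overhead at a single extra fetch per site, which speaks to the
crawler-cost concern raised above.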
Received on Tuesday, 11 February 2003 10:45:07 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Thursday, 26 April 2012 12:47:16 GMT