Site Metadata - descriptive requirements

There's one aspect of the site metadata issue that I haven't seen  
much discussion of yet.

Imagine that site metadata takes off (and arguably, it already has;
think P3P, robots.txt, Google Sitemaps, and the access control work
now spooling up). If many different kinds of metadata are mapped to
a site based on the structure of its URIs, the structure of the site
itself can become constrained by that metadata.

This is especially true if the different site metadata
association/attachment mechanisms aren't well-aligned. For example,
many of the mechanisms mentioned take a coarse-grained globbing ("*")
approach, so that a site needs to put different kinds of content --
cut along one aspect -- in different directories. As more kinds of
site metadata come into use, organising a site becomes less about
modelling the resources and their relationships, and more about
satisfying the operational necessities of their metadata.
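
To make that concrete, here is a rough sketch in Python. The two rule
tables, their values and the paths are all made up for illustration;
they aren't taken from any of the formats above, and fnmatch-style
globbing is only an approximation of how those formats match.

```python
from fnmatch import fnmatchcase

# Hypothetical rules for two unrelated metadata mechanisms, each of
# which attaches metadata to a directory prefix via "*" globbing.
CRAWL_RULES = [("/archive/*", "noindex")]
PRIVACY_RULES = [("/members/*", "policy-b")]


def lookup(rules, path, default):
    """Return the value of the first rule whose glob matches the path."""
    for pattern, value in rules:
        if fnmatchcase(path, pattern):
            return value
    return default


# A members-only archive page needs both "noindex" and "policy-b",
# but it can only live in one directory.
path = "/members/archive/2005.html"
crawl = lookup(CRAWL_RULES, path, "index")
privacy = lookup(PRIVACY_RULES, path, "policy-a")
```

With these rules the page picks up the privacy policy but misses the
crawl rule, so the publisher has to either move the page or add rules
that mirror the other mechanism's layout.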

I talked about this in more detail in the Background section of
URISpace:
   http://www.w3.org/TR/urispace.html
but haven't seen much discussion since. My concern is that without a
framework for talking about Web site-level metadata, we're going to  
see a proliferation of small, slightly incompatible formats that  
constrain the choices that Web publishers have.

Some questions as a consequence:

* What is the right granularity for site metadata? E.g.,
per-directory, per-resource?
* Should it be possible to assign metadata based on URI components,
or on parts of components (e.g., substrings, query args, file
extensions)?
* Should site metadata be variable based on request method?
* Should site metadata be applicable to different variants (e.g., by  
content-language, content-type)?
* Should site metadata be interoperable with that defined by HTTP  
headers and WebDAV properties?
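
One way to picture the first four questions is as fields on a single
selector. This is only a sketch in Python; the Selector class and its
field names are hypothetical, not part of URISpace or any other
format, and a real design would need to settle precedence between
overlapping selectors.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class Selector:
    """Hypothetical selector; a field left at its default matches anything."""
    path_glob: str = "*"              # granularity: directory or resource
    extension: Optional[str] = None   # part of a URI component
    query_arg: Optional[str] = None   # name of a query argument
    method: Optional[str] = None      # request method, e.g. "GET"
    language: Optional[str] = None    # variant, by content-language

    def matches(self, uri, method="GET", language=None):
        parts = urlsplit(uri)
        if not fnmatchcase(parts.path, self.path_glob):
            return False
        if self.extension is not None and not parts.path.endswith("." + self.extension):
            return False
        if self.query_arg is not None:
            args = parse_qs(parts.query, keep_blank_values=True)
            if self.query_arg not in args:
                return False
        if self.method is not None and method.upper() != self.method:
            return False
        if self.language is not None and language != self.language:
            return False
        return True


french_pdfs = Selector(path_glob="/docs/*", extension="pdf",
                       method="GET", language="fr")
hit = french_pdfs.matches("http://example.org/docs/report.pdf?v=2", "GET", "fr")
miss = french_pdfs.matches("http://example.org/docs/report.pdf?v=2", "POST", "fr")
```

Here the same URI matches for GET but not for POST, which is the kind
of distinction a purely path-based mechanism can't express.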

I think the last question is especially interesting; I can see some
obvious benefits, but the scope of metadata for HTTP headers is
sometimes fuzzy (e.g., server-wide vs. site-wide vs.
resource-specific vs. representation-specific).

--
Mark Nottingham     http://www.mnot.net/

Received on Sunday, 26 March 2006 16:09:51 UTC