Re: Moving forward on Site Metadata [siteData-36] from Dan Connolly on 2006-05-05 (www-tag@w3.org from May 2006)

From: Dan Connolly <connolly@w3.org>
Date: Fri, 05 May 2006 17:06:37 -0500
To: Mark Nottingham <mnot@yahoo-inc.com>
Cc: www-tag@w3.org
Message-Id: <1146866797.22658.48.camel@dirk.w3.org>

Oops... this didn't make it thru my spam defenses
due to your new email address. Sorry about that.
I see you sent a message back on 26 March too. Oops.

On Mon, 2006-04-24 at 11:09 -0700, Mark Nottingham wrote: 
> What's the status of the Site Metadata work? From the Web site, it  
> doesn't appear that any progress has been made in some time.

Indeed. Tim Bray sketched something back in Jan 2004...
  http://www.tbray.org/ongoing/When/200x/2004/01/08/WebSite36
I took an action soon after that to
"Propose an example of a site description."
  -- http://www.w3.org/2004/01/12-tag-summary.html#siteData-36

I haven't made any tangible progress on it.

Vincent saw your message and put the issue on our agenda this
week.
  http://www.w3.org/2001/tag/2006/05/02-minutes.html#item02

Tim suggested again using an HTTP header pointing to some
site metadata in RDF... I see pretty much the same suggestion
in the Jan 2004 meeting record.

Using the favicon use case, I think the idea is something
like:

C->S:
	GET /some/page HTTP/1.1
	Host: www.foo.org

S->C:
	HTTP/1.1 200 OK
	Link: </sitedescr>; rel="meta"

And in /sitedescr we'd have something like

  <http://www.foo.org/sitedescr#thisSite>
    urispace:prefix "http://www.foo.org";
    chrome:icon <http://www.foo.org/icons/site_icon>.
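
A rough sketch, in Python, of the client side of that exchange: pull the
rel="meta" target out of a Link header and resolve it against the request
URI. The function name find_meta_link is made up for illustration, and the
parsing assumes the header is written in the usual <uri>; rel="..." form.
It is not a complete Link header parser.

```python
import re
from urllib.parse import urljoin

def find_meta_link(link_header, request_uri):
    """Return the absolute URI of the first rel="meta" link, or None.

    Hypothetical helper: handles only the simple <uri>; param=value form.
    """
    # Each link-value looks like: <uri-reference>; param=value; ...
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)', link_header):
        target, params = match.group(1), match.group(2)
        rel = re.search(r'\brel\s*=\s*"?([^";,]+)"?', params)
        # rel may hold several space-separated relation types
        if rel and "meta" in rel.group(1).split():
            # Relative targets resolve against the URI that was requested
            return urljoin(request_uri, target)
    return None

print(find_meta_link('</sitedescr>; rel="meta"',
                     "http://www.foo.org/some/page"))
```

The client would then GET the resulting URI and read the icon (or any
other site-level property) out of the description it finds there.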


> I ask because it's getting quite relevant; e.g., there is a Task  
> Force that's looking at mechanisms for cross-site access control  
> <http://lists.w3.org/Archives/Member/member-accesscontrol-tf/>, and a  
> site metadata format is one potential solution.

Yes... has anybody sketched that out in any detail?

(the tf seems to have member-confidential proceedings, which
makes collaboration with www-tag awkward.)

>  Additionally,  
> individual groups, sites and services continue to develop ad hoc site  
> metadata formats. Combined with the growing popularity of  
> microformats, service description, etc., I suspect Web metadata is  
> about to become a lot more useful and prevalent, and site metadata is  
> part of that.

You're dangerously close to volunteering to do my action for me ;-)

> I also wonder if it would help the TAG to bite off a more manageable  
> piece of the site metadata problem, by first considering what the  
> appropriate uses of site metadata are.
> 
> In particular, some people argue that any sort of site-wide metadata  
> is bad, because it disadvantages people who don't have control over a  
> whole site;
>    http://lists.w3.org/Archives/Public/public-webapi/2006Apr/0245.html


I like the idea of being able to declare icons, access control,
etc. by URI prefix.

I guess it doesn't allow for the *.foo.com pattern currently
covered by the <?access-control ?> PI.

I'll have to take another look at
  http://www.w3.org/TR/urispace.html

There are also the cwm/N3 mechanisms of log:uri combined
with str:startsWith and str:matches (regex matching). Prolly
better to use the more standard XQuery/XPath string manipulation
library; I suspect the overlap is around 95%.
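
To make the two kinds of matching concrete, here is a small Python sketch:
a plain string-prefix test (roughly what str:startsWith would give) and a
host wildcard test for patterns like *.foo.com. The function names are
made up, and the wildcard semantics ("*" stands for one or more characters
of the host name) are an assumption, not taken from the access-control PI
draft.

```python
import re
from urllib.parse import urlsplit

def in_prefix_space(uri, prefix):
    """True if uri starts with prefix, compared as plain strings."""
    return uri.startswith(prefix)

def host_matches(uri, pattern):
    """True if the host of uri matches a wildcard pattern such as *.foo.com.

    Assumed semantics: '*' matches one or more characters of the host.
    """
    host = urlsplit(uri).hostname or ""
    # Escape the pattern, then turn the escaped '*' into a regex wildcard
    regex = re.escape(pattern).replace(r"\*", ".+")
    return re.fullmatch(regex, host) is not None

print(in_prefix_space("http://www.foo.org/icons/site_icon",
                      "http://www.foo.org"))
print(host_matches("http://www.foo.com/page", "*.foo.com"))
```

One thing the sketch makes visible: a bare string prefix such as
"http://www.foo.org" also matches "http://www.foo.org.example.com/", so a
prefix meant to cover one site probably wants to end in "/".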

> On the other hand, there are use cases that require knowledge of  
> metadata for a particular resource before accessing it; for example,  
> privacy policy, robots policy, and other access control policy (which  
> is very relevant, as there's currently a TF looking at that topic).

Yes, the P3P case is extra tricky.

As I understand it, in the javascript access control case,
the sandbox in the client is trusted to fetch the data and
decide after it comes back whether to let the hosted javascript
app see it.

Robots also don't need to know the whole story before they
do their 1st GET.


> In these cases, doing a resource-specific policy request before each  
> request incurs too much of an overhead to be practical. For many of  
> them, a site-wide metadata format is very attractive (as seen in P3P  
> and Robot Exclusion).
> 
> Perhaps it would help to develop guidelines for the establishment of  
> new types of site-wide metadata, e.g.:
>    - Site metadata is most appropriate when it is applicable to a  
> potentially large number of resources, and knowledge of it is necessary  
> before access to a resource.
>    - Site metadata is least appropriate when it is specific to a  
> small number of resources, and knowledge of it is necessary before  
> access to a resource.
>    - Site metadata should be able to be mirrored in content (e.g.,  
> meta tags, microformat) and in HTTP headers (e.g., the Link header).
>    - Site metadata formats should be modular; it should be possible  
> to delegate authority to part of a site to a different resource  
> (e.g., policy for /foo/ delegated to /foo/policy.xml).

That's an interesting list. I'll have to think about it some more.
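
For the delegation item in that list, one possible reading is "longest
matching path prefix wins". A minimal Python sketch under that assumption
follows; the function name, the table layout and the /sitedescr default
are all hypothetical, and only the /foo/ to /foo/policy.xml pairing comes
from the list itself.

```python
def policy_for(path, delegations, default="/sitedescr"):
    """Return the policy resource responsible for path.

    delegations maps a path prefix to the resource that holds its policy.
    The longest matching prefix wins; otherwise the site-wide default is used.
    """
    best = None
    for prefix in delegations:
        if path.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    return delegations[best] if best is not None else default

# Example table: policy for /foo/ is delegated to /foo/policy.xml
delegations = {"/foo/": "/foo/policy.xml"}

print(policy_for("/foo/bar.html", delegations))
print(policy_for("/other/page", delegations))
```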

> Cheers,

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E
Received on Friday, 5 May 2006 22:07:07 UTC