Re: Content type for /site-meta (or HTTP header fragment format)

Eran,

I've made a couple of minor comments on this proposal, which in general I 
like: it does seem to be the well known location to end all well known 
locations (which I reckon is about the only justification there could be 
for a new well known location).

Eran Hammer-Lahav wrote:
> Context
> 
> The /site-meta proposal (a known-location solution for site metadata) [1]
> includes a simple XML format for representing site metadata directly or via
> links. In discussing the proposal and the appropriate format for the list of
> meta resources, John Panzer suggested using a simpler text format [2]
> directly based on the content of the Link header [3].
> 
> While I see the value of an XML format for this data, and was the main
> supporter of it, I now strongly support the idea of using a super-simple
> text-based document. Partially because it fits better with the current
> use-cases, and partially because I am an editor of a "competing" XML format
> which covers this use case (XRDS/XRD) but is too complex to be positioned as
> the default form.
> 
> I would like /site-meta to list a single text-based format with a clear
> Content-type associated with it. I also want the spec to explicitly allow
> user-agents to request other representations of the /site-meta resource with
> the default being the super-simple-text-based version. One such
> representation (I expect to be widely supported) will be
> application/xrd+xml.
> 
> 
> Some Questions (and answers)
> 
> - Should the /site-meta text format be restricted to a set of links or
> provide an easy path for extensions of some other kinds of records?
> 
> While I can't come up with compelling use cases for /site-meta to
> directly include other metadata, it is likely someone else will in the
> future. 

I fully understand the desire for extensibility and for not imposing 
restrictions unnecessarily. However, I do think it would be a big 
mistake to allow a /site-meta file to include anything other than links 
to data. Let's imagine you allowed, say, Dublin Core and Creative 
Commons to be encoded in a /site-meta file directly. Why not? They're 
well-defined, well used metadata systems that can often be applied to a 
whole site.

Why make people put this in a separate file when it could, surely, go in 
the /site-meta file? Well, you could allow it, and any other metadata - 
and hey presto you've just reinvented a WKL for POWDER, XRD and whatever 
comes next.

No... if /site-meta is the WKL to end all WKLs then it has to be just a 
set of pointers to where the 'real data' actually is. So I would say 
that there is a case for deliberately limiting the extensibility. As you 
go on to point out, if it supports an HTTP Link-like structure, that's 
already flexible and it meets the need. When extensibility leads to 
mission creep, things will go wrong.



> By replacing each record in John's proposal:
> 
> ---
> /robots.txt rel="robots"
> /p3p.xml rel="privacy"
> http://other.example.net/example rel="http://example.com/rel"
> ---
> 
> with actual Link headers:
> 
> ---
> Link: </robots.txt>; rel="robots"
> Link: </p3p.xml>; rel="privacy"
> Link: <http://other.example.net/example>; rel="http://example.com/rel"
> ---
> 
> other record types can be added in the future.

Indeed. Here are two that come to mind:

Link: </styles.css>; rel="stylesheet"; type="text/css"
Link: </powder.xml>; rel="describedby"; type="text/powder+xml"

The mobile world would probably like something like

Link: <http://m.example.com>;
   rel="http://example.org/mobile-vocab#mobile_entry_page"

Link: <http://example.com>;
   rel="http://example.org/mobile-vocab#desktop_entry_page"

(I'm basing this on the metaTXT work just getting going [PA1])

Oops... I'm straying into mission creep there, aren't I? I mean, are 
those URIs links or metadata? I hope it doesn't matter - I've used URIs 
where URIs are allowed.

One thing I have done in my first 2 examples is to include the type 
attribute (which if we're following the HTTP Link format is allowed and, 
IMHO, should be encouraged!)
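
To make the format concrete, here is a minimal Python sketch of how a 
client might read a file of such records. It is only an illustration: 
the function name parse_site_meta is hypothetical, and I am assuming 
that a record may be folded onto a following indented line (as in my 
mobile examples above) and that anything other than a Link record is 
rejected. None of this is taken from the draft.

```python
import re

# Hypothetical helper, not part of the /site-meta draft or of any library.
# Matches "Link: <target>" followed by the (possibly empty) parameter list.
_LINK_RE = re.compile(r'^Link:\s*<([^>]*)>(.*)$', re.IGNORECASE | re.DOTALL)
# Matches one '; name="value"' or '; name=value' parameter.
_PARAM_RE = re.compile(r';\s*([A-Za-z][\w-]*)\s*=\s*(?:"([^"]*)"|([^;\s]+))')


def parse_site_meta(text):
    """Return a list of (target, params) tuples, one per Link record."""
    # Assumption: an indented line continues the previous record,
    # in the style of folded HTTP headers.
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and records:
            records[-1] += " " + line.strip()
        else:
            records.append(line.strip())

    links = []
    for record in records:
        match = _LINK_RE.match(record)
        if match is None:
            # Links only: any other kind of record is refused.
            raise ValueError("not a Link record: %r" % record)
        target, rest = match.groups()
        params = {}
        for name, quoted, bare in _PARAM_RE.findall(rest):
            params[name.lower()] = quoted or bare
        links.append((target, params))
    return links


sample = '''Link: </styles.css>; rel="stylesheet"; type="text/css"
Link: <http://m.example.com>;
   rel="http://example.org/mobile-vocab#mobile_entry_page"
'''

for target, params in parse_site_meta(sample):
    print(target, params.get("rel"), params.get("type"))
```

Refusing non-Link records is a deliberate choice in this sketch: it 
keeps the file a set of pointers and nothing more. A client that cares 
about the type attribute can read it from the params dictionary before 
deciding whether to fetch the target at all.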

> This also means the same code
> used to read Link headers (or HTTP headers in general) can be used for this
> format. This also plays nicely with the idea of equating links in /site-meta
> to Links in individual resources' HTTP response headers.
> 
> - Should /site-meta define its own content type, use an existing content
> type, or define a new generic content type?
> 
> If we take the route of using an HTTP-header-like format for /site-meta, is
> there value in making this format generally available for other resources?
> RFC 2616 offers a similar construct in the form of message/http. It seems
> that as long as the document can be considered a valid HTTP request or
> response, we can use this content type.
> 
> So /site-meta can be considered a body-less HTTP response with Link headers.
> The question is, is such a header-fragment allowed in a message/http
> document? It is not clear if in this use-case, the Date header may be
> omitted, which is otherwise required for a valid response header. The Date
> header makes little sense in this context and should be omitted. Note that
> the HTTP header for GET /site-meta must still include Date.
> 
> 
> In Conclusion
> 
> 1. The idea of allowing multiple representations for /site-meta resources
> suggests the use of a more generic content type for the default (and the
> only required) representation than application/site-meta.

I'd stick with one format. Choice can be overrated and leads to 
confusion (and you thought I was a dripping wet liberal? Only when it 
suits me ;-) )

> 
> 2. There is value in using a single mechanism for metadata discovery, both
> for an individual resource (via HTTP Link header or HTML/ATOM Link element)
> and for a domain authority (via /site-meta list of links). Using the exact
> same semantics between HTTP Link and /site-meta links seems productive.

Agreed. And this further supports the one-format point.

> 
> 3. Preparing for some unknown need for extending /site-meta while not
> increasing complexity (assuming the Link header structure is simple enough)
> seems like a good idea.

Yes - but the flexibility is in the relationship and content types. Sign 
posts can point to towns, multi-lane highways, country dirt tracks and 
little 'ol houses on the prairie, but they're still sign posts and 
that's what, for me, /site-meta is about. Enough with the flexibility.

Actually, at least 3 use cases - robots.txt, p3p and POWDER - all have 
their own method of defining which sections of Web sites they refer to. 
If there is an argument for making /site-meta more complex or flexible, 
I'd say it would be in the area of defining a common method of doing 
that - but that means re-writing those specs so let's not go there.

> 
> 
> Action Items
> 
> * Change /site-meta draft to use the Link header format instead of the
> current XML proposal.

+1

> * If allowed, use message/http as the default content type for /site-meta.
> If not, register a new content type, preferably something like
> application/http-header-fragment, or just application/site-meta.

Why application? I'd say text was more appropriate. Application suggests 
something really complicated that needs a lot of processing. This is 
just a bunch of links and a little syntactic sugar.


> * Clarify that the content of /site-meta does not describe any actual
> resource or URI, but the abstract concept of 'web site' or 'domain
> authority', expressed as an HTTP header. In practice, it is still just a
> registry for resource locations to avoid more known-location solutions.

+1

> 
> Thoughts?
> 
> EHL
> 
> 
> [1] http://tools.ietf.org/html/draft-nottingham-site-meta-00
> [2] http://www.abstractioneer.org/2008/11/one-site-meta-to-rule-them-all.html
> [3] http://tools.ietf.org/html/draft-nottingham-http-link-header-02
> 

[PA1] http://www.visibilitymobile.com/Whitepaper_On_MetaTXT.pdf

-- 

Phil Archer
w. http://philarcher.org/

Received on Friday, 28 November 2008 23:45:16 UTC