Resource-DR relationship type

In our telecon this week [1] we talked about how a client might discover 
that a particular resource (and we're thinking mostly about HTML pages 
here) is described by a DR and what the client should be expected to do 
in order to process the data it receives. The discussion has continued 
on the member-list as well so this e-mail tries to summarise current 
group thinking and the open questions.

Basic problems:

1. How do you link to a DR in such a way that a processor that is only 
interested in DRs knows to follow the link?

2. If the RDF instance carrying the DR includes other RDF data, how can 
the processor only receive what it's interested in and not the other data?

Tools available to us:

1. Link elements with:
   - relationship type (existing or newly defined in an HTML profile)
   - MIME type
   - title

2. Meta tags
   - Tag names can be associated with a profile and namespace
   - Usual practice is that the subject of any meta tag is the document 
itself

3. RDFa
  - Can effectively embed RDF triples in XHTML
  - Can set the subject of RDF triples to be either the current document 
or another.

For the purposes of the following discussion, let H be an HTML page that 
carries the link to the Description Resource (DR).


Server-side processing
======================

Imagine that H includes a link tag like this:

<link rel="meta" type="application/rdf+xml" title="POWDER" 
href="http://example.com/describe?uri=referer" />

The expectation we were given at the Boston meeting [2] is that this 
should return RDF triples about H. From a POWDER perspective, this 
implies server-side processing as it requires the DR to be parsed and 
the correct triples extracted and returned. Thus, dereferencing the URI 
(http://example.com/describe?uri=referer) would give us something like:

<H> ex:colour "blue" ;
     ex:shape "square" ;
     wdr:describedBy <http://example.com/powder.rdf> .

The first two predicate/object pairs describe H. The third one says 
"this data came from http://example.com/powder.rdf which you might want 
to grab and process yourself to save making look ups to this service all 
the time."

This assumes that the client passes on <H> (presumably in the HTTP 
Referer header) and that the server includes software that can deal 
with this.
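
As a rough illustration of the server-side step, here is a minimal 
Python sketch. Everything in it is an assumption made for the example: 
the describe() function, the in-memory STORE standing in for a parsed 
powder.rdf, and the example.org page URIs are all hypothetical and not 
part of POWDER. It only shows the idea of returning the triples about 
the referring page plus a pointer back to the full DR.

```python
# Hypothetical sketch: names and data are illustrative, not part of POWDER.
DESCRIBED_BY = "wdr:describedBy"

# Stand-in for the parsed contents of http://example.com/powder.rdf,
# held as (subject, predicate, object) tuples.
STORE = [
    ("http://example.org/page.html", "ex:colour", "blue"),
    ("http://example.org/page.html", "ex:shape", "square"),
    ("http://example.org/other.html", "ex:colour", "red"),
]


def describe(headers, store=STORE, source="http://example.com/powder.rdf"):
    """Return only the triples whose subject is the referring page."""
    # HTTP spells this header name "Referer". A plain dict lookup is
    # case-sensitive, which a real server framework would smooth over.
    subject = headers.get("Referer")
    if subject is None:
        return []  # nothing to describe without knowing which page asked
    triples = [t for t in store if t[0] == subject]
    if triples:
        # Point back at the full DR so clients can fetch it themselves.
        triples.append((subject, DESCRIBED_BY, source))
    return triples
```

A request carrying Referer: http://example.org/page.html would get the 
two descriptive triples plus the wdr:describedBy pointer; a request with 
no Referer, or for a page the store knows nothing about, gets nothing.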

Personally, I think this has real potential for organisations like my 
own that plan to offer online services based on POWDER. I can see ICRA 
running such a service (quick plug here for our hosting company Kingston 
Communications [3] 'cos I can only make plans like this with their support!)

It might also have application in large (enlightened) content providers 
who wish to provide good discovery metadata that promotes their 
material. The key thing here is that POWDER is a background process - 
what comes out is straight RDF that can be used by any Semantic Web 
application.

If such services were in place, my expectation would be that a 
relatively small amount of data would be returned from such a request. 
Chaals' point about several KB of data coming down the pipe when all you 
really want to know is "is this thing mobileOK/child-friendly/licensed 
or whatever" shouldn't be a problem.

That's not the case, though, if the link points to an RDF instance that 
could be several MB in size, with only a few lines in it that constitute 
a DR. So we're again left with the question of how to declare the 
difference between those two cases in the link tag.

Well... it could be done with RDFa [4], which has an @about attribute 
for setting the subject of a statement, so it should be possible to make 
statements in the HTML that the thing at the end of the rel="meta" link 
is just going to send back triples about H in a POWDER-based system. 
Possible, but it's starting to look a little messy if we _require_ the 
use of RDFa (not to mention a whole new load of dependencies for the WG).

In my mind I'm imagining presenting all this to a bunch of highly 
sceptical content providers who need to be convinced of the benefits of 
adding good quality metadata to their material (why should we bother 
when it's time consuming and Google indexes our stuff just fine anyway? 
I know, I know, but that's the question we're going to be asked.) Now 
imagine telling those folk that they really ought to install this bit of 
software on their servers and add RDFa to all their pages to let the 
machines know that the service exists.

Hmmm...

Kevin's ideas on using meta tags not only to link to the DR but also to 
describe it (in particular, to provide authorship and validity 
information) make sense, in that you'd be giving the POWDER client a lot 
of information it can use to decide whether it wants to bother fetching 
the DR itself. Again, I think we'd need to use RDFa to do this; I don't 
think there's another (unambiguous) way to provide metadata in H about 
anything other than H. The main problem I see with this is in 
maintaining that data about the DR - you'd need to edit H if you updated 
the DR which may or may not be practical depending on the content 
management environment.

Andrea has suggested defining our own MIME type (presumably something 
like application/powder+xml ). This _is_ possible and if the WG ends up 
deciding that really is what we should do, then, OK we'll look into the 
IANA process for doing this and take the hit on things like getting it 
into Apache, IIS etc (mind you, Microsoft seems so averse to Sem Web 
that even the RDF MIME type isn't installed by default in IIS but that's 
another story). For me, the deciding factor here would be whether the 
structure of a DR ends up being such that off the shelf SW apps 
shouldn't process the data - which we're bending over backwards to avoid.

Whichever way we look at it though, I keep coming back to

<link rel="powder" type="application/rdf+xml" 
href="http://example.com/powder.rdf" title="mobileOK, ICRA, WCAG, TRUSTe" />

This tells you explicitly that there is an RDF instance at 
<http://example.com/powder.rdf> that uses POWDER to describe H.
It also gives you a processing hint that the vocabularies used to 
describe H are mobileOK, ICRA, WCAG and TRUSTe.
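
To make the client side of this concrete, here is a minimal Python 
sketch of how a processor might pick such a link out of H using only the 
standard library. The class and function names are hypothetical, 
rel="powder" is only the proposal under discussion here, and reading the 
title as a comma-separated list of vocabularies is an assumption drawn 
from the example above rather than anything the group has agreed.

```python
from html.parser import HTMLParser


class PowderLinkFinder(HTMLParser):
    """Collect <link rel="powder"> elements from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        a = dict(attrs)
        # rel may hold several space-separated relationship types.
        rels = (a.get("rel") or "").lower().split()
        if "powder" not in rels:
            return
        title = a.get("title") or ""
        self.links.append({
            "href": a.get("href"),
            "type": a.get("type"),
            # Assumption: title is a comma-separated vocabulary hint.
            "vocabularies": [v.strip() for v in title.split(",") if v.strip()],
        })


def find_powder_links(html):
    """Return a list of dicts, one per rel="powder" link in the page."""
    finder = PowderLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links
```

A client that doesn't know about POWDER simply never looks for the 
relationship type, which is the point of making it explicit.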

We know that defining rel="powder" can be done in an HTML profile and we 
can perhaps work to get it into the main HTML 5 docs too. This looks 
easier than defining a new MIME type.

Given the above link you would follow it if a) you know what POWDER is; 
and b) you are interested in those vocabularies. RDFa in the doc might 
also give you information on authorship and validity, so that you could 
add further conditions: whether the DR creator was someone you trusted 
and whether you were within the time period stated.
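
The decision just described could be sketched in Python roughly as 
follows. This is only an illustration under stated assumptions: the 
should_follow() function and all of its parameters are hypothetical, and 
how the creator and validity dates would actually be obtained from H 
(RDFa or otherwise) is exactly what is still open.

```python
from datetime import date


def should_follow(link_vocabs, wanted, creator=None, trusted=(),
                  valid_from=None, valid_until=None, today=None):
    """Decide whether a POWDER-aware client should fetch the DR."""
    today = today or date.today()
    # Only bother if the link advertises a vocabulary we care about.
    advertised = {v.lower() for v in link_vocabs}
    if not advertised & {v.lower() for v in wanted}:
        return False
    # Optional condition: a creator is stated and we do not trust it.
    if creator is not None and creator not in trusted:
        return False
    # Optional condition: we are outside the stated validity period.
    if valid_from is not None and today < valid_from:
        return False
    if valid_until is not None and today > valid_until:
        return False
    return True
```

Calling this function at all stands in for condition a), since only a 
client that knows what POWDER is would run such a check. The optional 
arguments are skipped when H gives no authorship or validity data.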

Picking up the point that Chaals was making on the call... we've said 
that a DR will be defined in terms of its full semantics but that we 
will also provide an XML schema against which it can be validated 
(Kevin's even volunteered to create it!). We're thus providing scope for 
real-world optimisation.

On the downside, we are pushing all processing onto the client and 
potentially alienating developers of existing RDF/OWL clients. Thus the 
adoption of POWDER is dependent on its integration in lots of clients 
rather than, perhaps, a single reference application that can be 
installed on the relatively small number of popular servers. However, 
personally, I do think this is the way to go. Clients can be made better 
by recognising DRs, especially if they're backed up by real-time 
authentication services (automated versions of 'click to verify').

Phil.




[1] http://www.w3.org/blog/powder/2007/12/12/meeting_summary_10_december_2007
[2] http://www.w3.org/blog/powder/2007/11/10/summary_of_face_to_face_meeting_held_dur_2007
[3] http://www.kcom.com
[4] http://www.w3.org/TR/rdfa-syntax/

Received on Wednesday, 12 December 2007 10:53:12 UTC