voiD, SIOC and the case for site-wide metadata discovery (Was: Re: [call for comments] voiD 1.0)

Simon Reinhardt wrote:
> 
> Keith Alexander wrote:
>> Can you explain why you prefer sioc:has_container to dcterms:isPartOf ?
> 
> Let's call it consistent use of a vocabulary. Since I'm using SIOC for 
> lots of things in the platform anyway (like, most resources in my 
> dataset are sioc:Items) it makes sense to use SIOC the way it is 
> expected to be used. Then again I also want to use voiD the way it is 
> expected to be used, so I'm in a dilemma. :-)

Actually, let me expand on that.

As we know, finding resource descriptions is still an open issue [1], but discussions around that are mostly about relating one resource to one other resource that has metadata about it. The other case is that you want to find out which set of resources a resource belongs to, and which other resources (that you might want to look up later) belong to that set as well.
There's the POWDER way which uses link rels to go from a resource to such a collective description [2] and in that collective description uses "irisets" to define which other resources fall into the scope of that description [3].
There's the voiD way where you can go to a voiD description file by use of dcterms:isPartOf and then use void:uriRegexPattern to define which other resources will be in the same dataset.
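
To make the voiD side concrete, here is a minimal sketch of the membership check: given a resource URI, find the datasets whose void:uriRegexPattern matches it. The dataset URI, the pattern and the function name are made-up examples, not taken from any real voiD file, and the patterns are held in a plain dict rather than parsed from RDF.

```python
import re

# Hypothetical data: dataset URI -> value of its void:uriRegexPattern.
# In practice these would be read from the voiD description file.
DATASET_PATTERNS = {
    "http://example.org/void.ttl#dataset": r"^http://example\.org/items/.+",
}


def datasets_for(uri, patterns=DATASET_PATTERNS):
    """Return the datasets whose URI pattern matches the given resource URI."""
    return [
        dataset
        for dataset, pattern in patterns.items()
        if re.search(pattern, uri)
    ]


if __name__ == "__main__":
    print(datasets_for("http://example.org/items/42"))
    print(datasets_for("http://example.org/about"))
```

The POWDER "iriset" grouping plays the same role; only the way the scope is written down differs.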

Those two ways of discovery are basically the same. But I think they solve an important issue, namely discovery of site-wide metadata, without restricting you in how you partition your URI space.
The TAG issue siteData-36 [4] is related to that and has a proposed solution: introducing another, final, top-level resource /site-meta [5]. But that is more about finding resources related to a site, while voiD is about finding resources related to a *data*set. The scope is different here: a dataset is not a site, and if you're interested in descriptions of a dataset you don't really care about where to find favicons for pages on a site etc.

But, to finally get to my point here, if you mix site description and dataset description you can kill two birds with one stone. You rid yourself of the need of having a central entry point to your site (/robots.txt or /site-meta or whatever) and can partition the URI space however you want. This is important for domains hosting several sites (a use-case that seems to become less and less important though) where the creator of the site doesn't have control over resources like /robots.txt.
SIOC is perfect for describing sites, so in my void.ttl I can have a description of my sioc:Site which is sioc:space_of a resource that is both a sioc:Container and a void:Dataset. Since void:uriRegexPattern has an rdfs:domain of void:Dataset, I can't really use it to say which pages are part of my site, but at least I can relate them to a container/dataset. Maybe I can use POWDER to describe the scope of the site; I'll have to look into it again.
The point of using one property to go from a resource to the collective description file (whether that property be sioc:has_container or dcterms:isPartOf) is simply to reduce duplication: I re-use the discovery mechanism of voiD and POWDER for finding site-wide metadata (like a privacy policy) as well.
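
The lookup chain described above can be sketched as follows: go from a resource to its container/dataset via sioc:has_container or dcterms:isPartOf, from there to the sioc:Site via sioc:space_of, and read site-wide metadata off the site. This is only an illustration over an in-memory set of triples; the example.org URIs and the ex:privacyPolicy property are hypothetical stand-ins (neither SIOC nor voiD is claimed to define such a property).

```python
SIOC = "http://rdfs.org/sioc/ns#"
DCTERMS = "http://purl.org/dc/terms/"
EX = "http://example.org/ns#"  # hypothetical namespace for illustration

# Hypothetical triples, as they might appear in a void.ttl plus one resource.
TRIPLES = {
    ("http://example.org/items/42", SIOC + "has_container",
     "http://example.org/void.ttl#dataset"),
    ("http://example.org/void.ttl#site", SIOC + "space_of",
     "http://example.org/void.ttl#dataset"),
    ("http://example.org/void.ttl#site", EX + "privacyPolicy",
     "http://example.org/privacy"),
}


def objects(triples, subject, predicate):
    """All objects of (subject, predicate, ?o), sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def subjects(triples, predicate, obj):
    """All subjects of (?s, predicate, obj), sorted for stable output."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def site_metadata(triples, resource, prop):
    """Follow resource -> container/dataset -> site, then read prop."""
    containers = (objects(triples, resource, SIOC + "has_container")
                  + objects(triples, resource, DCTERMS + "isPartOf"))
    found = []
    for container in containers:
        for site in subjects(triples, SIOC + "space_of", container):
            found.extend(objects(triples, site, prop))
    return found


if __name__ == "__main__":
    print(site_metadata(TRIPLES, "http://example.org/items/42",
                        EX + "privacyPolicy"))
```

Accepting either property in the first hop means the same walk works whichever of the two the publisher chose.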

Regards,
  Simon

[1] http://esw.w3.org/topic/FindingResourceDescriptions
[2] http://www.w3.org/TR/2008/WD-powder-dr-20081114/#assoc
[3] http://www.w3.org/TR/powder-grouping/#byIRI
[4] http://www.w3.org/2001/tag/group/track/issues/36
[5] http://tools.ietf.org/html/draft-nottingham-site-meta-00

Received on Friday, 30 January 2009 16:34:31 UTC