- From: William Waites <ww@styx.org>
- Date: Mon, 23 May 2011 15:01:40 +0200
- To: public-lod@w3.org
This is the RDF version of the question I just sent to the CKAN list [1]. It is somewhat a policy question, and I believe that in RDF terms the open world means the answer is basically, "yes, you can say what you want".

Consider the diagram here:

    http://semantic.ckan.net/group/?group=http://ckan.net/group/lld

It shows the interconnections between library datasets. You'll notice there is a partition. This partition is not really there. Here's why.

In the library world, perhaps more than elsewhere, it is common to do things like this:

    <http://example.org/issn/1234-5678> a bibo:Journal;
        blah blah blah some descriptions;
        owl:sameAs <urn:issn:1234-5678>.

This is because there are standard identifiers for lots of things that are found in libraries, and they even have a URN namespace. So when publishing this data it is a lot easier to do this than to go out and use something like Silk to try to find links. They're already implied by the identifiers we have in hand.

So given two such datasets, they are indeed connected in the way we think of RDF datasets as being connected. The semantics are not necessarily as strict as owl:sameAs: we would probably not choose to actually materialise its productions here, especially since the entities might be modelled in different, incompatible ways, and owl:sameAs is really not the right predicate to be using. But the datasets are at least connected with semantics along the lines of rdfs:seeAlso. The point is, the two datasets are transitively connected.

But because we have no extant dataset that contains all the ISSNs, particularly all ISSNs where the identifier is expressed as a urn: URI, we have nothing to put in our voiD linkset, which is how the relationships between these datasets are represented at a high level. So we have an apparent partition.

What I propose to do here is invent an implied dataset, the one that contains in principle the entire list of ISSNs. Something like:

    <urn:issn:0000-0000> a rdfs:Resource.
    <urn:issn:0000-0001> a rdfs:Resource.
    ...

but which actually should contain "X a rdfs:Resource" for everything in the valid lexical space of urn:issn, which may be (countably) infinite for all I know. Then, for each dataset I have that links into this space, I count the links up and make a linkset pointing at this imaginary dataset.

Obviously the same strategy applies anywhere there exist standard identifiers of some kind that are not HTTP URIs.

Does this make sense? Can we sensibly talk about, and even assert, the existence of a dataset of infinite size (whatever "existence" means)? Is this an abuse of DCat/voiD? Is this class of datasets a subset of sameAs.org (assuming sameAs.org to be complete in principle)?

Cheers,
-w

[1] http://lists.okfn.org/pipermail/ckan-discuss/2011-May/001269.html

--
William Waites <mailto:ww@styx.org>
http://river.styx.org/ww/ <sip:ww@styx.org>
F4B3 39BF E775 CF42 0BAB 3DF0 BE40 A6DF B06F FD45
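
A rough sketch of the counting step proposed above, in case it helps make the idea concrete. It assumes the triples of one dataset are available as plain (subject, predicate, object) string tuples, and it uses only the Python standard library. The function names and the URI given to the implied ISSN dataset are made up for illustration; nothing here is an existing CKAN or voiD tool. One side observation, offered tentatively: since an ISSN is seven digits plus a check character, the urn:issn lexical space looks finite (at most 10^7 values), though that would not hold for every identifier scheme.

```python
import re
from collections import Counter

# Lexical form of an ISSN URN: four digits, a hyphen, three digits, check character.
ISSN_URN = re.compile(r"^urn:issn:([0-9]{4})-([0-9]{3})([0-9Xx])$", re.IGNORECASE)

OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def issn_urn_is_valid(uri):
    """Return True if uri is a urn:issn: URI whose check character is correct."""
    match = ISSN_URN.match(uri)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Standard ISSN check: weight the first seven digits 8 down to 2, work mod 11.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    expected = (11 - total % 11) % 11
    expected_char = "X" if expected == 10 else str(expected)
    return match.group(3).upper() == expected_char


def count_issn_links(triples, predicates=(OWL_SAMEAS, RDFS_SEEALSO)):
    """Count, per predicate, the triples whose object is a valid urn:issn: URI."""
    counts = Counter()
    for _subject, predicate, obj in triples:
        if predicate in predicates and issn_urn_is_valid(obj):
            counts[predicate] += 1
    return counts


def linkset_turtle(dataset_uri, implied_dataset_uri, predicate, n_links):
    """Render a voiD linkset from a real dataset to the implied ISSN dataset.

    The linkset URI pattern (<dataset>/linkset/issn) is a hypothetical choice.
    """
    return (
        "@prefix void: <http://rdfs.org/ns/void#> .\n\n"
        f"<{dataset_uri}/linkset/issn> a void:Linkset ;\n"
        f"    void:subjectsTarget <{dataset_uri}> ;\n"
        f"    void:objectsTarget <{implied_dataset_uri}> ;\n"
        f"    void:linkPredicate <{predicate}> ;\n"
        f"    void:triples {n_links} .\n"
    )


if __name__ == "__main__":
    sample = [
        ("http://example.org/issn/0378-5955", OWL_SAMEAS, "urn:issn:0378-5955"),
        ("http://example.org/issn/1234-5679", OWL_SAMEAS, "urn:issn:1234-5679"),
        # Well-formed but with a wrong check character, so it is not counted.
        ("http://example.org/issn/1234-5678", OWL_SAMEAS, "urn:issn:1234-5678"),
    ]
    counts = count_issn_links(sample)
    for predicate, n in counts.items():
        print(linkset_turtle("http://example.org/dataset",
                             "http://example.org/implied/issn", predicate, n))
```

Whether to reject identifiers with a bad check character, as this sketch does, or to count every syntactically plausible urn:issn: object is a policy choice; the stricter reading matches "the valid lexical space" more closely.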
Received on Monday, 23 May 2011 13:02:04 UTC