Re: Can Shapes always be Classes?

Thanks for describing your use cases in more detail, Karen. I 
understand you don't have wiki access yet, but I am looking forward to 
seeing the DC stories on our Wiki in the future. I acknowledge that 
there is a group of use cases where almost random RDF triples need to be 
processed, and as stated elsewhere this can be represented via global 
SPIN constraints. What would need to be worked out is how to trigger the 
execution of constraints on such data, but this should be quite easy to 
agree on (e.g. some metadata triple pointing at a shapes graph).
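As a rough sketch of what such a hand-off might look like (the property ex:shapesGraph and the URIs are invented for illustration; in SPIN, a constraint attached to rdfs:Resource effectively applies globally, to every node regardless of rdf:type):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix spin: <http://spinrdf.org/spin#> .
@prefix sp:   <http://spinrdf.org/sp#> .
@prefix dce:  <http://purl.org/dc/elements/1.1/> .
@prefix ex:   <http://example.org/ns#> .

# Hypothetical metadata triple: the data graph points at the graph
# holding its constraints (ex:shapesGraph is made up here).
<http://example.org/data>
    ex:shapesGraph <http://example.org/constraints> .

# In the constraints graph: a "global" constraint attached to
# rdfs:Resource, so it is checked for every resource. The ASK query
# describes the violation condition; ?this is bound to each node.
rdfs:Resource
    spin:constraint [
        a sp:Ask ;
        sp:text """ASK WHERE {
            ?this dce:creator ?c .
            FILTER (!isLiteral(?c))
        }"""
    ] .
```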

Yet I remain convinced that for the majority of use cases, a 
class-instance structure is a natural way of representing data, and 
therefore associating classes with constraints is a good design 
short-cut. It's a bit like the old days of Expert Systems, which only 
supported global lists of rules and later evolved towards a more 
maintainable object-oriented grouping of rules. SPIN promotes the latter 
design without excluding the former.
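To sketch the class-centric style mentioned above (the class ex:Book and the choice of required property are illustrative, not taken from this thread):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix spin: <http://spinrdf.org/spin#> .
@prefix sp:   <http://spinrdf.org/sp#> .
@prefix dce:  <http://purl.org/dc/elements/1.1/> .
@prefix ex:   <http://example.org/ns#> .

# The constraint is grouped under a class, so it is only checked for
# instances of ex:Book - the "object-oriented" grouping of rules.
ex:Book
    a rdfs:Class ;
    spin:constraint [
        a sp:Ask ;
        # Violation: a book instance without a dce:title
        sp:text """ASK WHERE {
            FILTER NOT EXISTS { ?this dce:title ?t }
        }"""
    ] .
```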

Holger


On 11/21/2014 2:36, Karen Coyle wrote:
>
>
> On 11/19/14 5:12 PM, Holger Knublauch wrote:
>> Forgive my ignorance but isn't Dublin Core Elements just a collection of
>> properties? And those properties have no rdfs:domain, which means they
>> can be attached to anything.
>
> Or to nothing. There are many who use DCE on its own for description 
> of resources, adding no constraints. At the moment that metadata tends 
> to be exposed as JSON, but as we move more toward RDF, it could also be 
> exposed in RDF with a simple transformation. As it has been developed 
> without constraints or classes, I would expect this to look like:
>
> <URI>
>   dce:title "something" ;
>   dce:creator "something" ;
>   dce:publisher "something" .
>
> Any clue as to constraints will not be in the instance data, but in a 
> backend application whose rules could possibly be exported as what DCMI 
> calls an "application profile" - a set of rules that are separate from 
> the instance data.
>
> The advantage of this, btw, is that one can massively aggregate data 
> from these hundreds of systems without running into conflicts between 
> constraints. This is something that is commonly done in the cultural 
> heritage world, and that is already done with the non-RDF data coming 
> from these systems. What you get is imprecise but mashable (something 
> that I think RDF and OWL support).
>
>> But any class can add constraints on how
>> these DC properties shall be used, e.g. to indicate that dc:author must
>> be present and an xsd:string. But that topic feels unrelated to the
>> rdf:type discussion.
>
> It's related because it represents a case where a vocabulary is not 
> organized around classes. I know of two other vocabularies, not yet in 
> use but potentially important: #1 only provides 5-6 classes for over 
> 900 properties; #2 defines no classes at all and has about 800 
> properties. I'm not prepared to make assumptions about how the 
> instance data will look, but I can say that there is a huge range in 
> how people approach vocabulary design and the instance data that will 
> be produced. There's another vocabulary in progress that is proposing 
> a class with about 200-300 subclasses in order to handle the great 
> variety of kinds of creators that are recognized in library and 
> archive data - all of which need to be validated. (I will write these 
> up as use cases, but at the moment do not have access to the wiki.)
>
> The upshot is that if validation is based on class membership, then we 
> need to look at a wide variety of data -- especially data that did not 
> specifically design its classes as validation points.
>
> kc
>

Received on Thursday, 20 November 2014 22:54:11 UTC