Re: Shapes are Classes, even if you don't use rdf:type from Dimitris Kontokostas on 2015-01-26 (public-data-shapes-wg@w3.org from January 2015)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Mon, 26 Jan 2015 08:56:50 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a3cEG9ho8NDwrouuwCcy909eO9fhQ0HoQfWtPiqCECF+g@mail.gmail.com>
On Mon, Jan 26, 2015 at 1:38 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

> Hi Jerven,
>
> many thanks for speaking out. You have some really useful ideas.
>
> On 1/25/2015 22:25, Jerven Bolleman wrote:
>
>> Dear Working Group,
>>
>> I have tried to keep to the sidelines in this discussion,
>> but as a very interested user of this kind of tech I feel
>> I need to speak out.
>>
>> Shapes are Classes, in all practical and theoretical terms [1].
>>
>
> The terms are indeed extremely similar, to the point where the difference
> becomes almost philosophical. A class is a group of instances with similar
> characteristics. A shape is a group of nodes with similar characteristics.
> The difference is that class membership can also be asserted (via
> rdf:type). Yet OWL and LDOM also use class definitions as templates that
> can be used to "classify" instances. In OWL this is for example done by
> owl:equivalentClass definitions, in LDOM the same can be achieved via the
> built-in ldom:violatesConstraints function, which is basically a way to
> check whether a given node *could* be a valid instance of a class. You can
> rewrite that last sentence to use "Shape" without much difference.
>
> But the difference is that classes are already established. rdf:type is
> already established. rdfs:subClassOf is already established. The question
> is whether we need equivalent terms such as oslc:instanceShape and an
> equivalent of sub-shape relationship when we could just hook into the
> already existing infrastructure, hook into the already existing ontologies
> and be compatible with mainstream OO principles instead of coming up with
> some parallel universe.
>
>
>> ShEx shapes are just another way to infer class membership
>> (Closed World but otherwise basically OWL all over again)
>>
>> Instead of inferring example:A is a member of an owl:Class you now
>> infer that example:A is a member of things that have shape Y.
>> Using the word shape instead of Class is good to avoid confusion
>> between OWL and this standard, but they are the same thing just
>> relabelled.
>>
>>
>> The fact that shapes try to avoid rdf:type at all cost is
>> going to be a real problem in even trivial real world cases.
>> e.g.
>>
>> example:office example:telNo "+41 41 41 41" .
>>
>> example:person example:name "example person" ;
>>                example:telNo "+32 32 32 32" .
>>
>> <officeShape> {
>>         example:telNo xsd:string
>> }
>>
>> <personShape> {
>>         example:telNo xsd:string
>>         example:name xsd:string
>> }
>>
>> Is example:office a member of the <personShape>, just without a name?
>> Yes or no? If it is not clear in this trivial example, how can we, as
>> end users, reason about it and build stable software?
>>
>> LDOM, SPIN and OCLS all solve this by depending on the rdf:type.
>> It's simple and clear cut.
>>
>> Now sometimes a direct rdf:type use is not enough or can be confusing,
>> because what is lacking in all proposals is a way to associate a
>> shape/constraint with the context in which this constraint should apply.
>> I suggest introducing a new predicate _ldom:context_ which links to a
>> resource describing when the constraint could be used.
>>
>> e.g.
>> ex:Rectangle
>>         a rdfs:Class ;
>>         rdfs:subClassOf rdfs:Resource ;
>>         rdfs:label "Rectangle" ;
>>         ldom:property [
>>                 a ldom:PropertyConstraint ;  # This type declaration is optional
>>                 ldom:predicate ex:height ;
>>                 ldom:minCount 1 ;
>>                 ldom:maxCount 1 ;
>>                 ldom:valueType xsd:integer ;
>>                 rdfs:label "height" ;
>>                 rdfs:comment "The height of the Rectangle." ;
>>                 # Here we say where we intend the context to apply
>>                 ldom:context ex:Normal_Geometry ;
>>         ] .
>> ex:Normal_Geometry rdfs:label "Euclidean geometry in 2 dimensions" .
>>
>> If we give each ldom:property an explicit way to state in which context
>> it applies, we can actually deal with different people using foaf:Person
>> in multiple manners, e.g. the constraints on foaf:Person data being
>> submitted to a restaurant reservation site are different from the
>> constraints on foaf:Person data being submitted to a car rental site.
>>
>> The LDOM processor can then choose which contexts apply to its users'
>> needs. The default would sensibly be all contexts, with users able to
>> whitelist or blacklist contexts to include or exclude them as they want.
>>
>> This is a much cleaner solution than the shapes one. In shapes we attempt
>> to separate the ontologies and their constraints to avoid constraint
>> collisions, but we just hope that we don't import them anyway. With this
>> context suggestion, constraint collisions become something we can deal
>> with.
>>
>> The advantage of attaching a context to constraints is that you can then
>> say something like: a POST request with RDF data to book the rental of a
>> car requires 1 driver, 1 driver license and 1 payment method. Currently
>> in shapes and LDOM, an empty message validates as well :( Plus it allows
>> users to communicate when constraints should hold and when not, e.g.
>> describing the steps in a wizard, where step 1 has fewer constraints on
>> the submitted data than after step 2.
>>
>
> This sounds like a brilliant suggestion to me. It is a strong alternative
> to what we proposed using named graphs in the past, only that it makes the
> graph name explicit, which means that when named graphs get merged into a
> single flat graph, they can still be distinguished.
>
> I believe this approach can solve many use cases in which there were
> application-specific or portal-specific extra constraints. In order to make
> it a proposal to the group, I have therefore turned it into a corresponding
> requirement
>
> https://www.w3.org/2014/data-shapes/wiki/Requirements#Grouping_Constraints_into_Contexts


This is exactly how RDFUnit deals with constraint discovery and I'd
definitely +1 this approach:
http://lists.w3.org/Archives/Public/public-data-shapes-wg/2014Nov/0245.html
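
To make the white/black-listing idea quoted above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the Constraint class, the select_constraints function and the ex:CarRental / ex:RestaurantReservation context IRIs are hypothetical names invented for this example, and it is not how LDOM or RDFUnit actually implement constraint discovery.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Constraint:
    """Hypothetical, simplified stand-in for an ldom:property constraint."""
    predicate: str
    min_count: int
    context: Optional[str] = None  # IRI of the ldom:context resource, if any


def select_constraints(
    constraints: Iterable[Constraint],
    whitelist: Optional[Set[str]] = None,
    blacklist: Optional[Set[str]] = None,
) -> List[Constraint]:
    """Return the constraints a processor would apply.

    With no lists given, every constraint is kept (the "default is all"
    behaviour). A whitelist keeps only constraints whose context is listed,
    so constraints without a context are dropped in that case (an assumption
    of this sketch). A blacklist drops constraints whose context is listed.
    """
    selected = []
    for c in constraints:
        if whitelist is not None and c.context not in whitelist:
            continue
        if blacklist is not None and c.context in blacklist:
            continue
        selected.append(c)
    return selected


# Made-up constraints for two different uses of the same kind of data.
constraints = [
    Constraint("ex:driver", 1, "ex:CarRental"),
    Constraint("ex:driverLicense", 1, "ex:CarRental"),
    Constraint("ex:partySize", 1, "ex:RestaurantReservation"),
    Constraint("foaf:name", 1),  # no context attached
]

car_rental = select_constraints(constraints, whitelist={"ex:CarRental"})
print([c.predicate for c in car_rental])
```

Whether a constraint without any context should survive a whitelist is a design choice the thread does not settle; the sketch drops it, but the opposite choice would be just as easy to write.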


>
>
>> Secondly, I do think that LDOM should be able to work from predicates as
>> well.
>>
>> ex:widthIn_cm a rdf:Property ;
>>          rdfs:label "width in centimetre" ;
>>          ldom:property [ ldom:valueType xsd:positiveInteger ;
>>                          ldom:context ex:realSpace ] .
>>
>> Allowing this kind of construct should help the dc:terms case where
>> rdf:types are not specified.
>>
>> While modelling from a predicate is not everyone's cup of tea, I find
>> that it meshes nicely with the Smalltalk message-based OO paradigm, in
>> comparison to the conventional ADT-type OO paradigm of Java and C++.
>> Which is why I believe it should have a place in this standard.
>>
>
> Again, a very interesting idea that looks implementable. I am undecided
> whether I would propose this as a requirement yet, because there might be a
> more general solution which is to use a combination of
> ldom:GlobalConstraint and templates. Your example above could be turned
> into a reusable LDOM Template with an argument ldom:valueType and an
> argument ldom:predicate. We (or anyone really) could create a template
> superclass for all the property-related global constraint templates. These
> would even have an explicit reference to the predicate that tools could use
> for display purposes etc. So the following would already work without
> requiring changes to the execution engine:
>
> ex:WidthInCmConstraints
>     a ldom:GlobalPropertyConstraint ;
>     ldom:predicate ex:widthIn_cm ;
>     ldom:valueType xsd:positiveInteger ;
> .
>
> and the ldom:sparql behind this would look for all occurrences of
> ex:widthIn_cm in the middle of a triple, regardless of the type of the
> subject.
>
> While I would indeed prefer to align with UML-style OO as much as possible
> (and have these properties attached to rdfs:Resource), you have a good
> point that sometimes this may be inconvenient.
>
> I welcome further discussion of this idea.
>
>
>> Sometimes data does not have types associated with it. In this
>> relatively rare case I humbly suggest that the user use an existing W3C
>> standard to infer a type: namely OWL. And if OWL doesn't float their
>> boat, then use a SPARQL update statement. Totally typeless data is rare
>> and should not be the primary use case for this WG.
>>
>
> +1 there is the danger of designing a language that is optimized for a
> small set of corner cases, while the majority of cases is unnecessarily
> complicated (e.g. by having parallel definitions such as ex:Person and
> ex:PersonShape).
>
> Holger
>
>
>
>
>> e.g.
>>
>> <officeShape> {
>>         example:telNo xsd:string
>> }
>>
>> is practically equivalent to
>>
>> :officeShape a owl:Class ;
>>          rdfs:subClassOf [ a owl:Restriction ;
>>                            owl:onProperty example:telNo ;
>>                            owl:minCardinality 1
>>                          ] .
>>
>> In both cases some kind of reasoning has to take place to determine if
>> the following triple
>>
>>   example:office example:telNo "+41 41 41 41" .
>>
>> means that triples about example:office meet the criteria of
>> <officeShape>.
>>
>> Now get back to work and standardise something fantastic !
>>
>> Sincere regards,
>> Jerven Bolleman
>>
>> [1] If it quacks like a duck and does not carry a shotgun then for all
>> practical purposes it is a duck. Although for our favourite instance
>> example Dick Cheney it's “If it quacks like a duck then it's a target” ;)
>> even if what quacks wears a bright fluorescent jacket and practices law.
>>
>> -------------------------------------------------------------------
>> Jerven Bolleman                        Jerven.Bolleman@isb-sib.ch
>> SIB Swiss Institute of Bioinformatics      Tel: +41 (0)22 379 58 85
>> CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
>> 1211 Geneve 4,
>> Switzerland     www.isb-sib.ch - www.uniprot.org
>> Follow us at https://twitter.com/#!/uniprot
>> -------------------------------------------------------------------
>>
>>
>>
>
>
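
On the predicate-based constraint quoted above (checking every occurrence of ex:widthIn_cm regardless of the subject's type), a rough sketch of the check could look as follows. This is plain Python over tuples, not the ldom:sparql that Holger refers to; find_violations and is_positive_integer are hypothetical helpers, the sample triples are made up, and literals are assumed to be already parsed into Python values.

```python
from typing import Callable, Iterable, List, Tuple

# A triple is (subject, predicate, object); objects are plain Python values here.
Triple = Tuple[str, str, object]


def is_positive_integer(value: object) -> bool:
    """Rough analogue of an xsd:positiveInteger value check (bool is excluded)."""
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def find_violations(
    triples: Iterable[Triple],
    predicate: str,
    is_valid: Callable[[object], bool],
) -> List[Triple]:
    """Return every triple using `predicate` whose object fails `is_valid`.

    The subject's rdf:type is never consulted, which is the point of a
    predicate-based (global) constraint.
    """
    return [t for t in triples if t[1] == predicate and not is_valid(t[2])]


triples = [
    ("ex:table", "ex:widthIn_cm", 120),
    ("ex:untypedThing", "ex:widthIn_cm", -5),
    ("ex:chair", "ex:widthIn_cm", "wide"),
    ("ex:table", "rdfs:label", "Table"),
]

violations = find_violations(triples, "ex:widthIn_cm", is_positive_integer)
for subject, _, value in violations:
    print(subject, "has an invalid width:", repr(value))
```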


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
Received on Monday, 26 January 2015 06:57:46 UTC