- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Mon, 26 Jan 2015 08:56:50 +0200
- To: Holger Knublauch <holger@topquadrant.com>
- Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
- Message-ID: <CA+u4+a3cEG9ho8NDwrouuwCcy909eO9fhQ0HoQfWtPiqCECF+g@mail.gmail.com>
On Mon, Jan 26, 2015 at 1:38 AM, Holger Knublauch <holger@topquadrant.com> wrote:

> Hi Jerven,
>
> many thanks for speaking out. You have some really useful ideas.
>
> On 1/25/2015 22:25, Jerven Bolleman wrote:
>
>> Dear Working Group,
>>
>> I have tried to keep to the sidelines in this discussion,
>> but as a very interested user of this kind of tech I feel
>> I need to speak out.
>>
>> Shapes are Classes, in all practical and theoretical terms [1].
>
> The terms are indeed extremely similar, to the point where the difference
> becomes almost philosophical. A class is a group of instances with similar
> characteristics. A shape is a group of nodes with similar characteristics.
> The difference is that class membership can also be asserted (via
> rdf:type). Yet OWL and LDOM also use class definitions as templates that
> can be used to "classify" instances. In OWL this is for example done by
> owl:equivalentClass definitions, in LDOM the same can be achieved via the
> built-in ldom:violatesConstraints function, which is basically a way to
> check whether a given node *could* be a valid instance of a class. You can
> rewrite that last sentence to use "Shape" without much difference.
>
> But the difference is that classes are already established. rdf:type is
> already established. rdfs:subClassOf is already established. The question
> is whether we need equivalent terms such as oslc:instanceShape and an
> equivalent of sub-shape relationship when we could just hook into the
> already existing infrastructure, hook into the already existing ontologies
> and be compatible to mainstream OO principles instead of coming up with
> some parallel universe.
>
>> ShEX shapes are just another way to infer class membership
>> (Closed World but otherwise basically OWL all over again)
>>
>> Instead of inferring example:A is a member of an owl:Class you now
>> infer that example:A is a member of things that have shape Y.
>> Using the word shape instead of Class is good to avoid confusion
>> between OWL and this standard, but they are the same thing just
>> relabelled.
>>
>> The fact that shapes try to avoid rdf:type at all cost is
>> going to be a real problem in even trivial real world cases.
>> e.g.
>>
>> example:office example:telNo "+41 41 41 41" .
>>
>> example:person example:name "example person" ;
>>                example:telNo "+32 32 32 32" .
>>
>> <officeShape> {
>>   example:telNo xsd:string
>> }
>>
>> <personShape> {
>>   example:telNo xsd:string
>>   example:name xsd:string
>> }
>>
>> Is example:office a member of the <personShape> just without a phone
>> number?
>> Yes or No. If it is not clear in this trivial example, how can we end users
>> reason about it and build stable software?
>>
>> LDOM, SPIN and OCLS all solve this by depending on the rdf:type.
>> It's simple and clear cut.
>>
>> Now sometimes a direct rdf:type use is not enough or can be confusing,
>> because what is lacking in all proposals is a way of associating a
>> shape/constraint with the context in which this constraint should apply.
>> This could be addressed by introducing a new predicate _ldom:context_
>> which links to a resource describing when the constraint could be used.
>>
>> e.g.
>> ex:Rectangle
>>     a rdfs:Class ;
>>     rdfs:subClassOf rdfs:Resource ;
>>     rdfs:label "Rectangle" ;
>>     ldom:property [
>>         a ldom:PropertyConstraint ;  # This type declaration is optional
>>         ldom:predicate ex:height ;
>>         ldom:minCount 1 ;
>>         ldom:maxCount 1 ;
>>         ldom:valueType xsd:integer ;
>>         rdfs:label "height" ;
>>         rdfs:comment "The height of the Rectangle." ;
>>         ldom:context ex:Normal_Geometry ;  # Here we say where we intend the context to apply
>>     ] .
>> ex:Normal_Geometry rdfs:label "Euclidean geometry in 2 dimensions" .
>>
>> If we give each ldom:property an explicit way to state in which context
>> it applies, we can actually deal with different people using foaf:person
>> in multiple manners.
>> e.g.
>> the constraints on foaf:person data being submitted to a restaurant
>> reservation site are different from the constraints on foaf:person data
>> being submitted to a car rental site.
>>
>> The LDOM processor can then choose to state which contexts apply to its
>> users' needs.
>> The default would sensibly be all, and users could use white or black
>> lists to include or exclude contexts as they want.
>>
>> This is a much cleaner solution than the shapes one. In shapes we attempt
>> to separate the ontologies and their constraints to avoid constraint
>> collisions, but we just hope that we don't import them anyway.
>> With this context suggestion, constraint collisions become something we
>> can deal with.
>>
>> The advantage of attaching a context to constraints is that you can then
>> say something like: a POST request with RDF data to book the rental of a
>> car requires 1 driver, 1 driver license and 1 payment method.
>> Currently in shapes and ldom, an empty message validates as well :( Plus
>> it allows users to communicate when constraints should hold and when not,
>> e.g. describing the steps in a wizard, where step 1 has fewer constraints
>> on the submitted data than after step 2.
>
> This sounds like a brilliant suggestion to me. It is a strong alternative
> to what we proposed using named graphs in the past, only that it makes the
> graph name explicit, which means that when named graphs get merged into a
> single flat graph, they can still be distinguished.
>
> I believe this approach can solve many use cases in which there were
> application-specific or portal-specific extra constraints.
> In order to make it a proposal to the group, I have therefore turned it
> into a corresponding requirement
>
> https://www.w3.org/2014/data-shapes/wiki/Requirements#Grouping_Constraints_into_Contexts

This is exactly how RDFUnit deals with constraint discovery, and I'd
definitely +1 this approach:
http://lists.w3.org/Archives/Public/public-data-shapes-wg/2014Nov/0245.html

>> Secondly, I do think that ldom should be able to work from predicates as
>> well.
>>
>> ex:widthIn_cm a rdf:Property ;
>>     rdfs:label "width in centimetre" ;
>>     ldom:property [ ldom:valueType xsd:positiveInteger ;
>>                     ldom:context ex:realSpace ] .
>>
>> Allowing this kind of construct should help the dc:terms case where
>> rdf:types are not specified.
>>
>> While modelling from a predicate is not everyone's cup of tea, I find that
>> it meshes nicely with the Smalltalk message based OO paradigm, in
>> comparison to the conventional ADT type OO paradigm of Java&C++. Which is
>> why I believe it should have a place in this standard.
>
> Again, a very interesting idea that looks implementable. I am undecided
> whether I would propose this as a requirement yet, because there might be a
> more general solution which is to use a combination of
> ldom:GlobalConstraint and templates. Your example above could be turned
> into a reusable LDOM Template with an argument ldom:valueType and an
> argument ldom:predicate. We (or anyone really) could create a template
> superclass for all the property-related global constraint templates. These
> would even have an explicit reference to the predicate that tools could use
> for display purposes etc. So the following would already work without
> requiring changes to the execution engine:
>
> ex:WidthInCmConstraints
>     a ldom:GlobalPropertyConstraint ;
>     ldom:predicate ex:widthIn_cm ;
>     ldom:valueType xsd:positiveInteger ;
>     .
>
> and the ldom:sparql behind this would look for all occurrences of
> ex:widthIn_cm in the middle of a triple, regardless of the type of the
> subject.
>
> While I would indeed prefer to align with UML-style OO as much as possible
> (and have these properties attached to rdfs:Resource), you have a good
> point that sometimes this may be inconvenient.
>
> I welcome further discussion of this idea.
>
>> Sometimes data does not have types associated with it. In this
>> relatively rare case I humbly suggest that the user use an existing W3C
>> standard to infer a type: namely OWL. And if OWL doesn't float their boat
>> then use a SPARQL update statement.
>> Totally typeless data is rare and should not be the primary use case for
>> this WG.
>
> +1 there is the danger of designing a language that is optimized for a
> small set of corner cases, while the majority of cases is unnecessarily
> complicated (e.g. by having parallel definitions such as ex:Person and
> ex:PersonShape).
>
> Holger
>
>> e.g.
>>
>> <officeShape> {
>>   example:telNo xsd:string
>> }
>>
>> is practically equivalent to
>>
>> :officeShape a owl:Class ;
>>     rdfs:subClassOf [ a owl:Restriction ;
>>         owl:onProperty example:telNo ;
>>         owl:minCardinality 1
>>     ] .
>>
>> In both cases some kind of reasoning has to take place to determine if
>> the following triple
>>
>> example:office example:telNo "+41 41 41 41" .
>>
>> means that triples about example:office meet the criteria of
>> <officeShape>.
>>
>> Now get back to work and standardise something fantastic !
>>
>> Sincere regards,
>> Jerven Bolleman
>>
>> [1] If it quacks like a duck and does not carry a shotgun then for all
>> practical purposes it is a duck. Although for our favourite instance
>> example Dick Cheney it's "If it quacks like a duck then it's a target" ;)
>> even if what quacks wears a bright fluorescent jacket and practices law.
>>
>> -------------------------------------------------------------------
>> Jerven Bolleman                        Jerven.Bolleman@isb-sib.ch
>> SIB Swiss Institute of Bioinformatics  Tel: +41 (0)22 379 58 85
>> CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
>> 1211 Geneve 4,
>> Switzerland     www.isb-sib.ch - www.uniprot.org
>> Follow us at https://twitter.com/#!/uniprot
>> -------------------------------------------------------------------

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
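
P.S. As a rough illustration of the context idea discussed in this thread (each constraint carries a context, and the processor includes or excludes contexts), here is a minimal Python sketch. It is not LDOM or RDFUnit code. The `Constraint` class, the `validate` function, the `ex:driverLicense` property and the `ex:CarRental` / `ex:RestaurantReservation` context names are all hypothetical, and triples are plain string tuples rather than objects from a real RDF library.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


@dataclass(frozen=True)
class Constraint:
    """Hypothetical stand-in for an ldom:property block carrying ldom:context."""
    cls: str                  # class the constraint is attached to
    predicate: str            # constrained property
    min_count: int
    max_count: Optional[int]  # None means unbounded
    context: str              # value of the proposed ldom:context


def validate(
    triples: Iterable[Triple],
    constraints: Iterable[Constraint],
    focus: str,
    include: Optional[Iterable[str]] = None,  # None means "all contexts"
    exclude: Iterable[str] = (),
) -> List[str]:
    """Check cardinality constraints for one node, filtered by context."""
    triples = list(triples)
    included = None if include is None else frozenset(include)
    excluded = frozenset(exclude)
    # Constraints are selected through rdf:type, as argued in the thread.
    types = {o for s, p, o in triples if s == focus and p == "rdf:type"}
    violations = []
    for c in constraints:
        if c.cls not in types:
            continue
        if included is not None and c.context not in included:
            continue
        if c.context in excluded:
            continue
        count = sum(1 for s, p, _ in triples if s == focus and p == c.predicate)
        too_few = count < c.min_count
        too_many = c.max_count is not None and count > c.max_count
        if too_few or too_many:
            violations.append(
                f"{focus}: {c.predicate} occurs {count} time(s) "
                f"in context {c.context}"
            )
    return violations


# Assumed example data: the same foaf:Person class, constrained differently
# by a restaurant reservation site and a car rental site.
CONSTRAINTS = [
    Constraint("foaf:Person", "foaf:name", 1, None, "ex:RestaurantReservation"),
    Constraint("foaf:Person", "ex:driverLicense", 1, 1, "ex:CarRental"),
]
DATA = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
]

if __name__ == "__main__":
    print(validate(DATA, CONSTRAINTS, "ex:alice"))
    print(validate(DATA, CONSTRAINTS, "ex:alice", exclude={"ex:CarRental"}))
```

With all contexts active, the node is reported for the missing driver license. Excluding the car rental context, or including only the restaurant one, leaves the same data valid. Note that a node without any rdf:type still passes trivially in this sketch, which is the "empty message validates" concern raised above and is not solved here.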
Received on Monday, 26 January 2015 06:57:46 UTC