Re: AW: Thoughts on validation requirements from Peter F. Patel-Schneider on 2014-07-29 (public-rdf-shapes@w3.org from July 2014)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Tue, 29 Jul 2014 08:01:08 -0700
To: Eric Prud'hommeaux <eric@w3.org>
CC: public-rdf-shapes@w3.org, Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>, "Bosch, Thomas" <Thomas.Bosch@gesis.org>
Message-ID: <53D7B734.5060109@gmail.com>

On 07/29/2014 03:43 AM, Eric Prud'hommeaux wrote:
> * Peter F. Patel-Schneider <pfpschneider@gmail.com> [2014-07-28 07:54-0700]
>> On 07/28/2014 02:20 AM, Eric Prud'hommeaux wrote:
>>> On Jul 28, 2014 12:08 AM, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
>>> wrote:

[...]

>> An RDF document, on the other hand, almost invariably contains
>> multiple somethings, very often not arranged in a tree, and
>> sometimes even without any connection between them.  In RDF it is
>> generally permissible to have any sort of information, whereas XML
>> information is generally required to fit into what is expected.
>
> I agree, but fear this is a sort of selection bias.

Well, obviously there is a bias towards using RDF for multiple somethings, 
because RDF is good at that and other formats are not.  Because of this 
virtuous bias, there is the concomitant bias that there is relatively less RDF 
that is used for single somethings.  There is, of course, nothing wrong with 
this so far.

It may be that because RDF is good for multiple somethings, some people think 
that it is not good for single somethings.  If so, this would be somewhat 
unfortunate.

However, this certainly doesn't mean that RDF validation should ignore the 
common situation of multiple somethings, most or all with explicit types.  Nor 
does it mean that RDF validation should be targeted towards single untyped 
somethings.  To do either of these is to ignore RDF's strengths.

So I remain very skeptical that ShEx is a viable start towards RDF validation, 
as it appears to me to be targeted towards an uncommon use of RDF and not 
easily extended to nicely cover the bulk of extant and proposed RDF.

> Perhaps the
> majority of LDP uses include a backend which is not a triple store
> (possibly SQL, possibly state stored in the position of a lightswitch
> on a wall). In these cases, the data one posts must be limited to the
> exact arrangement of somethings that the server expects or data will
> be (silently) dropped. I suspect that the majority of the business use
> cases on the horizon for RDF involve services that are not willing to
> store arbitrary triples.

Even if true, this is at best an argument for validation that covers all 
(local) triples.  It still doesn't get one from multiple somethings to single 
somethings.  I'm also still skeptical that covering all (local) triples is a 
good idea even here, as it would prohibit, for example, extra information 
coming from a node belonging to an unexpected (or maybe even expected) subtype.
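
The subtype concern can be illustrated with a minimal sketch, assuming 
triples are plain Python tuples and a "closed" check is one that rejects any 
local property not on an allowed list.  The function name, the ex: 
identifiers and the data are all hypothetical and are not taken from ShEx or 
any other proposal.

```python
def unexpected_properties(triples, node, allowed):
    """Return the properties on `node` that a closed check would reject."""
    return sorted({p for (s, p, o) in triples if s == node and p not in allowed})

# A node of an (assumed) subtype ex:Student carrying one extra property.
student_triples = [
    ("ex:carol", "rdf:type", "ex:Student"),
    ("ex:carol", "ex:name", "Carol"),
    ("ex:carol", "ex:studentId", "S-17"),
]

# What a hypothetical service expecting plain persons would allow.
person_allowed = {"rdf:type", "ex:name"}

rejected = unexpected_properties(student_triples, "ex:carol", person_allowed)
print(rejected)
```

Under these assumptions the closed check flags ex:studentId, even though the 
node carries everything that is asked of a person.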

>> Validation then should work differently in RDF than in XML.  My view
>> of RDF validation is determining whether the instances of a type
>> (not necessarily explicitly signalled by an rdf:type link) meet some
>> constraint, and that RDF validation generally involves multiple
>> types, often unrelated types.  I don't see how ShEx can do this, and
>> thus my questions as to how ShEx can do RDF validation.
>
> What if shapes were types? I think that would meet your definition.

Well, that's the method used in Stardog ICV, and in lots of work on 
constraints over logical formalisms (including description logics).  However, 
just making shapes be types doesn't immediately get one from ShEx to something 
that can nicely handle multiple somethings in RDF.  One also needs machinery 
to require that each instance of a particular type must match a particular 
constraint type.
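
A minimal sketch of that kind of machinery, assuming triples are plain Python 
tuples, types are signalled explicitly by rdf:type, and a constraint is 
nothing more than a set of required properties per class.  All names here are 
hypothetical; this is not how Stardog ICV or ShEx is implemented.

```python
RDF_TYPE = "rdf:type"

def instances_of(triples, cls):
    """All nodes explicitly typed as `cls`."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o == cls}

def violations(triples, constraints):
    """Check every instance of every constrained class.

    `constraints` maps a class to the set of properties each of its
    instances must have.  Returns (node, class, missing property) tuples.
    """
    found = []
    for cls, required in constraints.items():
        for node in sorted(instances_of(triples, cls)):
            present = {p for (s, p, o) in triples if s == node}
            for prop in sorted(required - present):
                found.append((node, cls, prop))
    return found

# Multiple somethings of unrelated types, plus one untyped node.
graph = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:acme", RDF_TYPE, "ex:Company"),
    ("ex:acme", "ex:legalName", "Acme"),
    ("ex:misc", "ex:note", "untyped, so unconstrained"),
]
class_constraints = {
    "ex:Person": {"ex:name"},
    "ex:Company": {"ex:legalName"},
}

print(violations(graph, class_constraints))
```

The point of the sketch is that validation is driven from the classes to 
their instances across the whole graph, rather than from a single starting 
node.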

> There's some language (ShEx, Resource Shapes, Description Set Profiles
> or something else whose name I can't recall) to verify that a node in
> an instance graph matches a declared structure in a schema. Some
> mechanism like oslc:resourceShape associates a graph node with that
> structure. Does that fit your view?

Maybe.  I'm not sure how Resource Shapes 2.0 works, as the description is very 
loose.  It does appear that typed shapes are what is intended to be used for 
what I think of as the usual case of RDF validation - requiring that instances 
of a class have a particular shape.  However, some aspects of Resource Shapes 
2.0 appear to be inimical to type hierarchies.

peter

Received on Tuesday, 29 July 2014 15:01:52 UTC