Re: Analysis of Example in ShEx paper submitted to SWJ from Jose Emilio Labra Gayo on 2016-01-02 (public-data-shapes-wg@w3.org from January 2016)

From: Jose Emilio Labra Gayo <jelabra@gmail.com>
Date: Sat, 2 Jan 2016 08:41:01 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: "Eric Prud'hommeaux" <eric@w3.org>, RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <CAJadXX+=80DFXKJLruVgSr_r4jiBHDrFRpvqGM_thgohxM+Wpw@mail.gmail.com>

>
> >
> > 1 a little explanatory text to the effect of "The original
> >   representation of the web index included type arcs on every
> >   node. This is not the case for RDF data in general so we are
> >   modifying the use case to illustrate how validation occurs without
> >   discriminating type arcs."
> >
> > 2 abandon the web index use case and cook up something much less
> >   documented.
> >
> > IMO, 1 seems much more satisfactory to readers in general.
>
> I disagree, particularly given the thrust of the submission.


Maybe you are trying to impose a thrust on the submission that is not the
one the authors intended.

> Given that the paper appears to be about how suitable ShEx is for linked
> data portals, the ideal would be to show this with actual use cases.


The paper is not "just" about how suitable ShEx is for linked data portals:
the paper describes a linked data portal using ShEx, talks about how it can
be used to validate it with some tools, describes the same data model in
SHACL, proposes a tool that can generate that data model on demand as a
benchmarking tool and concludes that "the benefits of validation using
either ShEx or SHACL can help the adoption of RDF based solutions where the
quality of data is an important issue."

> Features of ShEx that go beyond the actual use cases could be covered in a
> separate section.


The features have been included because they fit into the use case that is
being described. We already have a section devoted to "Advanced features"
where we talk about other features that didn't fit so well into that use
case or whose introduction was considered less important.


> If something other than actual use cases is being employed it seems to me
> to be better to make up something and show that this something is
> illustrative of actual use cases.
>
> Using a modified use case just looks like the modifications were added only
> because they can be handled by the technology at hand.
>

All the modifications can be justified by that use case. I have already
said that some modifications were introduced to make the paper more
readable and less repetitive. Other modifications were introduced because
we considered them to be important. For example, the introduction of CLOSED
shapes is interesting because, at the time we wrote the original paper,
the distinction between closed and open shapes was not so clear (at least
to me): in the original data model every shape was closed. In the paper,
however, we thought that it was more interesting to have open shapes by
default and to define one of the shapes as closed. Something similar
happened with the inclusion of disjunction, which we didn't use in the
original data model because we were not sure how to handle it at that time.

With regard to the use of "rdf:type" arcs for every node: although all the
countries had "rdf:type" arcs, there were other nodes, such as
computations, that didn't have such a restriction. For example, you can
see that the following node doesn't include the "rdf:type" arc:

http://data.webfoundation.org/webindex/v2013/observation/computed_2009_1386752461095_53574

However, as I have already said, we thought that it was better to simplify
the paper by omitting the definitions of the statistical computations.

Jose Labra

>
> peter
>
>



Received on Saturday, 2 January 2016 07:41:51 UTC