Re: Shapes/ShEx or the worrying issue of yet another syntax and lack of validated vision. from Holger Knublauch on 2014-07-15 (public-rdf-shapes@w3.org from July 2014)

From: Holger Knublauch <holger@topquadrant.com>
Date: Wed, 16 Jul 2014 09:41:58 +1000
To: public-rdf-shapes@w3.org
Message-ID: <53C5BC46.5060309@topquadrant.com>
On 7/16/2014 2:29, Sandro Hawke wrote:
> I agree there should be an RDF representation for data shapes, but I 
> think it has to be somewhat more complicated than the example you 
> provide, so the need for a syntax like ShEx is somewhat greater than 
> you suggest.

Yes, this example would become a bit more complex, but I believe Jerven's 
point remains valid that introducing a new syntax also has problems.

One thing that is easy to forget is the implementation burden for tool 
vendors. No matter whether we actually believe that a given W3C standard 
makes sense or not, there will always be some customers who believe that 
they urgently need support for exactly that standard. Then it is up to us 
(e.g. my employer) to support all these syntaxes - otherwise competitors 
will point fingers at us for not supporting W3C standards etc.

And sometimes having multiple syntaxes even leads to an unnecessary 
balkanization of the available market - we have seen this with the 
OWL/XML syntax that was introduced with OWL 2. I am sure somebody 
thought this syntax was a good idea for XML tool interoperability. Yet 
the downside is that this is now the default output format of a popular 
open source ontology editor, and a popular commercial ontology editor 
doesn't support that syntax. Not only does this cause misunderstandings 
in the user base, it also means that artifacts shared on the semantic 
web are of limited use.

Finally - and this also applies to the OWL/XML example - any custom 
syntax will lose a key advantage of a triple-based notation such as 
Turtle - it is naturally extensible, and people can attach additional 
triples to constraint definitions, anticipating future needs.
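
To make the extensibility point concrete, here is a minimal Python 
sketch. It models a constraint definition as plain (subject, predicate, 
object) tuples; every ex: name is hypothetical and not taken from any 
proposal. It only illustrates that a consumer which knows some of the 
predicates can still read the definition after someone attaches an 
extra triple.

```python
# Illustrative only: triples are plain tuples, and every ex: name is made up.
constraint = {
    ("ex:c1", "rdf:type", "ex:MinCountConstraint"),
    ("ex:c1", "ex:property", "ex:name"),
    ("ex:c1", "ex:minCount", "1"),
}

# A later tool attaches an extra triple without any change to the syntax.
extended = constraint | {("ex:c1", "ex:severity", "ex:Warning")}

KNOWN_PREDICATES = {"rdf:type", "ex:property", "ex:minCount"}


def read_constraint(triples, subject):
    """Collect the predicates this consumer understands; ignore the rest."""
    return {p: o for s, p, o in triples if s == subject and p in KNOWN_PREDICATES}


# The older consumer sees the same definition before and after the extension.
same_view = read_constraint(constraint, "ex:c1") == read_constraint(extended, "ex:c1")
```

A custom grammar would typically need a new production (and new 
parsers) before the extra annotation could even be written down.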

> At some point, if this goes forward, there will be a Working Group 
> which can have opinions, be confused, etc, but for now the thing to be 
> trying to correct is the Charter (and maybe the people editing the 
> charter).

Well, it seems I am not the only one with the impression that the charter 
is built to cater for ShEx as the proposed solution. That would of course 
be a self-fulfilling prophecy. I am glad to hear that will not be the case.

> I personally happen to favor less expressive (and thus simpler and 
> more efficient) solutions in this space.  I'm not trying to convince 
> you, but please understand that as often seen in computer science, 
> expressivity is not an unmitigated good, but rather there is a 
> tradeoff.   There's a lot to be said for simple validation.

Yes, but one way to make sure that a certain level of simplicity remains 
in place is to define different "profiles". It would be very well 
possible to define a subset of SPIN "light" that only consists of 
spin:constraints that point to templates from a preselected library. Yet 
the full profile of the language would offer the flexibility that Jerven 
(and, previously, I) have requested. The cost of not starting with a 
clean extensible foundation (for ShEx) would be that the language 
is hard-coded against a certain collection of patterns only, and 
limited to those patterns. Other languages (such as OWL DL) have made 
similar choices for the users in the past, and I don't think they have 
been very successful with that. Ivory towers etc.
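
The profile idea can be sketched in a few lines of Python. This is only 
an illustration under stated assumptions: the template names are 
hypothetical placeholders, and each dict merely stands in for one value 
of spin:constraint on a class. The "light" profile accepts a set of 
constraints only if each one is an instance of a template from the 
preselected library, while the full profile would also admit arbitrary 
queries.

```python
# Illustrative only: the template names below are hypothetical placeholders.
LIGHT_TEMPLATE_LIBRARY = {"ex:MinCardinality", "ex:MaxCardinality", "ex:ValueType"}


def in_light_profile(constraints):
    """True if every constraint is an instance of a preselected template."""
    return all(c.get("template") in LIGHT_TEMPLATE_LIBRARY for c in constraints)


# Each dict stands in for one value of spin:constraint on a class.
template_only = [
    {"template": "ex:MinCardinality", "property": "ex:name", "count": 1},
    {"template": "ex:ValueType", "property": "ex:age", "type": "xsd:integer"},
]

# The full profile would also admit an arbitrary query as a constraint.
with_custom_query = template_only + [
    {"query": "ASK WHERE { ?this ex:start ?s ; ex:end ?e . FILTER (?e < ?s) }"}
]

light_ok = in_light_profile(template_only)
light_rejects_custom = not in_light_profile(with_custom_query)
```

Both variables end up True: the same data model serves both profiles, 
and the restriction is a check on top of it rather than a separate 
language.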

> So you're not okay with even a syntactic sugar language, what's called 
> a "compact syntax" in the charter?  Something like OWL's Functional 
> Syntax or Manchester Syntax?

If the more general notation (e.g. Turtle) is a reasonable compromise 
then I think no additional syntax should be invented *as an exchange 
format*. However, it would be perfectly reasonable to invent a 
human-readable syntax for presentation purposes and editing tools. For 
example, when you enter Manchester Syntax in an ontology editor, the 
resulting file is still Turtle, even though the user never gets to see 
the details of owl:intersectionOf etc.
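
As a rough Python sketch of that separation between presentation and 
exchange format: the helper below takes a very small fragment of 
Manchester Syntax (a flat "A and B" over named classes) and emits the 
Turtle that an editor might store. The function names and the ex: 
prefix are assumptions made for this example; a real editor handles the 
full grammar.

```python
def intersection_to_turtle(class_names, prefix="ex"):
    """Render named classes as an anonymous owl:intersectionOf class in Turtle."""
    members = " ".join(f"{prefix}:{name}" for name in class_names)
    return f"[ a owl:Class ; owl:intersectionOf ( {members} ) ]"


def manchester_and_to_turtle(expression):
    """Handle only a flat 'A and B and ...' expression over named classes."""
    names = [part.strip() for part in expression.split(" and ")]
    return intersection_to_turtle(names)


# What the user types in the editor versus what ends up in the file.
turtle = manchester_and_to_turtle("Person and Employee")
```

The user only ever sees "Person and Employee"; the stored result is 
ordinary triples that any RDF tool can read.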

> You say "the outcome of a single workshop" as if that were a small 
> thing.    The workshop was widely advertised for months as the time 
> and the place for people who cared about this subject to come forward 
> and be a part of figuring this out.    And many people did.

Not every small semantic web vendor can afford to send their key 
developers on a trip around the world to have them locked up in a W3C 
standardization process for two years. Large corporations and academics 
find that easier. Yet every small semantic web vendor is later 
confronted with having to implement the W3C standards. So please use 
that label carefully. If something gets standardized then it should 
really be the best language for the given problem, and not the language 
that happened to be the favorite brainchild of the people who happened 
to be able to spend 2 years on the process.

The way that I perceived this was similar to Jerven - that the workshop 
took place and then it was already called "Shapes" and work on ShEx 
started instantly. SPIN did not even make it into the requirements 
document even though I submitted prose for that more than a year ago. 
Instead the requirements document showed a very unattractive 
implementation of SPARQL that of course does not meet anyone's 
requirements. I am not getting the impression that all options have been 
sufficiently discussed, and I'd be happy to help with that in the future.

Holger
Received on Tuesday, 15 July 2014 23:43:19 UTC