Re: Two Standards ?

Folks,

Apologies for not catching the call for feedback below. I very much like Holger's suggestion, "Another option would be to define a compiler from ShExC into LDOM RDF and back", as it would get us closer to our (Mayo/CIMI's) primary goal: a formal definition of the semantics of RDF data shapes. If we can compile back and forth, we are (hopefully) demonstrating semantic equivalence. The Mayo/CIMI goal is to arrive at:

  1.  A consistent set of semantics for the specification of Shape Expressions
  2.  At least one grammar/syntax that can formally represent these semantics

We (again, Mayo/CIMI) would hope that the grammar meets some of our own goals in terms of succinctness, understandability and the like, but we will be able to live with whatever comes out as long as it fulfills our semantic / functional requirements.

It is quite likely that we will end up using other representational forms in some of our projects in any case (one of the representations that we have waiting in the wings is UML). While it might be helpful to have community buy-in on those other forms, it isn't essential as long as we can demonstrate that there is an isomorphism between our representation and the (or "a") standard representational form. I see uses for both ShExC and LDOM RDF and, as long as we can agree that they are different representations of the same thing (or at least share a common core), then we will be quite happy.

Arguing about whether ShExC, LDOM RDF or some other representation is the right way to go is, in my mind, kind of like arguing about the syntax of Turtle vs. RDF/XML without first agreeing on the underlying model of RDF itself. The representations are essential, in the sense that it is danged hard to talk about a model without having a succinct grammar to do so, but we need to use a first approximation of some grammar to discuss the model and, only then, create the final specification(s) for the various representational forms.

I would propose that we declare at the outset that we want both ShExC and LDOM RDF to be able to represent the same core semantics (I say "core" because I wouldn't object to either or both of them having additional but optional features that go beyond the core specification). Let's use whatever formalism makes the most sense in a given context to explain what a given constraint should do and, once we've arrived at some sort of consensus, record the decision using formal logic. A final step would be to adjust the designs of one or both languages so that we know exactly what an expression means and how the two align.
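
To make the round-trip idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `Shape` and `PropertyConstraint` classes stand in for the shared "core semantics", the one-line compact syntax is a toy and is not ShExC, and the `ex:` triple vocabulary is hypothetical and is not LDOM. It also assumes at most one constraint per predicate per shape. The only point is that if both syntaxes are defined as mappings to and from one common model, compiling from one to the other and back should return what you started with.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class PropertyConstraint:
    """One property constraint in the hypothetical shared core model."""
    predicate: str
    datatype: str
    min_count: int
    max_count: int


@dataclass(frozen=True)
class Shape:
    """A named set of property constraints (the abstract model)."""
    name: str
    constraints: FrozenSet[PropertyConstraint]


def to_compact(shape: Shape) -> str:
    """Serialize to a toy compact syntax (NOT real ShExC)."""
    ordered = sorted(shape.constraints, key=lambda c: c.predicate)
    parts = [f"{c.predicate} {c.datatype} [{c.min_count},{c.max_count}]"
             for c in ordered]
    return shape.name + " { " + " ; ".join(parts) + " }"


def from_compact(text: str) -> Shape:
    """Parse the toy compact syntax back into the abstract model."""
    name, body = text.split("{", 1)
    body = body.rsplit("}", 1)[0]
    constraints = set()
    for part in body.split(";"):
        part = part.strip()
        if not part:
            continue
        predicate, datatype, card = part.split()
        lo, hi = card.strip("[]").split(",")
        constraints.add(PropertyConstraint(predicate, datatype, int(lo), int(hi)))
    return Shape(name.strip(), frozenset(constraints))


def to_triples(shape: Shape) -> FrozenSet[Triple]:
    """Serialize to triples using a made-up ex: vocabulary (NOT real LDOM)."""
    triples = {(shape.name, "rdf:type", "ex:Shape")}
    for c in shape.constraints:
        node = f"_:{shape.name}-{c.predicate}"  # one node per predicate
        triples.add((shape.name, "ex:property", node))
        triples.add((node, "ex:predicate", c.predicate))
        triples.add((node, "ex:datatype", c.datatype))
        triples.add((node, "ex:minCount", str(c.min_count)))
        triples.add((node, "ex:maxCount", str(c.max_count)))
    return frozenset(triples)


def from_triples(triples: FrozenSet[Triple]) -> Shape:
    """Rebuild the abstract model from the made-up triple form."""
    name = next(s for s, p, o in triples if p == "rdf:type" and o == "ex:Shape")
    constraints = set()
    for s, p, node in triples:
        if s == name and p == "ex:property":
            props = {p2: o2 for s2, p2, o2 in triples if s2 == node}
            constraints.add(PropertyConstraint(
                props["ex:predicate"], props["ex:datatype"],
                int(props["ex:minCount"]), int(props["ex:maxCount"])))
    return Shape(name, frozenset(constraints))


def compile_compact_to_triples(text: str) -> FrozenSet[Triple]:
    return to_triples(from_compact(text))


def compile_triples_to_compact(triples: FrozenSet[Triple]) -> str:
    return to_compact(from_triples(triples))


if __name__ == "__main__":
    text = "ex:PersonShape { foaf:age xsd:integer [0,1] ; foaf:name xsd:string [1,1] }"
    # Round trip: compact -> triples -> compact should be the identity.
    assert compile_triples_to_compact(compile_compact_to_triples(text)) == text
```

A successful round trip on a handful of examples is of course only evidence, not proof, of equivalence. That is why I would still want the agreed semantics recorded in formal logic, with optional features outside the core explicitly excluded from the round-trip requirement.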

Harold Solbrig
Mayo Clinic


From: Holger Knublauch <holger@topquadrant.com>
Date: Friday, February 13, 2015 at 3:30 PM
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Subject: Re: Two Standards ?
Resent-From: <public-data-shapes-wg@w3.org>
Resent-Date: Friday, February 13, 2015 at 3:30 PM

The upcoming F2F meeting is supposed to deliver the general direction, select editors and deliverables [1]. I don't think my proposal here is premature at all. In fact it touches on the very fundamental questions that Peter suggested we discuss too.

Holger

[1] https://www.w3.org/2014/data-shapes/wiki/F2F2#Objectives


On 2/14/15 7:03 AM, Michel Dumontier wrote:
I think all this discussion is premature and counter to the intended focus of this WG. Stay focused on delivering the promised outcomes.

m.

Michel Dumontier, PhD
Associate Professor of Medicine (Biomedical Informatics)
Stanford University
http://dumontierlab.com

On Fri, Feb 13, 2015 at 12:06 PM, Holger Knublauch <holger@topquadrant.com> wrote:
My concern is not about personal preferences, but about language(s) that end users will actually want to use. We already struggle to understand shapes versus classes within the WG. The separation that I propose would allow us to write two different primers, each internally consistent and easy to understand and use.

If the charter does not give us the possibility to define two standards, then this becomes a matter of packaging. One approach is to introduce a small Abstract Syntax for the commonality between LDOM and ShExC. This may include something like the Shape Selectors, expressed not in RDF but in "abstract" form. Another option would be to define a compiler from ShExC into LDOM RDF and back (I had proposed that before [1] without getting feedback). Both concrete syntaxes could still have a similar name, if that helps with the standardization process.

I also assume that WGs are still allowed to slightly diverge from the original Charter if they justify their reasons for doing so - at least that is what I was told when we wrote the original charter. I believe the discussions over the last half year (and potentially another half a year well into 2015) provide some of those reasons. Also, producing a Compact Syntax has been mentioned in the charter.

Holger

[1] https://lists.w3.org/Archives/Public/public-data-shapes-wg/2015Jan/0223.html




On 2/14/15 5:07 AM, Arnaud Le Hors wrote:
I don't think there is evidence yet that a common solution can't be found. Yesterday's strawpoll tells me there is hope we can find some common ground to build on to produce a standard that we can all live with. This may not be anyone's personal preference but standards are typically not.

It may be that eventually some will seek to define other standards but this won't happen here. Our charter doesn't give us that possibility.
--
Arnaud  Le Hors - Senior Technical Staff Member, Open Web Technologies - IBM Software Group




From:        Dean Allemang <dallemang@workingontologist.com>
To:        Holger Knublauch <holger@topquadrant.com>
Cc:        RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Date:        02/12/2015 08:08 PM
Subject:        Re: Two Standards ?
Sent by:        deanallemang@gmail.com
________________________________



I have been talking about Shapes with my FIBO colleagues - we continue to face the expressivity issues around OWL (role intersections and friendly fire seem to come up a lot for us). We are moving into things like SPIN/SWRL, and/or FIBO-RIF (a proposal that I worked on last July that moves everything into a subset of RIF) to solve our expressivity issues. We are currently going to do all of this in Informative Annexes (as opposed to normative recommendations), because we don't (yet) have a good standard in which to write these things.

An expressive shapes language, based on SPARQL, would satisfy our group's needs quite well.

I wonder a bit about the relationship between the two languages that Holger proposes - is it important that we be able to define how a ShEx shape corresponds to an LDOM definition?  Or are they being used in completely different places?  I guess if we take the XSD/RelaxNG example, there needn't be a deterministic relationship between them.

Looking back, it seems to me that it would have been a good thing if RELAX-NG had been done through the auspices of the W3C instead of OASIS. As it stands now, it seems as if one has to choose one's standards organization to support one's technology. If we simply recognize that there could be two different perspectives and develop both standards, we could actually provide coherent (non-competitive) advice about when each one should be used. If we don't, and the other perspective has an audience, we'll end up seeing it pursued in some other organization. Ugh.


Prima facie, it would seem like we are doubling our work, but I don't think that's the case. As Holger said, each group has done enough work now to write up a coherent spec.  It would actually be *more* work to try to reconcile them into a single Recommendation.


This situation seems to me to be a bit different from the profiles of OWL, where we use the same words with different constraints on their usage.  Here, we are solving parallel problems with different mechanisms.  Making two standards, that are well-informed by one another, seems like a good idea to me.



Dean






On Thu, Feb 12, 2015 at 7:25 PM, Holger Knublauch <holger@topquadrant.com> wrote:
A random thought before the weekend:

Can this WG (please!) produce two separate standards?

1) An RDF vocabulary similar to the original LDOM proposal
2) The ShEx Compact Syntax aiming at the data reuse scenarios

We already have RDF Schema. We already have OWL. We would already have a third language (LDOM or whatever). Why not have a fourth language?

The situation is very similar to XML Schema vs. DTD vs. RELAX-NG. They all solve similar problems, but from different perspectives.

We are currently trying to mix different paradigms together and risk producing something that nobody will be happy with. People with an OO background will wonder what the fuss is about this parallel structure called "Shapes", raising the implementation costs and creating a mix of parallel semantic webs. And ShEx people don't want to worry about the interactions of the various triple models at all - instead they would have the ShExC files live outside of the triple store. And that makes sense because even if you introduce ldom:instanceShape to separate shapes from classes, you'd still run into conflicts with other ShEx models that also happen to use ldom:instanceShape. The only proper solution here is to not have triples in the first place.

Another constant source of conflict will be the role of SPARQL. The ShEx camp seems to be more concerned about the balance of expressivity and complexity, while the SPIN camp has plenty of use cases where expressivity is the main concern. Furthermore, a SPIN-like LDOM can more easily be combined with existing RDFS and OWL ontologies, filling gaps in that space.

We have a handful of ShEx supporters in the WG. I am sure they could turn their Member Submission into a formal spec quite rapidly. From an LDOM point of view we have plenty of stuff already implemented, and I'd be happy to wrestle and collaborate with anyone to flesh out the open details. The Requirements document is already being split into "Property constraints" and "Complex constraints", so both camps can harvest from the same catalog of requirements. We can also share test cases and produce a small document explaining how to map from one language to another. But the aforementioned reasons and the endless discussions over the last half a year provide plenty of arguments that justify why the WG chose to create two languages.

Why would this separation of deliverables not work?

Thanks,
Holger

Received on Friday, 13 February 2015 22:07:13 UTC