RDF serialization + discussions

Hi everyone,

I am new to this discussion group. I am a bioinformatics PhD at Wageningen University, where I am integrating different RDF databases and bringing semantic web technology to biologists. While doing this I needed a way to describe the structure of an RDF resource, so that my users know how to query and parse the data.

I found ShEx, which fits my needs perfectly. However, the RDF serialization format was not completely finished and did not allow me to capture the class semantics,
so I started to help define it and put the result on the ShEx wiki.
During this work I encountered the following conflict:
* ShEx is a shape expression language: it should define shape expressions and should not hold any semantic meaning or be restricted by OO-related 'rules'
* I need to define classes and the related semantics

That is why I came up with a design that has two levels.
Level 1 is the original ShEx and defines the shapes; level 2 defines the extra semantics.

For level 1 (the original ShEx) I have started to define its RDF serialization format, describing it both in ShEx itself and in ShEx serialized to RDF (using the newly defined serialization format :P)
(https://www.w3.org/2001/sw/wiki/ShEx/RDF_serialization#RDF_serialization).

For this I added a general description with a lot of discussion points; I hope I have captured most of them.
While creating this I also came up with some possible extensions and discussion points for the ShEx
language itself (https://www.w3.org/2001/sw/wiki/Discussion_SHEX_format).

Level 2 can be used to define your class and property semantics, for which the associated level 1 definitions are added automatically. Level 2 is currently only defined in the RDF serialization format.
(https://www.w3.org/2001/sw/wiki/Level_2)

I put all the definitions, discussions and ideas on the wiki (my main subpage is https://www.w3.org/2001/sw/wiki/ShEx/RDF_serialization#RDF_serialization). The overview image containing the class diagram is attached to
this mail, as I was not able to upload it to the wiki (I have no permission to do so).

The definition might contain some small imperfections, but we can fix these as the definition matures.

I would like to have your opinion on:
1. The idea of splitting it into level 1 and level 2.
2. The initial setup and structure of the serialization format (level 1).
Other elements I would like to discuss after this:
3. The possible extensions and discussion about the ShEx language itself (https://www.w3.org/2001/sw/wiki/Discussion_SHEX_format)
4. The basis of the level 2 definition

Today (7-feb-2014) I already had a discussion with Eric about these topics.

We discussed the fact that the <child> & <parent> construct should not have an (implicit) subClassOf meaning.
There are 2 reasons for this:
1. "shape names don't necessarily correspond to class names" (Eric)
2. If we add some semantics to level 1/original ShEx, we would also have to add/reuse all the other semantics from RDFS, OWL and SKOS

Related to this, we discussed a strategy to prevent the
<child> & <parent> construct from having a subClassOf meaning.
This can be achieved by defining
'<child> & <parent> { ... }  # <child> inherits <parent>, i.e. includes <parent>'s rules and refs to @<parent> can be fulfilled by something matching <child>'
as
'<child> { & <parent> , ... } # <child> includes <parent>'s rules but ref's to @<parent> cannot be fulfilled by something matching <child>'
This characterizes <child> in <child> & <parent> {  } as being a "sub shape of" <parent> (though that may frequently align with subClassOf relationships).

To get the same behaviour as the first definition ('<child> & <parent> { ... }  # <child> inherits <parent>, i.e. includes <parent>'s rules and refs to @<parent> can be fulfilled by something matching <child>'),
an explicit rule has to be added:
'<parent> { ... ( & <child1> | &<child2> )? }' (the parent includes an optional OrRuleGroup that includes the shapes/rule groups child1 or child2).
Note that removing the ? (optional) has the same effect as marking the <parent> shape as VIRTUAL.
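
To make the difference between the two '&' forms concrete, here is a rough Python sketch. It is only an illustration, not ShEx itself and not taken from any ShEx implementation: the names Shape, all_rules, matches, fulfils_ref and build are hypothetical, a node is reduced to the set of predicates it carries (values are ignored), and matching is assumed to be closed (a node matches when it has exactly the predicates of the shape). The shapes are the ones from Eric's example in the discussion quoted further down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    name: str
    rules: frozenset        # predicates the shape itself requires
    includes: tuple = ()    # '<child> { & <parent> , ... }': copies the rules only
    inherits: tuple = ()    # '<child> & <parent> { ... }': copies rules + substitutability
    virtual: bool = False   # VIRTUAL: the shape cannot be matched directly


def all_rules(name, shapes):
    """Own rules plus everything pulled in through '&', in either form."""
    shape = shapes[name]
    rules = set(shape.rules)
    for parent in shape.includes + shape.inherits:
        rules |= all_rules(parent, shapes)
    return rules


def matches(node, name, shapes):
    """Closed match (an assumption): the node has exactly the shape's predicates."""
    return not shapes[name].virtual and set(node) == all_rules(name, shapes)


def fulfils_ref(node, target, shapes):
    """Can `node` stand where @<target> is referenced?"""
    if matches(node, target, shapes):
        return True
    # Only the inheriting form lets a child stand in for its parent.
    return any(target in s.inherits and fulfils_ref(node, s.name, shapes)
               for s in shapes.values())


def build(form, virtual=False):
    """Example shapes; `form` is 'inherits' or 'includes' for both children."""
    link = {form: ("parent",)}
    shapes = [
        Shape("parent", frozenset({":p2"}), virtual=virtual),
        Shape("child1", frozenset({":p3"}), **link),
        Shape("child2", frozenset({":p4"}), **link),
    ]
    return {s.name: s for s in shapes}


my_c = {":p2", ":p3"}   # <myC> :p2 "hi" ; :p3 "there" .

inheriting = build("inherits")
including = build("includes")
print(fulfils_ref(my_c, "parent", inheriting))  # True: child1 may stand in for parent
print(fulfils_ref(my_c, "parent", including))   # False: rules copied, no substitutability
```

Under these assumptions, calling build("inherits", virtual=True) makes a reference to @<parent> behave like the disjunction of references to @<child1> and @<child2>, which is the equivalence Eric suggests in the discussion.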

Further, I raised the concern that in the ShEx RDF serialization each rule is referable by its ID, whereas in the ShEx language it is not. This would make the RDF serialization more expressive than the ShEx language.
Eric then told me that there was already a discussion going on between Jose and him about that.

The original discussion follows:
===============================================================
[11:03:56 AM] * Eric Prud'hommeaux invited jesse.van.dam
[11:04:02 AM] Eric Prud'hommeaux: welcome jesse
[11:04:12 AM] jesse.van.dam: ok, hello
[11:04:36 AM] Eric Prud'hommeaux: jesse was just saying he'd like one profile of ShEx to not include inheritance
[11:06:19 AM] Eric Prud'hommeaux: current status:
<child> & <parent> { ... }  # <child> inherits <parent>, i.e. includes <parent>'s rules and refs to @<parent> can be fulfilled by something matching <child>
<child> { & <parent> , ... } # <child> includes <parent>'s rules but ref's to @<parent> cannot be fulfilled by something matching <child>
[11:10:31 AM] Eric Prud'hommeaux: my unproven thesis is that inheritance is equiv to disjunction, so e.g.
  <A> { :p1 @<parent> } VIRTUAL <parent> { :p2 . } <child1> & <parent> { :p3 . } <child2> & <parent> { :p4 . }
is the same as
  <A> { :p1 @<child1> | :p1 @<child2> } <child1> { :p2 . , :p3 . } <child2> { :p2 . , :p4 . }
[11:13:45 AM] Eric Prud'hommeaux: Note that the VIRTUAL keeps { <myA> :p1 <myC> . <myC> :p2 "hi" } from satisfying <A>.
[11:14:49 AM] Eric Prud'hommeaux: It also means that <myC> in { <myA> :p1 <myC> . <myC> :p2 "hi" . <myC> :p3 "there" } will only match <child1> instead of both <parent> and <child1>.
[11:15:09 AM] jesse.van.dam: yes that is true
[11:15:50 AM] Eric Prud'hommeaux: so the question i think is not about expressivity but syntactic complexity of the parser/compiler
[11:18:03 AM] jesse.van.dam: to make the parent match also an explicit rule should be added like:
<parent> { ... ( & <child1> | &<child2> )? }
[11:18:27 AM] jesse.van.dam: parent includes a OrRuleGroup that includes rule groups child1 or child2
[11:18:31 AM] jesse.van.dam: and is optional
[11:18:51 AM] jesse.van.dam: When defined virtual the optional can be left out
[11:18:52 AM] Eric Prud'hommeaux: yeah, that'd work
[11:19:48 AM] jesse.van.dam: using this strategy, you will exclude the suggestion that a subClassOf meaning is given to this <child> & <parent> construct
[11:20:30 AM] jesse.van.dam: because if you would add subClassOf you would also have to add (some of) the other elements of RDFS, OWL, SKOS
[11:23:48 AM] Eric Prud'hommeaux: well, i think that shape names don't necessarily correspond to class names in e.g.
<bug> { :submitter @<user> , :addressedBy @<employee> }
<user> { rdf:type (foaf:Person) , foaf:name STRING? , foaf:mbox IRI }
<employee> { rdf:type (foaf:Person) , foaf:name STRING, work:phoneNum STRING , work:office STRING }
[11:24:27 AM] jesse.van.dam: yah that exactly correct
[11:25:16 AM] Eric Prud'hommeaux: so i'd characterize <child> as being a "sub shape of" <parent> in <child> & <parent> {  }
[11:25:29 AM] jesse.van.dam: yes
[11:25:31 AM] Eric Prud'hommeaux: (though that may frequently align with subclassof relationships)
[11:25:40 AM] jesse.van.dam: yah that is true
[11:26:16 AM] jesse.van.dam: and then in level2 you can define your class structure, which will automatically map to level1 shapes
[11:27:19 AM] Eric Prud'hommeaux: i guess you were concerned that <child> & <parent> {  } might imply a subClassOf instead of just a syntactic inclusion and the <parent>→<child> polymorphism
[11:27:30 AM] jesse.van.dam: yes
[11:27:50 AM] jesse.van.dam: and that we would have to add all the other elements of RDFS,OWL,SKOS
[11:28:59 AM] Eric Prud'hommeaux: right, we don't want to imply that, but i guess i'm (perhaps naively) not as worried that it will imply that subClassOf
[11:29:50 AM] jesse.van.dam: second concern comes from the fact that if I start to serialize it to RDF, each rule is referenceable by its ID, whereas in the SHEX language it is not. This would make the RDF serialization version more expressive than the SHEX language
[11:30:14 AM] Eric Prud'hommeaux: there is an optional id on each shex rule
[11:30:50 AM] Eric Prud'hommeaux: (jose and i were actually discussing this a couple days ago)
[11:34:10 AM] jesse.van.dam: That would be nice, because that would solve the issues encountered when defining the SHEX rdf serialization format
[11:34:56 AM] Eric Prud'hommeaux: the grammar happens to be a superlanguage of Turtle allowing you to capture e.g. OSLC's use cases like:
PREFIX oslc: <http://oslc.example>
<S> { $<rule1> <p1> .
      [ oslc:length 30 ] # some properties about <rule1>
} # i think there's a way to add more, e.g. referencing <rule1>, but i'm poking in the grammar now to recall
[11:34:57 AM] jesse.van.dam: You can check the rdf serialization format, defined both in SHEX itself as well as in RDF SHEX
[11:37:49 AM] jesse.van.dam: what is this superlanguage of turtle?
[11:39:28 AM] Eric Prud'hommeaux: the idea is that you can still write what you want in turtle, but you can also use the ShEx DSL to make it more human friendly
[11:39:58 AM] Eric Prud'hommeaux: the grammar was an attempt at mixing the two so you could e.g. add any extra stuff you wanted to the graph
[11:42:30 AM] Eric Prud'hommeaux: i have to run out the door .
===============================================================

Greetz,
Jesse van Dam

Received on Monday, 10 February 2014 08:23:47 UTC