
Re: Language or technology

From: Holger Knublauch <holger@topquadrant.com>
Date: Wed, 28 Jan 2015 17:48:25 +1000
Message-ID: <54C89449.6010902@topquadrant.com>
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
I really hope we can make progress on this together. Just in time, I have
started to work on an outline for a core specification (here, for LDOM).
Maybe this helps to illustrate my proposal better. Please take a look at 
the section on ldom:hasValue at

https://w3c.github.io/data-shapes/data-shapes-core/ldom-core.html#property-hasValue

The definition of this term includes

1) A semi-formal textual description of what it does
2) A small example
3) A SPARQL query, with the indication that any engine that supports the 
term ldom:hasValue must implement an algorithm that is equivalent to the 
provided SPARQL query (here: FILTER NOT EXISTS { ?this ?predicate 
?hasValue })
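
As a rough illustration of what "equivalent to the provided SPARQL query"
could mean for an engine that is not written in SPARQL, here is a minimal
Python sketch. The graph is modelled as a plain set of (subject, predicate,
object) tuples; the function name violates_has_value and the ex: data are
made up for illustration and are not taken from the LDOM draft. It reports a
violation exactly when the triple ?this ?predicate ?hasValue is absent, which
is the case the FILTER NOT EXISTS pattern selects.

```python
from typing import Set, Tuple

# A triple is (subject, predicate, object), all written as plain strings here.
Triple = Tuple[str, str, str]


def violates_has_value(graph: Set[Triple], this: str,
                       predicate: str, has_value: str) -> bool:
    """Return True when the focus node lacks the required value.

    Mirrors FILTER NOT EXISTS { ?this ?predicate ?hasValue }: the
    constraint is violated when no matching triple is in the graph.
    """
    return (this, predicate, has_value) not in graph


# Hypothetical example data, not taken from the LDOM draft.
graph = {
    ("ex:Alice", "ex:status", "ex:Active"),
    ("ex:Bob", "ex:status", "ex:Retired"),
}

alice_violates = violates_has_value(graph, "ex:Alice", "ex:status", "ex:Active")
bob_violates = violates_has_value(graph, "ex:Bob", "ex:status", "ex:Active")
```

Here ex:Alice satisfies the constraint and ex:Bob does not, because the
triple (ex:Bob, ex:status, ex:Active) is missing from the graph.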

Assuming the rest of the document has set the stage on what things like 
?this and ?predicate mean, this looks unambiguous and sufficiently 
formal to me. Any implementer, even if the target language is not 
SPARQL, could use that definition. But in contrast to other 
specification languages such as Z, this design means that:

- we use the same mechanism that LDOM users also use to define new templates
- we use a specification language that already has a 100% mapping to RDF 
datasets
- we use a language that is already implemented (and optimized) by RDF 
databases

The formal specification document produced by this WG is not for 
end-users, but instead just for a tiny group of implementers that need 
to know all the details. Average users will learn the new language from 
books, papers and examples for copy-and-paste. Many users will just use 
simple things like ldom:property declarations, and they don't need to
know SPARQL at all. However, advanced users at least have the option to 
express whatever shape they like.

To implement ShExC, you just need to pick an LDOM profile and create a
parser to and from that subset of the overall language. As long as your
own engine implements the behavior specified by the SPARQL query, your 
engine will pass all the same test cases that other LDOM implementations 
also need to pass. And these test cases will be another (large) 
deliverable from the WG.
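
To make the test-case argument concrete, here is a small Python sketch of two
independent engines being run against one shared suite. Everything in it is
an assumption for illustration: the test-case tuple layout, the two engine
functions and the ex: names are hypothetical, and the real WG test suite may
look quite different. Both engines implement the behavior of FILTER NOT
EXISTS { ?this ?predicate ?hasValue } in different ways, and both are judged
only by the expected results.

```python
from collections import defaultdict

# Each hypothetical test case:
# (triples, focus node, predicate, required value, expected violation)
TEST_CASES = [
    ([("ex:a", "ex:p", "ex:v")], "ex:a", "ex:p", "ex:v", False),
    ([("ex:a", "ex:p", "ex:w")], "ex:a", "ex:p", "ex:v", True),
    ([], "ex:a", "ex:p", "ex:v", True),
]


def engine_scan(triples, this, predicate, value):
    """Engine A: linear scan over the triples."""
    return not any(t == (this, predicate, value) for t in triples)


def engine_indexed(triples, this, predicate, value):
    """Engine B: builds a (subject, predicate) -> objects index first."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[(s, p)].add(o)
    return value not in index[(this, predicate)]


def passes_suite(engine):
    """True when the engine returns the expected result for every case."""
    return all(engine(triples, this, pred, val) == expected
               for triples, this, pred, val, expected in TEST_CASES)


results = {name: passes_suite(fn)
           for name, fn in [("scan", engine_scan), ("indexed", engine_indexed)]}
```

Neither engine runs SPARQL, yet both are checked against the same cases; an
engine that deviates from the specified behavior fails the suite.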

Please clarify where you see problems with the specification approach 
outlined above.

And to be clear: I don't care what the final language is called - we have
22 naming proposals on our list right now, and I cannot imagine that it 
will become LDOM ;)

A point where we seem to disagree is that I very much believe that we 
should expose the full expressivity of SPARQL, and not just a sub-set. 
Why throw away so much useful functionality that is already implemented 
by dozens of industrial-strength products? OWL DL already made the
mistake of picking a (random) subset of logic because it seemingly had nice
computational characteristics. Unfortunately it turned out that this 
subset is not what most people need in practice, and the theoretical 
benefits did not produce the fast performance that the researchers had 
hoped for. Let's please not fall into this trap again!

Thanks,
Holger


On 1/28/2015 16:57, Jose Emilio Labra Gayo wrote:
>
>>     I have no problem if you call the language LDOM...but whatever
>>     you call it, I think it needs to have a well defined semantics
>>     which could be understood without leaving everything to a full
>>     stack technology that could be much more problematic.
>
>     Fully agreed.
>
>
> Great. I think the trick is to call LDOM the language that I was 
> proposing...so let's assume that LDOM is that language.
>
>     And the stack of LDOM is very minimal.
>
>     Just a syntax on top of SPARQL with a simple execution engine. We
>     still need to write up that little engine, maybe you can help?
>
>
> I will try.
>
> In order to do that, we need to have a clear definition of the basis 
> of LDOM which would be independent of SPARQL.
>
> I propose to separate the constructs that belong to the core of the new
> language from the extension mechanism that allows any SPARQL query to be
> written in a constraint.
>
> The idea is to encapsulate those calls to SPARQL in a safe place so
> the other language constructs can be understood even by people who
> don't know SPARQL. In this way, the language will be much simpler
> and more usable.
>
> From my point of view the users of this language should not be 
> semantic web experts with a deep knowledge of SPARQL. We should target 
> an audience that has some basic knowledge of RDF and wants to do 
> useful things with it. Those people should not be exposed to the full
> SPARQL expressiveness when they want to define constraints.
>
> There can be some extension mechanisms to embed external definitions 
> for more advanced users, but I think those mechanisms should be 
> encapsulated and clearly separated from the core language.
>
>>     Maybe, we would not need all the SPARQL functionality but a
>>     subset of it. For example, string comparisons and arithmetic
>>     expressions could be handled by the expressions that appear in
>>     the FILTER expressions of SPARQL, which in fact refer to a subset
>>     of XQuery. But I suppose that this could be part of another thread.
>
>     When you use XQuery why can't you use SPARQL directly? How is that
>     different?
>
>
> I didn't say to use full XQuery, I said that we could just use the 
> same subset of XQuery/XPath that is used by the FILTER expressions in 
> SPARQL. The difference is that it is used in a much more controlled 
> place, which is the evaluation of FILTER expressions, so the core of
> the language can remain minimal, while we can reuse that part
> for the string operations and so on...
>
>>
>>     Almost all of the user stories need to know what a shape means
>>     and how one can differentiate one shape from another. I went
>>     quickly to the wiki and the first story that I found was:
>>
>>     http://www.w3.org/2014/data-shapes/wiki/User_Stories#S12:_App_Interoperability
>>
>>     How could you guarantee app interoperability if you don't have a
>>     well defined semantics for the shapes?
>
>     I have no idea what you mean, sorry. Story 12 is addressed via
>     structural declarations such as ldom:property that are sent back
>     from the server in any RDF format to inform other applications
>     which properties (and classes?) a service takes.
>
>
> What I mean is that in most of the user stories you need to have
> well-defined semantics for the language constructs, so that someone (a
> person or a machine) can say whether two shapes are the same or not.
>
> If we include the full expressiveness of SPARQL inside the language,
> then that will become infeasible, while if you identify the core parts
> of the language, it could be done.
>
>
>     Overall I am not sure what problems you have with LDOM:
>     - it will have a formal specification
>     - it is fully transparent (self-defined)
>     - it is directly executable using SPARQL
>     - it is extensible
>     - it has an RDF syntax
>     - other syntaxes such as ShEx can be mapped into it
>
>
>     I don't understand why you think we need yet another layer in
>     between LDOM and the data layer. This is SPARQL to me.
>
>
> If you keep LDOM with those properties then it is mostly the language 
> that I was asking to have.
>
> I think you misunderstood my point. I was not asking for another 
> layer, I was proposing that the solution should be a language with a 
> core that has well-defined semantics instead of being a full-stack
> technology with all the expressiveness of SPARQL embedded in it.
>
> If you agree with that and you prefer to call that LDOM, no problem 
> for me. The topic of this thread is not about the name of the 
> language, it is about the focus of the development and the possibility 
> to find agreements on some things that are being discussed.
>
> Best regards, Jose Labra
>
>
>
>     Holger
>
>
>
>
> -- 
> Saludos, Labra
Received on Wednesday, 28 January 2015 07:49:01 UTC