RE: Back from the W3C Workshop on Web Standardization for Graph Data from William Van Woensel on 2019-03-28 (public-n3-dev@w3.org from March 2019)

From: William Van Woensel <william.vanwoensel@gmail.com>
Date: Thu, 28 Mar 2019 12:24:01 -0300
To: "'Doerthe Arndt'" <doerthe.arndt@ugent.be>, "'Gregg Kellogg'" <gregg@greggkellogg.net>
Cc: <public-n3-dev@w3.org>
Message-ID: <017d01d4e57a$466fcd50$d34f67f0$@gmail.com>
Hi everyone,

Additionally, we also need to fix the meaning of this implicit quantification. In the case of universals, I personally think it is more intuitive to quantify at the top level, but I think some kind of testing of what most users' intuition is could help us here.

                IMO this would also be in line with how most people view rules: with all variables universally quantified at the top level, X => Y means that if X holds, then Y is inferred.
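As a rough illustration of that reading (a sketch only, in Python rather than N3; the names `match`, `solutions` and `apply_rule`, the `?`-prefix convention for variables, and the example data are all assumptions made for this illustration): every variable is treated as universally quantified over the whole rule, so each binding that satisfies the body X yields one instance of the head Y.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    # Assumed convention: terms starting with "?" are rule variables.
    return term.startswith("?")


def match(pattern: Triple, fact: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `fact`, or return None."""
    extended = dict(binding)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # A variable must keep the same value everywhere in the rule.
            if extended.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return extended


def solutions(body: List[Triple], facts: Set[Triple], binding: Binding) -> Iterator[Binding]:
    """Yield every binding that satisfies all patterns in `body`."""
    if not body:
        yield binding
        return
    for fact in facts:
        extended = match(body[0], fact, binding)
        if extended is not None:
            yield from solutions(body[1:], facts, extended)


def apply_rule(body: List[Triple], head: List[Triple], facts: Set[Triple]) -> Set[Triple]:
    """Top-level universal reading: for all bindings, body holds => infer head."""
    inferred: Set[Triple] = set()
    for binding in solutions(body, facts, {}):
        for s, p, o in head:
            inferred.add((binding.get(s, s), binding.get(p, p), binding.get(o, o)))
    return inferred


facts = {(":alice", ":parentOf", ":bob"), (":bob", ":parentOf", ":carol")}
body = [("?x", ":parentOf", "?y"), ("?y", ":parentOf", "?z")]
head = [("?x", ":grandparentOf", "?z")]
inferred = apply_rule(body, head, facts)
```

Here `?y` is shared between the two body patterns, so it has to denote the same node in both; that is the behaviour one would expect when the quantifier sits at the top level of the rule rather than inside each formula.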

                There’s still some open discussion on how to deal with implicit quantification (e.g., bnode clashes). I updated the summary <https://github.com/w3c/N3/issues/5#issuecomment-465149132> with where I think we currently stand (see EDIT flags).

- applications of rules are for example the alignment of different data sources using different ontologies, compliance checking (GDPR, fraud detection,...) or data validation.

How do we go further here? We already agreed on collecting concrete examples on git and I still haven't added that many (but I will). But should we additionally keep a list or something with links to concrete rule use cases (not necessarily N3)?

RuleML has a number of use cases listed on their website <http://wiki.ruleml.org/index.php/Introducing_RuleML#RuleML_Uses>. I took the liberty of uploading the concrete files I could find (most of them in POSL format) to GitHub <https://github.com/w3c/N3/tree/master/files/ruleml%20examples>. I haven’t had a close look at them yet, but it looks like quite a few of them require negation as failure (NAF).

Does anyone in this group use rules (not necessarily N3) to map between vocabularies? Thinking about it: I think I do it all the time but didn't even realise it until it was mentioned in the workshop.

                It would be great if you could upload any examples to GitHub.

Additional advantages of not completely separating rules from rdf triples I see are:

- it is possible to have a proof vocabulary (see http://www.w3.org/2000/10/swap/) in the language itself as we have in N3 

- it is possible to use the same mechanism to search for rules on a dataset as for  triples

- it is possible to generate new rules from existing triples or rules

                Awesome!

I really can't believe that SPARQL is easier to learn than N3, and I also don't think that N3 is difficult to learn if you know Turtle. Maybe also something to be tested?

                Yes, I agree: I don’t see the added difficulty (in fact, having e.g. a built-in list construct facilitates working with RDF, IMO!)

 

William

 

From: Doerthe Arndt <doerthe.arndt@ugent.be> 
Sent: March-27-19 5:01 PM
To: William Van Woensel <william.vanwoensel@gmail.com>; 'Gregg Kellogg' <gregg@greggkellogg.net>
Cc: public-n3-dev@w3.org
Subject: Re: Back from the W3C Workshop on Web Standardization for Graph Data

 

Hi William, all,


- a rule language should be kept simple (can be extended)

 

I think that having a lightweight notation (i.e., lacking quantifiers) is an especially interesting aspect of N3. It allows developers to use N3 at the level of complexity that suits their needs. Nevertheless, IMO this also means outfitting N3 with the necessary constructs to suit more complex use cases for developers who need them.

That is an interesting perspective. When I looked into the current uncertainties with N3, I always thought that a solution would be to get away from implicit quantification and only use explicit quantifiers. Of course, I only thought about formalisation and not about users. So, not supporting implicit quantification is not an option, noted.

You are right with the constructs. 

Additionally, we also need to fix the meaning of this implicit quantification. In the case of universals, I personally think it is more intuitive to quantify at the top level, but I think some kind of testing of what most users' intuition is could help us here.

 

 

- applications of rules are for example the alignment of different data sources using different ontologies, compliance checking (GDPR, fraud detection,...) or data validation




                Indeed, this is quite interesting. It would be great to collect some concrete use cases here (i.e., real-world examples of alignment problems that are resolved by rules).

 

How do we go further here? We already agreed on collecting concrete examples on git and I still haven't added that many (but I will). But should we additionally keep a list or something with links to concrete rule use cases (not necessarily N3)?

Does anyone in this group use rules (not necessarily N3) to map between vocabularies? Thinking about it: I think I do it all the time but didn't even realise it until it was mentioned in the workshop. So, to refine the question: does anyone have a use case where rules are only applied to align vocabularies, without further rule-based reasoning?

 

There was no uniformity of opinion on whether the rules should be expressed outside of the graph (à la SHACL/ShEx or SPARQL) or as an extension of the graph (à la N3). My personal opinion is that they should be expressed as part of the graph/dataset, which makes them more immediately available to a developer.

 

+1. Note that this is also in line with one of the major goals of (rule-based) reasoning, i.e., extending a small core ontology with implied knowledge (easier when embedding the rules directly in the ontology).

 

Both good arguments. Additional advantages of not completely separating rules from rdf triples I see are:

- it is possible to have a proof vocabulary (see http://www.w3.org/2000/10/swap/) in the language itself as we have in N3 

- it is possible to use the same mechanism to search for rules on a dataset as for  triples

- it is possible to generate new rules from existing triples or rules

 

Some personal takeaways after a quick read through the report:

 

David Booth: I want rules to be: 1. convenient and concise to write and read; and 2. easy to specify where and when I want to apply them.  I do not want to apply all rules to all data!  For example, I want to apply rules to only a particular named graph, or collection of named graphs.  And then I want to use those results in another named graph, for example, and compose results.

 

Having rules operate on graphs instead of RDF stores may also point towards the usefulness of e.g., scoped negation as failure and limiting the scope of quantifiers (i.e., to a particular graph) (?) We had a discussion on this on the GitHub page <https://github.com/w3c/N3/issues/9#issuecomment-458874667>.
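To make the "apply rules only to particular named graphs" idea a bit more concrete, here is a minimal Python sketch (not N3, and not taken from the workshop report). The dataset layout (a dict from graph name to a set of triples), the helper `apply_scoped`, the example rule `adult_rule` and all graph names are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Dataset = Dict[str, Set[Triple]]           # graph name -> triples (assumed layout)
Rule = Callable[[Set[Triple]], Set[Triple]]


def adult_rule(graph: Set[Triple]) -> Set[Triple]:
    # Hypothetical rule: if ?p :age ?n and n >= 18, then ?p rdf:type :Adult.
    return {(s, "rdf:type", ":Adult")
            for (s, p, o) in graph
            if p == ":age" and int(o) >= 18}


def apply_scoped(rule: Rule, dataset: Dataset,
                 sources: Iterable[str], target: str) -> Dataset:
    """Run `rule` over the union of the `sources` graphs only and
    add the conclusions to the `target` graph of a new dataset."""
    scope: Set[Triple] = set()
    for name in sources:
        scope |= dataset.get(name, set())
    result = dict(dataset)                  # shallow copy; input stays unchanged
    result[target] = set(dataset.get(target, set())) | rule(scope)
    return result


dataset: Dataset = {
    ":hr": {(":ann", ":age", "34")},
    ":school": {(":tim", ":age", "9")},
    ":other": {(":zoe", ":age", "50")},
}
derived = apply_scoped(adult_rule, dataset, [":hr", ":school"], ":derived")
```

In this sketch `:zoe` is never classified because `:other` is outside the chosen scope. A scoped negation-as-failure check would presumably work the same way, consulting only `scope` when deciding that something "does not hold", and results could be composed by using `:derived` as a source for a further rule.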

 

Ivan Herman: If people come to RDF, they will learn Turtle.  From Turtle to SPARQL is relatively easy, because it is based on the same syntax.   CONSTRUCT means that they are within the same syntax.  Problem w N3: too late.  It should have been done years ago, before SPARQL.  If we add n3 we force the user to learn another syntax.  If we cover 70%-80% of use cases with rules then that would be a good start.  Q: Is n3 really hard to learn?  A: Yes, n3 is different from SPARQL/Turtle.  Yet another obstacle.

 

Personally I don’t see it this way since N3 is a superset of RDF. One can start with “standard” RDF triples and, if they are so inclined, make their way up to N3 (or, even better, start out with the much more user-friendly N3 syntax with its support for lists, etc.). In their most basic form N3 rules are sets of triple patterns which is very comparable to SPARQL.

 

I fully support the notion of covering 70-80% of use cases with rules—but IMO this requires going beyond a simple rule language.

The "Q" in that case was me. I really can't believe that SPARQL is easier to learn than N3, and I also don't think that N3 is difficult to learn if you know Turtle. Maybe also something to be tested?

 

Kind regards,
Doerthe

 

 

 

William

-- 
Dörthe Arndt
Researcher Semantic Web
imec - Ghent University - IDLab | Faculty of Engineering and Architecture | Department of Electronics and Information Systems
Technologiepark-Zwijnaarde 122, 9052 Ghent, Belgium
t: +32 9 331 49 59 | e: doerthe.arndt@ugent.be <mailto:doerthe.arndt@ugent.be>
Received on Thursday, 28 March 2019 15:24:31 UTC