Re: RDF 1.1 Semantics Implementation Report on the Swertia RDF-Based Reasoner

Dear Working Group,

I have finally managed to apply the official WG test suite to my RDF 1.1 
Semantics reasoner implementation. I have added my original 
implementation report below, so this is now the complete implementation 
report.

You can find the resulting EARL file (in Turtle format) attached to this
mail. The same file and additional files, including the evaluated
reasoning software (only the parts of which I am the author), can be
found at

 
<https://sourceforge.net/projects/swertia/files/reports/evaluations/rdf11mt/>

To understand the results, I would like to mention that my reasoning 
software does not support datatype semantics, and in particular not 
datatypes beyond the RDF(S) built-in datatypes xsd:string and 
rdf:langString. Unfortunately, the majority of tests in the test suite
depend on datatype semantics. As a result, many tests failed, and even
for many of those that succeeded, this was by pure chance (typically for
positive satisfiability tests and negative entailment tests, for which
my reasoner happened to produce the expected result due to its lack of
support for datatype semantics).

Best regards,
Michael

On 03.12.2013 00:20, Michael Schneider wrote:
> Dear Working Group,
>
> please find below my implementation report for my experimental Swertia
> RDF-Based Reasoner, a system that tries to be a close implementation of
> the model-theoretic semantics of RDF (unlike the many existing systems
> that are more based on the RDF entailment rules). I still wasn't able to
> run the official RDF 1.1 tests, due to lack of time. I also believe that
> the results for the test suite will not be very good, as many of the
> tests are about datatype reasoning, which is not supported by my system.
> Anyway, I still plan to run the tests as soon as I find the time, and
> also plan to provide the results and the prototypical system, but for
> now I provide you with my implementation experiences only. I hope this
> will already be useful for the Working Group.
>
> Best regards,
> Michael
>
> = RDF 1.1 Semantics Implementation Report Swertia =
>
> Swertia [1], the Semantic Web Entailment Regime
> Translation and Inference Architecture, is intended
> to become a generic Semantic Web reasoning framework.
> The goal is to provide reasoning support for all major
> Semantic Web reasoning standards, including RDF(S),
> OWL 2 (Direct Semantics, RDF-Based Semantics, RL/RDF rules),
> SWRL, RIF (RIF BLD, RIF Core, RIF+RDF and RIF+OWL combinations),
> and Common Logic. Supported reasoning methods are
> entailment checking, consistency checking and query answering
> in the form of SPARQL entailment regimes. Internally,
> Swertia will not provide any reasoning capabilities itself
> but will provide all necessary means to enable the use of
> existing reasoners, such as first-order logic (FOL)
> theorem provers and model finders, to perform reasoning
> in the supported Semantic Web standards.
>
> The framework itself is still in an early phase and no
> initial release has been published. However, as part
> of Swertia, a prototypical reasoner implementation
> for reasoning in the RDF 1.0 semantics and the
> OWL 2 RDF-based semantics has existed for a while now.
> While not in wide use, the reasoner worked quite well
> for the experimental purposes of the author, and had
> been tested successfully with a comprehensive test suite
> for RDF-based reasoning [2], and used for evaluation work
> in a published paper [3].
>
> For the RDF 1.1 Semantics, an attempt was made to adapt
> the existing reasoner into a system that supports
> as much of RDF 1.1 as possible.
>
> == Overview of the Swertia Reasoner ==
>
> The reasoner is mainly a translator of RDF graphs
> into FOL formulae represented in the TPTP language [4],
> which is understood (directly or indirectly via
> translation tools) by the majority of existing
> FOL reasoning systems.
>
> The translator itself only translates the input RDF graphs
> (the premise and possibly a conjecture graph of an
> entailment checking task) into corresponding axiom
> and conjecture TPTP formulae, following RDF Simple semantics:
> IRIs are translated into constant terms, blank nodes into
> existential variable terms, literals into function terms
> (with different functions for plain, language-tagged and
> typed literals), triples into ternary predicates, and
> graphs into conjunctions of such predicates, with globally
> scoped existential quantifiers for all the blank nodes
> occurring in the graph.
>
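A minimal sketch of the translation described above, restricted to IRIs and blank nodes. The predicate name `t`, the variable names `B0`, `B1`, ..., and the representation of the input as string triples are assumptions made for illustration; they are not taken from the Swertia sources.

```python
def tptp_constant(iri: str) -> str:
    """Encode an IRI as a single-quoted TPTP constant."""
    escaped = iri.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def translate_graph(triples, name, role):
    """Translate (s, p, o) string triples into one TPTP fof formula.

    Nodes starting with "_:" are treated as blank nodes, everything
    else as an IRI. Literals are left out of this sketch.
    """
    variables = {}  # blank node label -> FOL variable name

    def term(node):
        if node.startswith("_:"):
            # Reuse the variable if this blank node was seen before.
            return variables.setdefault(node, f"B{len(variables)}")
        return tptp_constant(node)

    # Each triple becomes one ternary atom; the graph is their conjunction.
    atoms = [f"t({term(s)},{term(p)},{term(o)})" for s, p, o in triples]
    body = "(" + (" & ".join(atoms) if atoms else "$true") + ")"
    if variables:
        # One existential quantifier scoped over the whole graph.
        body = f"? [{','.join(variables.values())}] : {body}"
    return f"fof({name},{role},{body})."


print(translate_graph(
    [("_:x", "http://ex.org/p", "http://ex.org/a"),
     ("http://ex.org/a", "http://ex.org/q", "_:x")],
    "premise", "axiom"))
```

Both occurrences of `_:x` end up as the same variable `B0`, so the existential quantifier ranges over the whole conjunction rather than over a single triple.
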
> The semantics for the different entailment regimes are not
> treated by the RDF translator itself but, rather, the
> corresponding semantic conditions are directly modeled
> as sets of FOL axiom formulae (usually one formula
> per semantic condition).
>
> For reasoning, the axiom formulae that represent the
> semantic conditions of the respective entailment regime
> are combined with the formulae for the translated input
> graphs and given to an FOL theorem prover
> for entailment or inconsistency detection
> and to an FOL model finder for non-entailment
> or consistency detection. The final reasoning result
> is the combination of the results of the two systems.
>
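A rough sketch of this workflow for entailment checking. The function names are hypothetical, and the status strings follow the SZS naming commonly reported by TPTP-compatible systems ("Theorem" from a prover, "CounterSatisfiable" from a model finder); how the two systems are actually invoked is left out.

```python
def build_problem(condition_axioms, premise, conjecture=None):
    """Concatenate semantic-condition axioms and translated graphs
    into the text of one TPTP problem file."""
    parts = list(condition_axioms) + [premise]
    if conjecture is not None:
        parts.append(conjecture)
    return "\n".join(parts) + "\n"


def combine(prover_status, finder_status):
    """Merge the verdicts of the theorem prover and the model finder.

    A status of None stands for a timeout or an inconclusive run.
    """
    proved = prover_status == "Theorem"
    refuted = finder_status == "CounterSatisfiable"
    if proved and refuted:
        # The two systems contradict each other: at least one is wrong.
        raise ValueError("prover and model finder disagree")
    if proved:
        return "entailed"
    if refuted:
        return "not entailed"
    return "unknown"


print(combine("Theorem", None))             # prover succeeded
print(combine(None, "CounterSatisfiable"))  # model finder succeeded
```

Running both systems on the same problem matters because a theorem prover that gives up says nothing about non-entailment; only a counter-model found by the model finder does.
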
> == Support for Basic Semantic Conditions ==
>
> By "basic semantic conditions", I refer to all
> the semantic conditions that are not specifically
> about blank nodes, plain or tagged strings,
> or datatypes and typed literals (I will get
> to these aspects of the RDF 1.1 Semantics below),
> but including all the axiomatic triples
> for RDF and RDFS.
>
> For RDF 1.0 and OWL 2 Full, the Swertia RDF translator
> itself did not have any particular support for
> the basic semantic conditions. Rather, all these
> semantic conditions were represented by FOL formulae.
> In the past, all the basic semantic conditions
> of the RDF 1.0 entailment regimes Simple, RDF, and RDFS
> were easily translated into FOL formulae.
>
> For RDF 1.1, I went through all the semantic conditions
> of the new entailment regimes to see what needed to
> be changed. I found that hardly anything had changed
> from the point of view of semantic conditions,
> except for the order of entailment regimes.
> In fact, for RDF 1.1, all the original basic
> semantic conditions turned out to be there again.
> Hence, I was able to reuse all the original FOL formulae
> for the basic semantic conditions from the old
> implementation without change.
>
> == Support for Blank Nodes ==
>
> For RDF 1.0 and OWL 2 Full, the Swertia RDF translator
> mapped blank nodes into existential variables
> that are scoped over the whole target FOL formula.
> For this, the translator iterated over the input RDF graph,
> looked up all the occurring blank nodes,
> and produced a fresh FOL variable name for each
> new blank node, while for blank nodes that
> re-appeared in different positions of the graph,
> the corresponding FOL variable name was reused.
> This operation was technically easy to implement,
> takes at most n*log(n) time, for n the graph size
> (if, for example, a balanced tree representation is used),
> and requires up to linear-size space for the
> resulting mapping structure (which needs to
> be kept throughout the translation process).
>
> For RDF 1.1, nothing relevant had changed w.r.t.
> blank nodes that would have required a change
> of this treatment. Hence, there were no additional
> or new implementation issues compared to the old
> RDF revision, so the implementation was doable
> without problems for RDF 1.1.
>
> == Support for Plain and Language-Tagged Strings ==
>
> The original Swertia RDF translator came with specific
> support for plain and language-tagged literals
> in the translation output format TPTP.
> As the translator's input, the Model representation
> of the Jena framework [5] was used, which essentially
> provides an implementation of the RDF 1.0 Abstract Syntax.
> In particular, Jena Models have direct support
> for plain and language-tagged literals.
>
> For both kinds of plain literals, dedicated FOL function terms
> were used in the translation: Plain literals without a
> language tag were represented by unary function terms, with
> the literal's lexical form represented by a constant term
> uniquely encoding the string.
> Language-tagged literals were represented by binary function
> terms, where the first argument term was built like
> that for untagged plain literals, and the second argument term
> was a corresponding representation of the language tag
> as a constant term.
>
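A small sketch of the two literal encodings. The function symbols `lit_plain` and `lit_lang` are hypothetical names chosen for illustration, and lower-casing the language tag is an assumption of this sketch (language tags compare case-insensitively), not something stated in the report.

```python
def quote(text: str) -> str:
    """Encode a string as a single-quoted TPTP constant."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def plain_literal_term(lexical_form: str) -> str:
    """Untagged plain literal -> unary function term."""
    return f"lit_plain({quote(lexical_form)})"


def lang_literal_term(lexical_form: str, language_tag: str) -> str:
    """Language-tagged literal -> binary function term."""
    # Assumption: normalise the tag so that "EN" and "en" coincide.
    return f"lit_lang({quote(lexical_form)},{quote(language_tag.lower())})"


print(plain_literal_term("foo"))
print(lang_literal_term("foo", "EN"))
```

Using distinct function symbols keeps an untagged "foo" and a tagged "foo"@en syntactically apart, so a prover cannot confuse them unless an axiom explicitly relates the two functions.
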
> For RDF 1.1, it was an obvious idea to use the same
> FOL functions for representing strings and language-tagged
> strings in the FOL output, because their interpretations
> (or values) are the same as those of RDF 1.0 plain and
> language-tagged literals, respectively. However, I was
> unclear what to expect from the input format for the
> translation, specifically in the case of language-tagged
> strings.
>
> I understand that concrete RDF serialization syntaxes
> are free to represent language tagged strings as they like
> (including the old tagged plain literal format).
> What I do not understand is how they are represented
> in the abstract RDF 1.1 model. After all, if I use a
> framework like Jena, I have to rely on the parsing
> from the concrete syntax into the internal representation
> model, and I am unclear what will happen for
> language-tagged literals. If Jena parses them into
> the old representation for language-tagged literals,
> then nothing would need to be changed in my implementation.
> However, if they are mapped into something else,
> I would need to change my translator software
> as well.
>
> According to §3.3 of the "Concepts and Abstract Syntax"
> document, "a literal is a language-tagged string
> if and only if its datatype IRI is rdf:langString,
> and only in this case the third element is present:..."
> I am not sure if I really understand this. So far,
> my guess was that a language-tagged string would
> be a typed literal, where the lexical form is composed
> of the "plain" lexical form, the "@" sign, and then
> the language tag, i.e.:
>
>      ( "foo@en" , rdf:langString )
>
> But the above definition sounds to me more as if a
> language-tagged string is a /triple/ consisting of
> (1) the lexical form /without/ the language tag; and
> (2) the language tag; and
> (3) the datatype IRI rdf:langString, i.e.
>
>      ( "foo", "en", rdf:langString )
>
> It would be good to clarify the situation to make
> it easier for implementers to decide how to support
> language-tagged strings.
>
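Until this is settled, a translator could accept both readings and reduce them to the same (lexical form, language tag) pair before building the binary function term. The following is only a sketch under that assumption; the tuple layouts mirror the two examples above and the helper name is hypothetical.

```python
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


def split_lang_literal(literal):
    """Return (lexical_form, language_tag) for either reading.

    Accepts ("foo@en", rdf:langString) or ("foo", "en", rdf:langString).
    """
    if not literal or literal[-1] != RDF_LANGSTRING:
        raise ValueError("not a language-tagged string")
    if len(literal) == 3:
        lexical_form, language_tag, _ = literal
        return lexical_form, language_tag
    if len(literal) == 2:
        # Language tags never contain "@", so split at the last one.
        lexical_form, sep, language_tag = literal[0].rpartition("@")
        if not sep or not language_tag:
            raise ValueError("missing language tag")
        return lexical_form, language_tag
    raise ValueError("unexpected number of elements")


print(split_lang_literal(("foo@en", RDF_LANGSTRING)))
print(split_lang_literal(("foo", "en", RDF_LANGSTRING)))
```
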
> I, for now, decided to stick with the original implementation,
> which mapped Jena representations of language-tagged literals
> into binary function terms. Therefore, no changes have
> been made so far.
>
> == Support for Datatypes and Typed Literals ==
>
> The most obvious deficit of my original translator
> was its almost complete lack of support
> for datatype semantics, as support for datatypes
> has not yet been of much relevance for my work.
> Nevertheless, there have always been plans to support
> some level of datatype reasoning, and some initial
> ideas have been developed. I definitely want to
> support datatypes in the future, because without
> datatype support, at least for rudimentary types
> like integer numbers, the system, while appropriate
> for some experimental work, will not be of much
> practical use.
>
> For RDF 1.1, given the short time for the
> Call-for-Implementation phase, I have not made
> any effort to support datatypes in the RDF 1.1
> implementation. But at least I have checked for
> changes in the RDF 1.1 specification concerning
> datatypes that would have an effect on datatype
> reasoning, in order to be sure that I will not
> meet problems in the future that would have been
> avoidable. This is not only relevant for RDF 1.1,
> which provides pretty rudimentary datatype semantics,
> but also for expressive semantic extensions,
> such as OWL 2 Full.
>
> The obvious way to start was to compare the original
> RDF 1.0 semantics with the new semantics w.r.t.
> datatypes. If there were no or only marginal
> changes, this would mean that an implementation
> that works for RDF 1.0 should not meet too
> many surprises as an implementation for RDF 1.1.
> Or put differently: any big problems with the
> RDF 1.1 semantics would already have been problems
> for RDF 1.0.
>
> Comparing the two semantics, it became clear that,
> apart from some reordering of the semantic conditions
> due to the reordering of the entailment regimes,
> the semantics remained technically almost identical:
> essentially the same semantic conditions that were
> present in chapter 5 of the old specification,
> on datatypes, were again present in the new spec,
> although spread over different places.
>
> The only problem that I found was with the new notion
> of "identified datatypes":
> In the original spec, the notion of a datatype map
> was that of a set of pairs, which stated associations
> between URIs and the corresponding datatypes.
> So, for example, if a semantic extension of
> RDF 1.0 D-entailment was meant to include the
> xsd:integer datatype, one was able to state that
> the datatype map D contained the pair consisting
> of the URI "xsd:integer" with the particular datatype
> of integers as defined in the XSD Datatypes spec.
> For implementations, this would make the situation
> sufficiently clear. In RDF 1.1, we only get a
> set of datatype IRIs, and the actual association
> with concrete datatypes is not directly supported.
> So an implementation of a particular semantic extension
> of RDF 1.1 needs to somehow find out what the
> associations are. Of course, the definition of a
> particular semantic extension would specify the
> identified datatypes for the identifying IRIs
> in SOME way, but in any case it WILL
> have to say what the association is; otherwise
> it would be impossible for an implementation to ever
> become compliant. In other words, there must
> _always_ be such an association in order to be
> useful, because a mere set of IRIs can be
> interpreted in any arbitrary way. Therefore, the
> RDF spec should, as it did in the past, and
> as several other W3C standards on top of it
> (such as RIF, SPARQL 1.1, and OWL 2) do, support
> this idea directly in terms of a set of
> associations, not only as a set of IRIs alone!
>
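The difference can be made concrete with a small sketch. The `Datatype` class and its fields are hypothetical, and the integer handling is only a rough stand-in for xsd:integer, not a faithful implementation of the XSD definition.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass(frozen=True)
class Datatype:
    lexical_pattern: str             # regular expression for the lexical space
    to_value: Callable[[str], Any]   # lexical-to-value mapping

    def value(self, lexical_form: str) -> Any:
        if re.fullmatch(self.lexical_pattern, lexical_form) is None:
            raise ValueError(f"ill-typed literal: {lexical_form!r}")
        return self.to_value(lexical_form)


# RDF 1.0 style: a datatype map pairs each IRI with its datatype.
datatype_map = {XSD_INTEGER: Datatype(r"[+-]?[0-9]+", int)}

# RDF 1.1 style: only the set of recognized IRIs is given.
recognized_iris = {XSD_INTEGER}


def literal_value(lexical_form, datatype_iri, dmap):
    """Compute a typed literal's value; this needs the associations."""
    return dmap[datatype_iri].value(lexical_form)


print(literal_value("042", XSD_INTEGER, datatype_map))
print(XSD_INTEGER in recognized_iris)
```

With `datatype_map`, the value of a typed literal can be computed; with `recognized_iris` alone, an implementation only learns that the IRI is recognized and must obtain the lexical-to-value mapping from somewhere else.
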
> == Conclusions ==
>
> For the most part, the adaptation of the existing
> RDF translator was straightforward and little
> had to be done. There was some confusion about
> the representation of language-tagged strings,
> specifically what their real representation is
> in the RDF 1.1 abstract syntax. The specification
> should be clearer about this. As the original
> RDF translator did not offer explicit support for
> datatype semantics, and there was only very little
> time given by the CfI, I decided not to make
> any implementation effort for datatype semantics,
> and only to have a look at what /would/ have to change
> if I had datatype support. It turned out that
> technically the semantics has not changed much.
> However, one problem (not so much for RDF(S),
> but for more expressive systems with more datatypes)
> would in my opinion be that the RDF 1.1
> semantics does not support stating explicit associations
> between datatype IRIs and the corresponding datatypes,
> but leaves it to other specifications to find a way
> to specify these relationships. I consider this
> to be a problem, and it is definitely a deviation
> from the original RDF specification that should
> not be made.
>
> == References ==
>
> [1] Swertia Home: http://swertia.sourceforge.net/
> (does not contain any sources or binaries currently)
>
> [2] Schneider, M., Mainzer, K.: A Conformance Test Suite
> for the OWL 2 RL/RDF Rules Language and the
> OWL 2 RDF-Based Semantics. In: Proceedings of the 6th
> International Workshop on OWL: Experiences and Directions
> (OWLED 2009). CEUR Workshop Proceedings, vol. 529 (2009)
>
> [3] Schneider, M., Sutcliffe, G.: Reasoning in
> the OWL 2 Full Ontology Language using First-Order Automated
> Theorem Proving. In: Proceedings of the 23rd International
> Conference on Automated Deduction (CADE 2011). LNAI, vol. 6803,
> pp. 446-460 (2011)
>
> [4] TPTP Home and language specification: http://tptp.org/
>
> [5] Jena Home: http://jena.apache.org/

Received on Saturday, 4 January 2014 17:29:52 UTC