[Patterns] Materialize Inferences (was Re: Triple materialization at publisher level)

Hi,

Vasiliy asks an excellent question below about publishing of inferred
data. This happens to be one of the patterns on my short-list, so I
thought I'd share a draft definition here to seek comments and develop
the discussion. But I'm also interested to explore whether a focused
discussion on this list is a good way to mine for extra patterns. I've
amended the subject to clarify things. Let me know what you think.

This one is a Publishing Pattern.

--
PATTERN

Materialize Inferences

PROBLEM

How can data be published for use by clients with limited reasoning
capabilities?

CONTEXT

Linked Data can be consumed by a wide variety of different client
applications and libraries. Not all of these will have ready access to
an RDFS or OWL reasoner, e.g. Javascript libraries running within a
browser or mobile devices with limited processing power. How can a
publisher provide access to data which can be inferred from the
triples they are publishing?

SOLUTION

Publish both the original and inferred (materialized) triples within
the Linked Data.

EXAMPLE

Inferred types; transitive relations for SKOS vocabularies
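
As a rough illustration of the inferred-types case, here is a minimal
sketch in plain Python. It assumes triples are held as simple
(subject, predicate, object) tuples of prefixed names and that only
rdfs:subClassOf entailment is wanted; the function name
materialize_types is hypothetical, and a real publisher would more
likely use an RDF library or a reasoner.

```python
# Sketch only: materialize the rdf:type triples implied by rdfs:subClassOf.
# Triples are plain (subject, predicate, object) tuples of prefixed names.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def materialize_types(triples):
    """Return the input triples plus the rdf:type triples they imply."""
    triples = set(triples)

    # Map each class to its directly asserted superclasses.
    parents = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents.setdefault(s, set()).add(o)

    result = set(triples)
    for s, p, o in triples:
        if p != RDF_TYPE:
            continue
        # Walk up the class hierarchy, guarding against cycles.
        pending, seen = [o], {o}
        while pending:
            cls = pending.pop()
            for sup in parents.get(cls, ()):
                if sup not in seen:
                    seen.add(sup)
                    pending.append(sup)
                    result.add((s, RDF_TYPE, sup))
    return result


# Data from Vasiliy's example, plus the subclass axiom from FOAF.
original = {
    ("foaf:Person", SUBCLASS_OF, "foaf:Agent"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
}
published = materialize_types(original)
```

Here published holds the two original triples plus the inferred
statement that ex:bob is a foaf:Agent.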

RATIONALE

Reasoners are not as widely deployed as client libraries for accessing
RDF. Even as deployment spreads there will typically be processing or
performance constraints that limit the ability of a consuming
application to perform reasoning over retrieved data. By also
publishing materialized triples a publisher can better support clients
in consuming their data.

Most commonly, materialization of the inferred triples would happen
through application of a reasoner to the publisher's data. However, a
limited amount of materialized data can easily be included in Linked
Data views through simple static publishing of the extra relations,
e.g. by adding extra "redundant" statements in a template.

Materialization may also be targeted in some way, e.g. to address
specific application needs, rather than publishing the full set of
inferred relations. For example, the publisher of a SKOS vocabulary may
publish transitive relations between SKOS concepts, but opt not to
include additional properties (e.g. that every skos:prefLabel is also
an rdfs:label).
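
Targeted materialization of this kind might look something like the
sketch below. It relies on the fact that SKOS defines skos:broader as
a sub-property of the transitive skos:broaderTransitive, and it
deliberately adds nothing else (no rdfs:label triples). The tuple
representation, the function name and the ex: concept names are
assumptions made for illustration only.

```python
# Sketch only: add skos:broaderTransitive triples implied by skos:broader,
# and nothing else. Triples are (subject, predicate, object) tuples.

BROADER = "skos:broader"
BROADER_TRANSITIVE = "skos:broaderTransitive"


def materialize_broader_transitive(triples):
    """Return the input triples plus the implied broaderTransitive links."""
    triples = set(triples)

    # Index the directly asserted broader concepts for each concept.
    broader = {}
    for s, p, o in triples:
        if p == BROADER:
            broader.setdefault(s, set()).add(o)

    inferred = set()
    for concept in broader:
        # Visit every concept reachable via skos:broader, once each.
        pending = list(broader[concept])
        seen = set()
        while pending:
            ancestor = pending.pop()
            if ancestor in seen:
                continue
            seen.add(ancestor)
            inferred.add((concept, BROADER_TRANSITIVE, ancestor))
            pending.extend(broader.get(ancestor, ()))
    return triples | inferred


# Hypothetical concept scheme.
vocabulary = {
    ("ex:jellyfish", "skos:prefLabel", "Jellyfish"),
    ("ex:jellyfish", BROADER, "ex:cnidaria"),
    ("ex:cnidaria", BROADER, "ex:animals"),
}
published_vocabulary = materialize_broader_transitive(vocabulary)
```

The result links ex:jellyfish to both ex:cnidaria and ex:animals via
skos:broaderTransitive, while the label is left exactly as published.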

The downside to publishing materialized triples is that there is no
way for the consuming system to differentiate between the original and
the inferred data. This limits the ability of the client to access
only the raw data, e.g. in order to apply some local inferencing
rules. This is an important consideration as publishers and consumers
may have very different requirements. Clearly, materializing triples
also places additional burdens on the publisher.

An alternative approach is to publish the materialized data in some
other way, e.g. in one or more separate documents referenced by a
See Also link.
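
One way this separation could be sketched, assuming the See Also link
is expressed as an rdfs:seeAlso triple and that triples are plain
tuples as before. The function name and the document URIs are
hypothetical; this is not a description of how any particular
publisher does it.

```python
# Sketch only: keep inferred triples out of the main document and point
# to them with rdfs:seeAlso. All URIs below are hypothetical.

SEE_ALSO = "rdfs:seeAlso"


def split_for_publication(original, materialized, doc_uri, inferred_doc_uri):
    """Return (main document triples, inferred-only document triples)."""
    original = set(original)
    # Anything the reasoner added that was not asserted by the publisher.
    inferred_only = set(materialized) - original
    # The main document carries the raw data plus a link to the extras.
    main_doc = original | {(doc_uri, SEE_ALSO, inferred_doc_uri)}
    return main_doc, inferred_only


raw = {("ex:bob", "rdf:type", "foaf:Person")}
with_inferences = raw | {("ex:bob", "rdf:type", "foaf:Agent")}

main_doc, inferred_doc = split_for_publication(
    raw,
    with_inferences,
    "http://example.org/doc/bob",
    "http://example.org/doc/bob/inferred",
)
```

A client that wants only the raw data reads main_doc and ignores the
link; a client without a reasoner can follow it to fetch inferred_doc.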

--

I think this captures the essence of what Vasiliy describes, and
recognises the approach that Sindice have taken with publishing the
materialised data separately.

Cheers,

L.

On 6 April 2010 19:58:53 UTC+1, Vasiliy Faronov <vfaronov@gmail.com> wrote:
> Hi,
>
> The announcement of the Linked Data Patterns book[1] prompted me to
> raise this question, which I haven't yet seen discussed on its own. If
> I'm missing something, please point me to the relevant archives.
>
> The question is: should publishers of RDF data explicitly include
> (materialize) triples that are implied by the ontologies or rules used;
> and if yes, to what extent?
>
> For example, should it be
>        exspecies:14119 skos:prefLabel "Jellyfish" .
>        ex:bob a foaf:Person .
> or
>        exspecies:14119 skos:prefLabel "Jellyfish" ;
>                rdfs:label "Jellyfish" .
>        ex:bob a foaf:Person , foaf:Agent .
> ?
>
> The reason I find this worthy of attention is because there seems to be
> a gap between simple RDF processing and reasoning. It's easy to find an
> RDF library for your favourite language, fill a graph with some data and
> do useful things with it, but it's somewhat harder to set up proper
> RDFS/OWL reasoning over it, not to mention the added requirements for
> computational power.
>
> I think this is one area where a general "best practice" or design
> pattern can be developed.
>
> [1] http://patterns.dataincubator.org/book/
>
> --
> Vasiliy Faronov

-- 
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.dodds@talis.com
http://www.talis.com

Received on Wednesday, 7 April 2010 08:55:42 UTC