Re: [Patterns] Materialize Inferences (was Re: Triple materialization at publisher level) from Giovanni Tummarello on 2010-04-07 (public-lod@w3.org from April 2010)

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Wed, 7 Apr 2010 12:27:22 +0200
To: Leigh Dodds <leigh.dodds@talis.com>
Cc: public-lod <public-lod@w3.org>, Vasiliy Faronov <vfaronov@gmail.com>
Message-ID: <s2k210271541004070327sac5e9b8bm71fd6beb304e2fa5@mail.gmail.com>
My biggest concern with this text is its complexity.

We get it, but what about people who don't have many years of
specific interest in this? :/

In PRACTICE I don't know what the commonly shared belief is, but I think
RDFa embedding or microdata will become the de facto standard for everyone
(I recently heard a talk by MusicBrainz, for example, in which they said
"yeah, we'll add RDFa, why not, it's easy", and this is more or less the
common thinking).

In this case materialization is likely not going to happen much (you
wouldn't want to materialize inside something visible to the end user,
etc.).

It would certainly help to have a list of clients and applications
that will/will not require materialization. E.g. Sindice and
http://Sig.ma not needing it, but say Tabulator and other libraries
needing it. Then one could decide if it's worth the extra effort (which
is definitely not negligible).

Gio


On Wed, Apr 7, 2010 at 10:55 AM, Leigh Dodds <leigh.dodds@talis.com> wrote:
> Hi,
>
> Vasiliy asks an excellent question below about publishing of inferred
> data. This happens to be one of the patterns on my short-list, so I
> thought I'd share a draft definition here to seek comments and develop
> the discussion. But I'm also interested to explore whether a focused
> discussion on this list is a good way to mine for extra patterns. I've
> amended the subject to clarify things. Let me know what you think.
>
> This one is a Publishing Pattern.
>
> --
> PATTERN
>
> Materialize Inferences
>
> PROBLEM
>
> How can data be published for use by clients with limited reasoning
> capabilities?
>
> CONTEXT
>
> Linked Data can be consumed by a wide variety of different client
> applications and libraries. Not all of these will have ready access to
> an RDFS or OWL reasoner, e.g. Javascript libraries running within a
> browser or mobile devices with limited processing power. How can a
> publisher provide access to data which can be inferred from the
> triples they are publishing?
>
> SOLUTION
>
> Publish both the original and inferred (materialized) triples within
> the Linked Data.
>
> EXAMPLE
>
> Inferred types; transitive relations for SKOS vocabularies
>
> RATIONALE
>
> Reasoners are not as widely deployed as client libraries for accessing
> RDF. Even as deployment spreads there will typically be processing or
> performance constraints that may limit the ability for a consuming
> application to perform reasoning over some retrieved data. By also
> publishing materialized triples a publisher can better support clients
> in consuming their data.
>
> Most commonly, materialization of the inferred triples would happen
> through application of a reasoner to the publisher's data. However a
> limited amount of materialized data can easily be included in Linked
> Data views through simple static publishing of the extra relations.
> E.g. by adding extra "redundant" statements in a template.
>
> Materialization may also be targeted in some way, e.g. to address
> specific application needs, rather than publish the full set of
> inferred relations. For example the publisher of a SKOS vocabulary may
> publish transitive relations between SKOS concepts, but opt not to
> include additional properties (e.g. that every skos:prefLabel is also
> an rdfs:label)
>
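A minimal sketch of the targeted SKOS case described above, in plain
Python with tuples standing in for triples (no RDF library is assumed).
The concept names and the function name broader_transitive are
hypothetical. It computes only the transitive skos:broader relations and
deliberately adds no rdfs:label triples.

```python
def broader_transitive(broader):
    """Map each concept to every ancestor reachable via skos:broader."""
    closure = {}
    for concept in broader:
        seen = set()
        stack = list(broader[concept])
        while stack:
            parent = stack.pop()
            if parent not in seen:  # the seen set also guards against cycles
                seen.add(parent)
                stack.extend(broader.get(parent, ()))
        closure[concept] = seen
    return closure


# Hypothetical concept scheme: jellyfish -> cnidaria -> animals
broader = {
    "ex:jellyfish": {"ex:cnidaria"},
    "ex:cnidaria": {"ex:animals"},
}

# The extra triples a publisher might choose to add to its Linked Data views.
extra = [
    (concept, "skos:broaderTransitive", ancestor)
    for concept, ancestors in sorted(broader_transitive(broader).items())
    for ancestor in sorted(ancestors)
]
```

Which relations to close over is the publisher's choice; this sketch
only illustrates that the selection can be narrower than what a full
reasoner would produce.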
> The downside to publishing of materialized triples is that there is no
> way for the consuming system to differentiate between the original and
> the inferred data. This limits the ability for the client to access
> only the raw data, e.g. in order to apply some local inferencing
> rules. This is an important consideration as publishers and consumers
> may have very different requirements. Clearly materializing triples
> also places additional burdens on the publisher.
>
> An alternative approach is to publish the materialized data in some
> other way, e.g. in one or more separate documents referenced by a
> See Also link.
>
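The "separate document" alternative could be sketched roughly as
follows, again with plain tuples as triples. The function name
split_for_publishing and both document URIs are hypothetical; the only
point is that the asserted triples stay untouched and the inferred ones
end up in a second set linked via rdfs:seeAlso.

```python
def split_for_publishing(asserted, inferred, doc_uri, inferred_doc_uri):
    """Return (main document triples, inferred-only document triples)."""
    main = set(asserted)
    # Point clients that want the extra triples at the second document.
    main.add((doc_uri, "rdfs:seeAlso", inferred_doc_uri))
    # Keep only triples that were not already asserted.
    inferred_only = set(inferred) - set(asserted)
    return main, inferred_only


asserted = {("ex:bob", "rdf:type", "foaf:Person")}
inferred = {
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "rdf:type", "foaf:Agent"),
}
main_doc, inferred_doc = split_for_publishing(
    asserted,
    inferred,
    "http://example.org/doc/bob",           # hypothetical document URI
    "http://example.org/doc/bob/inferred",  # hypothetical document URI
)
```

A client that wants only the raw data reads the main document and
ignores the link; a client without a reasoner follows it.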
> --
>
> I think this captures the essence of what Vasiliy describes, and
> recognises the approach that Sindice have taken with publishing the
> materialised data separately.
>
> Cheers,
>
> L.
>
> On 6 April 2010 19:58:53 UTC+1, Vasiliy Faronov <vfaronov@gmail.com> wrote:
>> Hi,
>>
>> The announcement of the Linked Data Patterns book[1] prompted me to
>> raise this question, which I haven't yet seen discussed on its own. If
>> I'm missing something, please point me to the relevant archives.
>>
>> The question is: should publishers of RDF data explicitly include
>> (materialize) triples that are implied by the ontologies or rules used;
>> and if yes, to what extent?
>>
>> For example, should it be
>>        exspecies:14119 skos:prefLabel "Jellyfish" .
>>        ex:bob a foaf:Person .
>> or
>>        exspecies:14119 skos:prefLabel "Jellyfish" ;
>>                rdfs:label "Jellyfish" .
>>        ex:bob a foaf:Person , foaf:Agent .
>> ?
>>
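To make the two variants above concrete, here is a minimal sketch in
plain Python (tuples as triples, no RDF library assumed) that derives
the second form from the first. The two rule tables are hand-written
assumptions standing in for the vocabularies: SKOS declares
skos:prefLabel a sub-property of rdfs:label, and FOAF declares
foaf:Person a subclass of foaf:Agent. A real reasoner would read these
from the ontologies rather than from a dict.

```python
SUBPROPERTY_OF = {"skos:prefLabel": "rdfs:label"}  # assumed rule table
SUBCLASS_OF = {"foaf:Person": "foaf:Agent"}        # assumed rule table


def materialize(triples):
    """Return the input triples plus those implied by the two rule tables."""
    result = set(triples)
    changed = True
    while changed:  # repeat until no rule adds a new triple
        changed = False
        for s, p, o in list(result):
            new = set()
            if p in SUBPROPERTY_OF:
                new.add((s, SUBPROPERTY_OF[p], o))
            if p == "rdf:type" and o in SUBCLASS_OF:
                new.add((s, "rdf:type", SUBCLASS_OF[o]))
            if not new <= result:
                result |= new
                changed = True
    return result


published = {
    ("exspecies:14119", "skos:prefLabel", "Jellyfish"),
    ("ex:bob", "rdf:type", "foaf:Person"),
}
materialized = materialize(published)
```

Here materialized holds the two original triples plus the rdfs:label and
foaf:Agent ones, i.e. the second form in the question.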
>> The reason I find this worthy of attention is because there seems to be
>> a gap between simple RDF processing and reasoning. It's easy to find an
>> RDF library for your favourite language, fill a graph with some data and
>> do useful things with it, but it's somewhat harder to set up proper
>> RDFS/OWL reasoning over it, not to mention the added requirements for
>> computational power.
>>
>> I think this is one area where a general "best practice" or design
>> pattern can be developed.
>>
>> [1] http://patterns.dataincubator.org/book/
>>
>> --
>> Vasiliy Faronov
>
> Cheers,
>
> L.
>
> --
> Leigh Dodds
> Programme Manager, Talis Platform
> Talis
> leigh.dodds@talis.com
> http://www.talis.com
>
Received on Wednesday, 7 April 2010 10:27:58 UTC