Re: [Patterns] Materialize Inferences (was Re: Triple materialization at publisher level) from Leigh Dodds on 2010-04-10 (public-lod@w3.org from April 2010)

From: Leigh Dodds <leigh.dodds@talis.com>
Date: Sat, 10 Apr 2010 11:03:34 +0100
To: Dan Brickley <danbri@danbri.org>
Cc: public-lod <public-lod@w3.org>
Message-ID: <u2zf323a4471004100303o2651a600kc52267527f3dac07@mail.gmail.com>

Hi,

On 7 April 2010 13:45, Dan Brickley <danbri@danbri.org> wrote:
> This is indeed a good question, and one whose answer is necessarily a
> delicate balance of tradeoffs.

Yes!

> [snip]
>> Linked Data can be consumed by a wide variety of different client
>> applications and libraries. Not all of these will have ready access to
>> an RDFS or OWL reasoner, e.g. Javascript libraries running within a
>> browser or mobile devices with limited processing power. How can a
>> publisher provide access to data which can be inferred from the
>> triples they are publishing?
>
> You might also mention bandwidth here, and the tension with mobile
> devices that if RDF is chunked too heavily, they could find themselves
> consuming a lot of resources (both bandwidth and CPU) that could
> impact end-user responsiveness of an app; particularly if Linked Data
> retrievals are happening in real time.

Good point.

It's occurred to me before that a schema or ontology, when used in
conjunction with a reasoner, is (if you squint) a form of data
compression: we can encode less as raw assertions, and "unpack" the
data at the receiving end by applying the schema and a reasoner.
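
To make the "unpacking" idea a bit more concrete, here is a minimal
Python sketch. It assumes a toy schema containing only direct
subclass links; the ex: class names and the unpack() helper are made
up for illustration, and no real reasoner is involved.

```python
# Toy schema: each class maps to its direct superclass.
# All ex: names are hypothetical, chosen only for illustration.
SUBCLASS_OF = {
    "ex:VeganRestaurant": "ex:Restaurant",
    "ex:Restaurant": "ex:EatingEstablishment",
}

def unpack(triples, schema=SUBCLASS_OF):
    """Return the asserted triples plus every rdf:type triple
    implied by following subclass links in the schema."""
    result = set(triples)
    for s, p, o in triples:
        if p != "rdf:type":
            continue
        seen = {o}  # guards against cycles in the schema
        parent = schema.get(o)
        while parent is not None and parent not in seen:
            result.add((s, "rdf:type", parent))
            seen.add(parent)
            parent = schema.get(parent)
    return result

# The publisher sends one triple; the receiver "unpacks" the rest.
asserted = {("ex:x", "rdf:type", "ex:VeganRestaurant")}
unpacked = unpack(asserted)
```

Here one asserted triple travels over the wire and the receiver
derives the other two type triples locally, given that it already
holds (or has cached) the schema.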

So by publishing Linked Data and suitable vocabularies we can reduce
the amount of data copied (pass by reference) and also make those
interactions more streamlined.

Of course that's the exact opposite of materialising inferences!

Has anyone explored that particular idea?


>> SOLUTION
>>
>> Publish both the original and inferred (materialized) triples within
>> the Linked Data.
>
> Suggest s/and inferred/and some inferred/
>
> This is where the delicate tradeoffs come into play, and where we
> would all benefit if there were conventions for documenting the
> information needs (eg. SPARQL templates) of consuming apps.

Agree.

> If the raw data tells us that _x is a VeganRestaurant, it is probably
> worth also materialising that it is a Restaurant. And probably perhaps
> typically maybe also worth saying it is an Eating or Recreational
> establishment. How many levels or semi-equivalent classes to mention
> here would depend on (a) which popular vocabs have suitable vocabulary
> (b) what consuming apps there are for these kinds of things, and what
> data patterns those apps expect to match. Beyond these mid-level
> concepts, we move towards a level of abstraction where it becomes
> increasingly unlikely that code and services will care. Knowing that
> something is a geo:SpatialThing is not very useful. Knowing also its
> geo:lat and geo:long and some display-oriented properties makes it
> much more useful. Similarly with dcterms:Agent or foaf:Agent; unless
> you have more info, a foaf:Agent could be a bit of software, an animal
> (alive or dead), a historical figure, a Group, etc etc. So I think the
> decision whether or not to publish the inferred type will be quite
> heavily contextual.
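
One way to picture the "some inferred" tradeoff is the rough Python
sketch below. It assumes the publisher already knows the full list of
superclasses for a type and keeps a hand-picked set of classes that
consuming apps are thought to care about. The ex: names, the
superclass list and the materialize_some() helper are all assumptions
made for illustration, not part of any vocabulary or library.

```python
# Full list of inferable superclasses for each asserted type.
# The ex: classes and this particular hierarchy are hypothetical.
SUPERCLASSES = {
    "ex:VeganRestaurant": [
        "ex:Restaurant",
        "ex:EatingEstablishment",
        "geo:SpatialThing",
        "owl:Thing",
    ],
}

# Mid-level classes that consuming apps are assumed to match on.
WORTH_PUBLISHING = {"ex:Restaurant", "ex:EatingEstablishment"}

def materialize_some(subject, asserted_type,
                     superclasses=SUPERCLASSES,
                     wanted=WORTH_PUBLISHING):
    """Return the asserted type triple plus only those inferred
    type triples whose class is in the 'wanted' set."""
    triples = [(subject, "rdf:type", asserted_type)]
    for cls in superclasses.get(asserted_type, []):
        if cls in wanted:
            triples.append((subject, "rdf:type", cls))
    return triples

published = materialize_some("_:x", "ex:VeganRestaurant")
```

With these assumed inputs the very abstract types (geo:SpatialThing,
owl:Thing) are left for a reasoner to derive, while the mid-level
ones are published directly. Which classes belong in the "wanted"
set is exactly the contextual decision described above.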

Along these lines, Jeni Tennison published some thoughts on including
"Derived Data" in a recent blog post:

http://www.jenitennison.com/blog/node/139

What do people think of that advice w.r.t. this pattern?

> ...
> Ok that's not very friendly text but hope it might be useful.
> Basically "rdf:type owl:Thing" is boring, but "owl:sameAs x:anotherID"
> is very useful...

Thanks Dan, I'll try and work in some better advice.

Cheers,

L.
-
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.dodds@talis.com
http://www.talis.com

Received on Saturday, 10 April 2010 10:04:07 UTC