- From: Kjetil Kjernsmo <kjetil@kjernsmo.net>
- Date: Wed, 29 Jan 2014 20:30:20 +0100
- To: void-discussion@googlegroups.com
- Cc: HCLS IG <public-semweb-lifesci@w3.org>
On Wednesday 29. January 2014 15.05.14 Richard Cyganiak wrote:
> Less is probably more there. Unless you have a very concrete need for the
> more complex constructs there (e.g., you have a federation framework that
> requires exactly those statistics), then I'd recommend sticking to the
> simplest constructs. If there is a particular number you want to include
> that cannot be expressed with a simple VoID property, it may be better to
> introduce a new property.
>
> I say this because the more complex constructs (e.g., clever stuff with
> class and property partitions) tend to go unused and can be misleading.

So, just a quick note from me too, as I'm doing some clever data profiling
stuff for my PhD. ;-)

Most of the statistics proposed here are useful for federation, as shown by
Olaf Görlitz et al. in their SPLENDID paper. However, as I'm computing them
in my own code, I can only note that they are pretty heavy to compute, and
indeed, it is quite unlikely that data providers will do so unless they
have a very compelling reason.

I've seen that in the last few days, Philip Stutz has been implementing
cardinality caching in the TripleRush triple store. That's one case where
such statistics are likely to be provided, since they become much more
affordable to produce. See https://github.com/uzh/triplerush

Another case where they are likely to exist is when the statistics are used
for internal optimizations.

For all others, I think the key is to argue for *why* a certain piece of
information is important to expose, keeping in mind that it is possibly
demanding to produce. Just an IG recommendation is unlikely to suffice, I
suspect; it would have to be of the form "to enable $foo, expose $bar".

Cheers,

Kjetil
Received on Wednesday, 29 January 2014 19:31:07 UTC