- From: Ivan Mikhailov <imikhailov@openlinksw.com>
- Date: Wed, 07 Jul 2010 11:08:13 +0700
- To: Antoine Zimmermann <antoine.zimmermann@deri.org>
- Cc: public-lod@w3.org, Semantic Web <semantic-web@w3.org>
Antoine, all,

On Tue, 2010-07-06 at 20:54 +0100, Antoine Zimmermann wrote:

> Not only there are volunteers to implement tools which allow literals as
> subjects, but there are already implementations out there.
> As an example, take Ivan Herman's OWL 2 RL reasoner [1]. You can put
> triples with literals as subject, and it will reason with them.
> Here in DERI, we also have prototypes processing generalised triples.

It is absolutely not a problem to add support in, e.g., Virtuoso as well. 1 day for the non-clustered version + 1 more day for the cluster. But it will naturally kill the scalability. Literals in subject position mean either outlining literals altogether or switching from bitmap indexes to plain ones, and at the same time they block important query rewriting.

We have seen triple store benchmark reports where a winner is up to 120 times faster than a loser and nevertheless all participants are in widespread use. With these reports in mind, I can make two forecasts.

1. RDF is so young that even an epic fail like this feature would not immediately throw an implementation out of the market.
2. It will throw it out later.

> Other reasoners are dealing with literals as subjects. RIF
> implementations are also able to parse triples with literals as
> subjects, as it is required by the spec.

...

> Some people mentioned scalability issues when we allow literals as
> subject. It might be detrimental to the scalability of query engines
> over big triple stores, but allowing literals as subjects is perfectly
> scalable when it comes to inference materialisation (see recent work on
> computing the inference closure of 100 billion triples [2]).
>

Reasoners should get data from some place and put it into the same or another place. There are three sorts of inputs: triple stores with real data, dumps of real data and synthetic benchmarks like LUBM. There are two sorts of outputs: triple stores for real data and papers with nice numbers.

Without adequate triple store infrastructure at both ends (or inside), any reasoner is simply unusable. [2] compares a reasoner that cannot answer queries after preparing the result with a store that works longer but is capable of doing something for its multiple clients immediately after completion of its work. If this is the best achieved and the most complete result then volunteers are still required.

> Considering this amount of usage and use cases, which is certainly meant
> to grow in the future, I believe that it is time to standardised
> generalised RDF.

http://en.wikipedia.org/wiki/Second-system_effect

There were "generalised RDFs" before simple RDF came on the scene. Minsky --- frames and slots. Winston --- knowledge graphs that are only a bit more complicated than RDF. The fate of these approaches is known: great impact on science, little use in industry.

> A possible compromise would be to define RDF 2 as /generalised RDF +
> named graphs + deprecate stuff/, and have a sublanguage (or profile)
> RDF# which forbids literals in subject and predicate positions, as well
> as bnodes in predicate position.

Breaking a small market into two incompatible parts is as bad as asking my mom what she would like to use on her netbook, ALSA or OSS. She doesn't know (me neither) and she doesn't want to choose which half of the sound applications will crash.

> Honestly, it's just about putting a W3C stamp on things that some people
> are already using and doing.

If people are living in love and happiness without a stamp on a paper, it does not mean living in sin ;) Similarly, people may use literals as subjects without asking others and without any stamp.

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

> [2] Jacopo Urbani, Spyros Kotoulas, Jason Maassen, Frank van Harmelen,
> and Henri Bal. "OWL reasoning with WebPIE: calculating the closure of
> 100 billion triples" in the proceedings of ESWC 2010.
Received on Wednesday, 7 July 2010 04:11:41 UTC