Re: [PORT] SKOS Core validation service

Hi all,

I've changed the validation service to use in-memory models, and it is now a bit faster.

The GCL v2.1 (273 KB, 4986 statements, [1]) completes the basic integrity test in ~4 seconds and the 
full thesaurus compatibility test in ~16 seconds.

The basic GEMET conceptual structure (~2 MB, 38352 statements, [2]) completes the basic integrity 
test in ~370 seconds. The time-consuming step here is preparing the test model, which involves using 
a reasoner to perform some inference. I've used the Jena GenericRuleReasoner, which is an 
implementation of the Rete algorithm. I wonder whether a backward-chaining implementation would be 
more efficient? Note that the full thesaurus compatibility test case has some queries that are 
relatively expensive to execute, and may therefore take a *really* long time on larger datasets.
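
To make the forward vs. backward chaining question a bit more concrete, here is a minimal 
Python sketch. It is not the service's Jena code: the toy data, the single transitivity rule 
and the function names are all assumptions made for illustration. The forward version 
materialises every inferred fact before any query runs, whereas the goal-directed version 
(in the spirit of backward chaining) only does the work needed to answer one question.

```python
from collections import defaultdict

# Hypothetical toy data: (narrower concept, broader concept) pairs.
BROADER = [("c1", "c2"), ("c2", "c3"), ("c3", "c4")]


def forward_closure(pairs):
    """Forward chaining: apply the transitivity rule until no new facts appear."""
    facts = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(facts):
            for c, d in list(facts):
                # Rule (assumed): broader(a, b) and broader(b, d) => broader(a, d)
                if b == c and (a, d) not in facts:
                    facts.add((a, d))
                    changed = True
    return facts


def backward_holds(pairs, start, goal):
    """Goal-directed check: is `goal` reachable from `start` via broader links?"""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for parent in graph[node]:
            if parent == goal:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


if __name__ == "__main__":
    print(len(forward_closure(BROADER)), "facts after forward chaining")
    print(backward_holds(BROADER, "c1", "c4"))
```

Whether the goal-directed style actually wins depends on how much of the inferred model the 
test queries touch. If they end up visiting most of it, materialising up front may still be 
the cheaper option.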

Cheers,

Al.

[1] http://isegserv.itd.rl.ac.uk/skos/gcl/gcl2.1.rdf
[2] http://isegserv.itd.rl.ac.uk/skos/gemet/gemet_core.rdf

P.S. For the attention of the GEMET folks: I found the following errors in [2]:

Test :: Test A.2. (Associative/Hierarchical Circularity)

This test indicates that a concept has both a hierarchical and an associative relationship with 
another concept.
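
As a rough illustration of what this test looks for, here is a small Python sketch. It is not 
the query stored at the schemarama URL: it works on plain (subject, predicate, object) tuples, 
checks direct links only (no inference), and the example.org URIs and the function name are 
hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BROADER = SKOS + "broader"
NARROWER = SKOS + "narrower"
RELATED = SKOS + "related"


def find_a2_conflicts(triples):
    """Return (narrower, broader) pairs that are also linked by skos:related."""
    hierarchical = set()  # ordered pairs: (narrower concept, broader concept)
    related = set()       # unordered pairs, since skos:related is symmetric
    for s, p, o in triples:
        if p == BROADER:
            hierarchical.add((s, o))
        elif p == NARROWER:
            hierarchical.add((o, s))
        elif p == RELATED:
            related.add(frozenset((s, o)))
    return sorted(pair for pair in hierarchical if frozenset(pair) in related)


if __name__ == "__main__":
    a = "http://example.org/concept/a"  # hypothetical URIs
    b = "http://example.org/concept/b"
    sample = [(a, BROADER, b), (b, RELATED, a)]
    for narrower, broader in find_a2_conflicts(sample):
        print("[ERROR] <%s> is both narrower than, and related to, <%s>" % (narrower, broader))
```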

(Loading query from http://www.w3.org/2004/02/skos/schemarama/test/A2 ... done in 83 ms.)

(Executing test query ... done in 726 ms.)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/8585> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/8582> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/186> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/232> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/12131> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/3400> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/3273> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/3241> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/3591> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/3586> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/6540> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/1162> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/11923> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/2884> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/2911> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/6776> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/7738> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/4183> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/4641> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/78> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/7856> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/2015> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/562> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/560> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/312> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/311> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/4917> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/6033> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/878> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/9043> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/6908> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/2015> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/5626> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/5654> (resource)

[ERROR] Concept [1] is both narrower than, and related to, another concept [2].

    1. <http://www.eionet.eu.int/gemet/concept/2670> (resource)
    2. <http://www.eionet.eu.int/gemet/concept/6446> (resource)

Cheers,

Al.

Alistair Miles wrote:
> 
> Hi all,
> 
> Just to say that I've found that the slowness of the validation service 
> is caused by the time taken to load statements into a Jena RDBModel and 
> remove all the statements after the testing is done. ... investigating a 
> fix ...
> 
> Al.
> 
> 
> 
> Alistair Miles wrote:
>>
>> Trying the service with the GCL version 2.1 (~270 kb) it is very slow, 
>> I'll have to look into that and see if I can make it better.
>>
>> Cheers,
>>
>> Al.
>>
>>
>> Alistair Miles wrote:
>>>
>>>> If someone has a whole thesaurus converted, 1000s of concepts etc., 
>>>> is it going to bring your server to its knees?
>>>
>>> The service does proper database connection pooling, and drops all 
>>> the stored data after each test run, so it *shouldn't* kill the 
>>> service :) Although large files will take time to upload and query of 
>>> course. I actually haven't tried any large files, I'd better do that 
>>> now ...
>>>
>>> If anyone does have problems using the service, or if the service 
>>> does go down, please let me know.
>>>
>>> Cheers,
>>>
>>> Al.
>>>
>>>>
>>>> Dan
>>>>> Cheers,
>>>>>
>>>>> Al.
>>>>>
>>>>> [1] 
>>>>> http://isegserv.itd.rl.ac.uk/cvs-public/~checkout~/skos/drafts/integrity.html?rev=1.7 
>>>>>
>>>>>
>>>>> -- 
>>>>> Alistair Miles
>>>>> Research Associate
>>>>> CCLRC - Rutherford Appleton Laboratory
>>>>> Building R1 Room 1.60
>>>>> Fermi Avenue
>>>>> Chilton
>>>>> Didcot
>>>>> Oxfordshire OX11 0QX
>>>>> United Kingdom
>>>>> Email: a.j.miles@rl.ac.uk
>>>>> Tel: +44 (0)1235 445440
>>>>
>>>
>>
> 

-- 
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440

Received on Thursday, 2 March 2006 14:51:33 UTC