- From: Christophe Dupriez <christophe.dupriez@destin.be>
- Date: Fri, 16 Jul 2010 13:01:31 +0200
- To: SKOS <public-esw-thes@w3.org>
In the discussion about "validation" (including various KQIs: Key Quality Indicators, or exception listings), one aspect is very important for me as an implementor: computability... I see that SKOS, compared to ISO 25964 or zThes, is very extensible. But will it remain computable in PRACTICE, for the big thesauri (Agrovoc, Mesh including Substances...) we MUST manage?

To parse SKOS as RDFS, with its possibilities of (sub-)class and (sub-)property definitions, you need OWL artillery: Jena, Sesame/Elmo/AliBaba, the Manchester OWL API / Protege SKOSed, others? I have not tested everything, but I am still unaware of an OWL framework able to handle BIG thesauri linked with BIG information databases (with reasonable hardware and response time: my applications are used in medical emergencies).

As a (less, but still) flexible alternative, I see XSLT as a serialization tool to turn a SKOS file into an XML representation of its data. For instance, with my test XSLT that makes a nice presentation of a SKOS file (http://www.askosi.org/xcss/skosrdf2html.xslt), a serialization in HTML rather than XML, I noticed that it is easy to write a transformation for one RDF flavour (usage pattern) but not for all of them. XSLT itself is not very good for very big data files unless you can split the data into chunks (transform concept by concept). A specialized parser would do better.

My proposal: define "computability levels" for SKOS files (like the ones that exist for OWL):

1) linear: an XML serialization (ISO 25964, zThes, or a SKOS XSD to standardize) is possible in a linear way (by applying simple replacements based on easy pattern matching);

2) serializable but not linear: the whole SKOS file must be read into memory to access the data necessary for XML serialization. A generic XSLT program is able to do the transformation;

3) limited inference: a specialized XSLT program (adapted to the sub-classes and sub-properties defined in the SKOS file) is able to do an adequate and faithful serialization;
4) OWL Lite;

5) OWL DL;

6) OWL Full;

and to implement a tool to check the computability level of any given SKOS file.

My opinion is that SKOS is for humans who have to make efficient (Search) User Interfaces; OWL is for humans who have to model data in order to automate (subtle) processes.

Computability is IMHO an important issue for SKOS: when you restart an information server, you want it to be ready to serve in seconds, not hours. Java loads a faithful and complete AGROVOC XML serialization (all languages) into a memory structure appropriate for serving users in 30 seconds. Can we hope to do that if a reasoner has to infer relations from an unsorted bag of RDF definitions? Should a SKOS validation process (optionally) generate an XML serialization of the SKOS definitions for faster processing?

Please find here my proposal for an XML Schema (XSD) definition for SKOS serialization: http://www.askosi.org/ConceptScheme.xsd

A readable version is produced using an XSLT from the XS3P project: http://www.askosi.org/example/ConceptScheme.xml

This XSLT was very hard to find, but the effort was well rewarded: http://sourceforge.net/projects/xs3p/

If we could reach the same quality when displaying a SKOS file!

I would be very happy to hear your suggestions! Wishing you all a very nice day!

Christophe
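[As an illustration of the "transform concept by concept" idea above: for a SKOS file at the proposed "linear" level, where each skos:Concept appears as one self-contained element, a streaming SAX parser can process every concept as it is read and then discard it, so memory use stays flat regardless of thesaurus size. The following is a minimal Java sketch; the class name SkosStream, the sample data, and the printed format are invented for the example, not part of any SKOS tooling.]

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Streaming, concept-by-concept pass over SKOS RDF/XML. This only works for
// files written in the "one self-contained Concept element after another"
// pattern, i.e. the "linear" computability level proposed above.
public class SkosStream {
    static final String SKOS = "http://www.w3.org/2004/02/skos/core#";
    static final String RDF  = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Tiny sample file; the URIs and labels are invented for the example.
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf='" + RDF + "' xmlns:skos='" + SKOS + "'>"
      + "<skos:Concept rdf:about='http://example.org/c1'>"
      + "<skos:prefLabel xml:lang='en'>Emergency medicine</skos:prefLabel>"
      + "</skos:Concept>"
      + "<skos:Concept rdf:about='http://example.org/c2'>"
      + "<skos:prefLabel xml:lang='en'>Toxicology</skos:prefLabel>"
      + "</skos:Concept>"
      + "</rdf:RDF>";

    /** Parses the XML, printing each concept's prefLabel; returns the concept count. */
    static int run(String xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                String uri;           // URI of the concept currently being read
                StringBuilder label;  // prefLabel text being accumulated

                @Override public void startElement(String ns, String local,
                                                   String q, Attributes atts) {
                    if (SKOS.equals(ns) && "Concept".equals(local)) {
                        uri = atts.getValue(RDF, "about");
                    } else if (SKOS.equals(ns) && "prefLabel".equals(local)) {
                        label = new StringBuilder();
                    }
                }
                @Override public void characters(char[] ch, int start, int len) {
                    if (label != null) label.append(ch, start, len);
                }
                @Override public void endElement(String ns, String local, String q) {
                    if (SKOS.equals(ns) && "prefLabel".equals(local)) {
                        System.out.println(uri + " -> " + label);
                        label = null;
                    } else if (SKOS.equals(ns) && "Concept".equals(local)) {
                        count[0]++;   // concept fully processed; its state can be dropped
                        uri = null;
                    }
                }
            });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("concepts=" + run(SAMPLE));
    }
}
```

[Files at levels 2 and 3, where a concept's labels or relations may appear anywhere in the file or where sub-properties must first be resolved, break this single-pass approach: that is exactly why the computability level of a SKOS file matters to an implementor.]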
Received on Friday, 16 July 2010 11:02:01 UTC