Re: SKOS tools? Discussion about Protege and Large scale terminologies...

Hi all,

From my point of view, the best option is to use the Jena persistent model.
Implementing the SKOS Core API on top of Jena is very straightforward, and
the Jena persistent model can handle large-scale vocabularies
using relational databases, so you can avoid implementing your own SQL
layer.

I'm using my own SKOS Core API implementation and a persistent model
based on Jena, and performance is quite impressive. I still need to run
additional tests on large-scale vocabularies like LCSH, but my current
implementation for the NBII thesaurus performs very well. In addition,
I combine the Jena persistent model with Lucene to improve
search performance and functionality. This combination also offers a
very good way to retrieve concepts using Lucene and then apply some kind of
inference using the Jena SPARQL engine.
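
To make the "retrieve, then infer" idea concrete, here is a minimal,
dependency-free sketch. It is not Jena or Lucene code and not my actual
implementation: a plain map stands in for the Lucene label index, and a
graph walk over broader links stands in for the kind of transitive query
one might run through the SPARQL engine. The class name, method names and
concept ids (c1, c2, c3) are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SkosSearchSketch {
    // Stand-in for the Lucene index: lower-cased label -> concept ids.
    private final Map<String, Set<String>> labelIndex = new HashMap<>();
    // Stand-in for skos:broader statements in the model: concept id -> broader ids.
    private final Map<String, Set<String>> broader = new HashMap<>();

    void addLabel(String conceptId, String label) {
        labelIndex.computeIfAbsent(label.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                  .add(conceptId);
    }

    void addBroader(String conceptId, String broaderId) {
        broader.computeIfAbsent(conceptId, k -> new TreeSet<>()).add(broaderId);
    }

    // Step 1 (retrieval): case-insensitive exact match on a label.
    Set<String> search(String label) {
        return labelIndex.getOrDefault(label.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    // Step 2 (inference): every concept reachable through one or more broader links.
    Set<String> broaderTransitive(String conceptId) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo =
                new ArrayDeque<>(broader.getOrDefault(conceptId, Collections.emptySet()));
        while (!todo.isEmpty()) {
            String next = todo.pop();
            if (seen.add(next)) { // skip concepts already visited (guards against cycles)
                todo.addAll(broader.getOrDefault(next, Collections.emptySet()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        SkosSearchSketch sketch = new SkosSearchSketch();
        sketch.addLabel("c1", "Abscisic acid"); // toy data, not taken from a real thesaurus
        sketch.addBroader("c1", "c2");
        sketch.addBroader("c2", "c3");
        for (String hit : sketch.search("abscisic acid")) {
            System.out.println(hit + " has broader concepts " + sketch.broaderTransitive(hit));
        }
    }
}
```

In a real setup the first step would be a Lucene query over indexed labels
and the second a query against the persistent model; the point here is only
the division of labour between the two.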

best

jose

On Thu, Jun 18, 2009 at 9:15 AM, Christophe
Dupriez<christophe.dupriez@destin.be> wrote:
> Dear Mr. Cox,
>
> Thanks for the critical point of view: it is useful to underline what needs
> to be done.
>
> I will be at the Protege workshop in Amsterdam next week:
> http://protege.stanford.edu/conference/2009/workshops.html
> "Challenges of Developing and Deploying Large Scale Biomedical
> Terminologies"
> We could not be closer to SKOS-related questions...
> May I use your conclusions to find out the position of
> the Protege community toward SKOS?
>
> Personally, I developed a Java layer to access SQL, XML or CSV sources through
> a SKOS conceptual model (API).
> This means that data is edited in an existing application (which feeds an
> SQL database, for instance) and that a "SKOS view" of this data is available
> to the same or other applications.
> An example of this is the integration within DSpace which can be seen at:
> http://www.windmusic.org/dspace
> This site is in a testing state. Official launch is in September.
> A DSpace collection is used to manage keywords, another one for authors, one
> for orchestras, etc: all used as SKOS concepts within the (modified) DSpace
> application.
>
> I was thinking that Protege could be used for SKOS/RDF file editing (when no
> existing application provides an SQL database): your negative experience is
> bad news.
>
> Meanwhile, there is also the XML approach (no RDF): the basic SKOS
> conceptual model (no local extension) can be represented as POJOs and
> marshalled/unmarshalled to XML using JAXP.
> This is very fast. The whole Agrovoc thesaurus (28 954 concepts with terms
> in more than 20 languages) can be translated from SKOS/RDF to XML.
> From there, it is loaded into memory in less than 7 seconds; in memory, it is
> also very compact, as it is represented as Java objects and not RDF
> relations.
> (little extract at the end of this message)
>
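As an illustration of the XML approach described above, here is a minimal
sketch that reads a fragment shaped like the Agrovoc extract quoted at the
end of this message into plain Java objects. It is not Christophe's code:
the class and method names (SkosXmlSketch, Concept, load) are assumptions,
it only handles prefLabel and broader, and it uses the JDK's built-in JAXP
DOM parser rather than any particular binding library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SkosXmlSketch {
    /** Plain Java object for one concept (hypothetical shape). */
    static class Concept {
        final String id;
        final Map<String, String> prefLabels = new LinkedHashMap<>(); // language -> label
        final List<String> broader = new ArrayList<>();               // ids of broader concepts
        Concept(String id) { this.id = id; }
    }

    /** Parses a conceptScheme document and returns its concepts keyed by the "about" id. */
    static Map<String, Concept> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, Concept> concepts = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("concept");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                Concept concept = new Concept(element.getAttribute("about"));
                NodeList labels = element.getElementsByTagName("prefLabel");
                for (int j = 0; j < labels.getLength(); j++) {
                    Element label = (Element) labels.item(j);
                    concept.prefLabels.put(label.getAttribute("lang"), label.getTextContent());
                }
                NodeList parents = element.getElementsByTagName("broader");
                for (int j = 0; j < parents.getLength(); j++) {
                    concept.broader.add(parents.item(j).getTextContent());
                }
                concepts.put(concept.id, concept);
            }
            return concepts;
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers do not have to declare them.
            throw new IllegalArgumentException("Cannot parse concept scheme XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down version of the first concept in the quoted extract.
        String xml = "<conceptScheme about=\"c\">"
                + "<concept about=\"3\">"
                + "<prefLabel lang=\"en\">ABA</prefLabel>"
                + "<prefLabel lang=\"fr\">ABA</prefLabel>"
                + "<broader>3397</broader><broader>32543</broader>"
                + "</concept></conceptScheme>";
        Concept concept = load(xml).get("3");
        System.out.println(concept.prefLabels.get("en") + " has broader " + concept.broader);
    }
}
```

Keeping broader links as plain id strings, as the XML does, is part of what
keeps the in-memory form compact; resolving them to object references would
be a second pass over the map.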
> This is the approach I took because I needed to handle big thesauri for
> precise uses (multilingualism; information retrieval: search expansion,
> faceted browsing, etc.).
> SKOS is also used to develop smaller thesauri where researchers may want to
> create new relations: this is not among my main use cases.
>
> * I am a long-time user of Protege for modelling: for that, it is really
> fine.
> * I also used it with OWL for a dictionary of literary devices (20 thousand
> entries): this was far too heavy and slow.
> It is now in JSPWiki: http://www.destin.be/DIRE/ : In the future, I will use
> SKOS for the backbones (links) and Wiki pages for the examples and other
> content.
> * I was hoping that it would work well for medium-scale thesauri: your
> message dampens that hope much more than I expected.
>
> That said, I remain convinced that SKOS/RDF must be supported, and I
> will implement RDF marshalling/unmarshalling when requested by one of my
> projects.
> This also means that a SKOS/RDF file could be created starting from any
> existing SQL database.
>
> Have a nice day!
>
> Christophe Dupriez
>
> Small extract of Agrovoc in XML (SKOS conceptual model):
> <?xml version="1.0" encoding="utf-8"?>
> <conceptScheme about="c">
> <title>AGROVOC</title>
> <title lang="en">FAO Multilingual Thesaurus.</title>
> <namespace>http://www.fao.org/aims/aos/agrovoc#c_</namespace>
> <editorialNote>Free to all for non commercial use.</editorialNote>
>
> <concept about="3">
> <prefLabel lang="en">ABA</prefLabel>
> <prefLabel lang="fr">ABA</prefLabel>
> <prefLabel lang="es">ABA</prefLabel>
> <prefLabel lang="ar">آبا</prefLabel>
> <prefLabel lang="zh">脱落酸</prefLabel>
> <prefLabel lang="pt">Aba</prefLabel>
> <prefLabel lang="th">เอบีเอ</prefLabel>
> <prefLabel lang="ja">アブシジン酸</prefLabel>
> <prefLabel lang="sk">ABA</prefLabel>
> <prefLabel lang="de">ABA</prefLabel>
> <prefLabel lang="hu">Aba</prefLabel>
> <prefLabel lang="pl">Aba</prefLabel>
> <prefLabel lang="fa">آ.بی.آ</prefLabel>
> <prefLabel lang="it">ABA</prefLabel>
> <prefLabel lang="hi">एo बीo एo</prefLabel>
> <altLabel lang="cs">kyselina abscisová</altLabel>
> <altLabel lang="de">ABSCISINSAEURE</altLabel>
> <altLabel lang="en">Abscisic acid</altLabel>
> <altLabel lang="es">Ácido abscísico</altLabel>
> <altLabel lang="fa">آبسيسيك اسيد</altLabel>
> <altLabel lang="fr">Acide abscissique</altLabel>
> <altLabel lang="hi">एबसिसिक अम्ल</altLabel>
> <altLabel lang="hu">abszcizinsav</altLabel>
> <altLabel lang="it">Acido abscissico</altLabel>
> <altLabel lang="ja">アブシジン酸</altLabel>
> <altLabel lang="pl">Kwas abscysynowy</altLabel>
> <altLabel lang="pt">Ácido abscísico</altLabel>
> <altLabel lang="sk">kyselina abscisová</altLabel>
> <altLabel lang="th">กรดแอบไซสิค</altLabel>
> <altLabel lang="zh">ABA</altLabel>
> <editorialNote>The last modification for this concept was for the term in
> CS</editorialNote>
> <broader>3397</broader>
> <broader>32543</broader>
> </concept>
>
> <concept about="4">
> <prefLabel lang="en">Abaca</prefLabel>
> ...
> </concept>
>
> <concept about="9001020">
> <prefLabel lang="en">stakeholders</prefLabel>
> <prefLabel lang="fr">partie intéressée</prefLabel>
> <prefLabel lang="es">agentes interesados</prefLabel>
> <prefLabel lang="zh">利益相关者</prefLabel>
> <prefLabel lang="sk">podielnici</prefLabel>
> <altLabel lang="cs">zainteresovaná strana</altLabel>
> <altLabel lang="en">Stakeholder</altLabel>
> <altLabel lang="es">Parte interesada</altLabel>
> <altLabel lang="fr">Parties prenantes</altLabel>
> <altLabel lang="hi">टेक धारी</altLabel>
> <altLabel lang="it">Parti interessate</altLabel>
> <altLabel lang="sk">podielnici</altLabel>
> <editorialNote>The last modification for this concept was for the term in
> CS</editorialNote>
> <broader>50227</broader>
> <related>37968</related>
> </concept>
> </conceptScheme>
>
> Simon.Cox@csiro.au wrote:
>>
>> Dear SKOS list -
>> The GeoSciML project has been evaluating SKOS to implement its 'controlled
>> concept' model (see
>> http://www.geosciml.org/geosciml/2.0/doc/GeoSciML/Vocabulary/package-summary.html
>> for the UML representation, and you'll see how SKOS is a close match!). My
>> colleague Steve Richard is the lead editor, on behalf of a consortium
>> including many of the world's leading geological surveys*, for around 25
>> vocabularies related to geology. This is a significant effort in the natural
>> sciences.
>> Being a happy old XML hacker I can tolerate RDF/XML and a text editor for
>> prototyping. But this obviously ain't acceptable for most users, doesn't
>> scale to production work, and fails to provide the consistency checking and
>> visualization that a proper editor would.
>> We are mighty frustrated (and getting worse!) at the state of tool
>> support. In particular, Protégé, even with the SKOS plugin, appears to be
>> fatally flawed. I've used it from time to time for _viewing_ a concept
>> scheme, but have never been able to successfully round trip through
>> export/import, so it doesn't work as an editor. Steve is now finding further
>> flaws - see below - e.g. labels implemented as objectProperty, no literal
>> support or language attributes.
>> This is all very disappointing. What tools are people using
>> successfully for development and management of SKOS instances?
>>
>> Simon Cox
>>
>> (*) See http://onegeology.org/technical_progress/geosciml.html and
>> http://onegeology.org/participants/graphical_map.html
>> ______
>> Simon.Cox@csiro.au  CSIRO Exploration & Mining
>> 26 Dick Perry Avenue, Kensington WA 6151 PO Box 1130, Bentley WA 6102
>>  AUSTRALIA
>> T: +61 (0)8 6436 8639  Cell: +61 (0) 403 302 672
>> Polycom PVX: 130.116.146.28
>> <http://www.csiro.au>
>>
>> ABN: 41 687 119 230
>>
>> -----Original Message-----
>> From: stephen richard [mailto:steve.richard@azgs.az.gov] Sent: Thursday,
>> 18 June 2009 9:31 AM
>> To: Cox, Simon (E&M, Kensington)
>> Subject: Re: [Auscope-geosciml] Simple lithology vocabulary in MMI
>> repository
>>
>> Right now I'm mostly frustrated-
>> the new version of Protege (v4, released today) doesn't preserve language
>> attributes on prefLabel elements, and the SKOS tool models prefLabel as an
>> ObjectProperty, so you can't populate it with a literal, and it doesn't
>> appear to be consistent with the current SKOS spec.
>> What I started out to do was clean up the hierarchy in standardLithology,
>> which is a mess. The owl/SKOS tools looked like a possible way to do it.
>> Instead I've spun my wheels for 3 days. The idea is simply to be able to
>> round trip between GeologicVocabulary and some brand of SKOS, for which
>> there is a functional tool, build and fix hierarchies in SKOS, and convert
>> back to GeologicVocabulary to update in the BRGM repository. Meanwhile there
>> are the possibilities of vocabulary services that could assist with document
>> validation and better yet query resolution with hierarchical properties....
>>
>> What's AuScope using for SKOS tools?
>>
>> steve
>>
>> Simon.Cox@csiro.au wrote:
>>
>>>
>>> Steve
>>> Good hunting.
>>> A few comments and a bit of an update about where the AuScope
>>> vocabs/vocab-server work is at:
>>>
>>>
>>> i. Terms and Labels -
>>> skos:prefLabel, skos:altLabel, skos:hiddenLabel and skos:notation can and
>>> should be used to support the assignment of multi-lingual terms, synonyms,
>>> misspellings (!) and symbols to concepts, regardless of the encoding (OWL,
>>> SKOS, other RDF languages). The semantics of these are clear and relevant to
>>> our needs, and the rdfs:domain of all of these is unrestricted so they can
>>> be applied to any rdf resource.
>>>
>>>
>>
>> ...
>>
>> --
>> Stephen M. Richard
>> Section Chief, Geoinformatics
>> Arizona Geological Survey
>> 416 W. Congress St., #100
>> Tucson, Arizona, 85701 USA
>>
>> Phone: Office: (520) 209-4127
>> Reception: (520) 770-3500
>> FAX: (520) 770-3505
>>
>> email: steve.richard@azgs.az.gov
>>
>>
>>
>>
>>
>
>



-- 
José Ramón Pérez Agüera

Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid

Received on Monday, 22 June 2009 08:57:10 UTC