
Re: SKOS profiles: Simple vs Structured

From: Osma Suominen <osma.suominen@helsinki.fi>
Date: Wed, 1 Apr 2020 12:28:58 +0300
To: public-esw-thes@w3.org
Message-ID: <dff8e75a-28a8-dfab-aa0e-74211f5448ec@helsinki.fi>
Thanks for the suggestion Vladimir.

There are also other profile-like aspects of SKOS that could be 
formalized. For example:

* Should the data set include both skos:broader and skos:narrower 
relations, or just one direction (probably broader)?
* The same question applies to hasTopConcept vs. topConceptOf.
* Should the data set include all the transitive relationships 
(skos:broaderTransitive, skos:narrowerTransitive) or not?

Skosify [1] has settings that can be used to enforce/change some of 
these choices globally for the whole vocabulary. But it would be helpful 
to have some sort of shared profiles or guidelines for these.
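
As a rough illustration of what such a shared profile would pin down, here 
is a minimal Python sketch that normalizes the hierarchy of a vocabulary 
held as plain (subject, predicate, object) tuples. It is not how Skosify 
implements its settings; the function name normalize_hierarchy and its two 
flags are made up for this example, and prefixed names are used as plain 
strings.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Pairs of mutually inverse SKOS properties (per the SKOS Reference).
INVERSES = {
    "skos:broader": "skos:narrower",
    "skos:narrower": "skos:broader",
    "skos:hasTopConcept": "skos:topConceptOf",
    "skos:topConceptOf": "skos:hasTopConcept",
}


def normalize_hierarchy(triples: Set[Triple],
                        both_directions: bool = True,
                        transitive: bool = False) -> Set[Triple]:
    """Hypothetical helper: apply two profile choices to a set of triples."""
    result = set(triples)
    if both_directions:
        # Materialize the inverse of every hierarchical/top-concept statement.
        for s, p, o in triples:
            if p in INVERSES:
                result.add((o, INVERSES[p], s))
    if transitive:
        # Collect direct "broader" edges, including those stated as narrower.
        broader = {(s, o) for s, p, o in result if p == "skos:broader"}
        broader |= {(o, s) for s, p, o in result if p == "skos:narrower"}
        closure = set(broader)
        changed = True
        while changed:
            # Naive fixed-point iteration; fine for a sketch, slow on big data.
            new = {(a, d) for a, b in closure for c, d in closure if b == c}
            changed = not new <= closure
            closure |= new
        for a, b in closure:
            result.add((a, "skos:broaderTransitive", b))
            if both_directions:
                result.add((b, "skos:narrowerTransitive", a))
    return result
```

Calling normalize_hierarchy(data, both_directions=False) leaves the data as 
published, while both_directions=True, transitive=True corresponds to the 
"everything materialized" end of the choices listed above.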

-Osma

[1] https://github.com/NatLibFi/Skosify/

Vladimir Alexiev wrote on 1.4.2020 at 11.46:
> Hi!
> 
> SKOS is provided in one of two formats (profiles):
> 
>   * Simple (SKOS)
>   * Structured (SKOSXL <https://www.w3.org/TR/skos-reference/#xl> +
>     Advanced Documentation Features
>     <https://www.w3.org/TR/skos-primer/#secadvanceddocumentation> with
>     metadata/provenance props). "Documentation" means notes,
>     definitions, etc.
> 
> It's a common practice to publish "structured" with redundancy, to cater 
> to both "simple" consumers and "structured" consumers:
> 
>   * SKOSXL recommends that structured labels be published redundantly:
>     as plain SKOS labels and as skosxl:Label. Dumbing-Down to SKOS
>     Lexical Labels <https://www.w3.org/TR/skos-reference/#L780> defines
>     how to provide structured and plain labels together.
>   * Not sure about notes, as neither SKOS nor SKOSXL defines a class
>     Note (Getty defines gvp:Note), nor separate properties. So
>     skos:definition and friends would carry both a string and a
>     resource, which will complicate consumption.
> 
> 
> Currently a SKOS dataset or API does not have a way to declare its profile.
> https://github.com/NatLibFi/Skosmos/issues/477 describes some troubles 
> related to this:
> 
>   * Skosmos uses "duplicate label matching logic" to display the label
>     below just once, and assumes label redundancy.
> 
> <concept> skos:prefLabel "foo"@en;
>   skosxl:prefLabel [skosxl:literalForm "foo"@en]
> 
>   * However, there is no similar logic for notes, so it would display
>     notes in duplicate.
> 
> 
> To avoid complicated duplicate matching logic at the consumer, I think 
> we should define two SKOS profiles: simple vs structured.
> 
>   * Should "structured" subsume "simple", i.e. redundantly provide the
>     same strings as simple labels/notes? That will simplify life for
>     data providers.
>   * Do we need the two aspects separately: structured labels vs
>     structured notes?
>   * The profile should be communicated:
>       o In HTTP request: client should be able to request "simple" or
>         "structured"
>       o In HTTP response
>       o In the description of ConceptScheme and VOID/DCAT Dataset
>         (property dct:conformsTo)
>   * ConceptSchemes should provide completeness guarantees: if one label
>     or note is structured, then all labels (respectively, notes) are
>     available as structured. I think these SPARQL tests should be used:
>       o Some SKOSXL label exists:
> 
> <scheme> 
> ^skos:inScheme/(skosxl:prefLabel|skosxl:altLabel|skosxl:hiddenLabel) ?label
> 
>       o Some skos:definition or skos:scopeNote is non-literal. I
>         exclude: skos:changeNote, skos:historyNote, skos:editorialNote
>         because these may be structured without the "business payload"
>         notes being structured; skos:example because conceivably it can
>         point to a resource; skos:note because that's a super-prop of
>         excluded props (but many people use it directly, so I'm not
>         sure):
> 
> <scheme> ^skos:inScheme/(skos:definition|skos:scopeNote) ?definition
> filter (!isLiteral(?definition))
> 
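
For consumers without a SPARQL endpoint, the two detection tests and the 
completeness guarantee for labels could be approximated along these lines. 
This is only a sketch under stated assumptions: triples are plain Python 
tuples, literals are wrapped in a made-up Lit type so they can be told apart 
from resources, and all function names are hypothetical.

```python
from typing import NamedTuple, Set, Tuple, Union


class Lit(NamedTuple):
    """Illustrative literal: text plus optional language tag."""
    text: str
    lang: str = ""


Node = Union[str, Lit]          # IRIs and blank nodes are plain strings
Triple = Tuple[str, str, Node]

# Plain label property -> its SKOS-XL counterpart.
XL = {
    "skos:prefLabel": "skosxl:prefLabel",
    "skos:altLabel": "skosxl:altLabel",
    "skos:hiddenLabel": "skosxl:hiddenLabel",
}
NOTE_PROPS = ("skos:definition", "skos:scopeNote")


def concepts_in(triples: Set[Triple], scheme: str) -> Set[str]:
    return {s for s, p, o in triples if p == "skos:inScheme" and o == scheme}


def has_structured_labels(triples: Set[Triple], scheme: str) -> bool:
    """Test 1: some concept in the scheme has a SKOS-XL label."""
    cs = concepts_in(triples, scheme)
    return any(s in cs and p in XL.values() for s, p, o in triples)


def has_structured_notes(triples: Set[Triple], scheme: str) -> bool:
    """Test 2: some definition or scope note is a resource, not a literal."""
    cs = concepts_in(triples, scheme)
    return any(s in cs and p in NOTE_PROPS and not isinstance(o, Lit)
               for s, p, o in triples)


def incomplete_labels(triples: Set[Triple], scheme: str) -> Set[Triple]:
    """Plain labels lacking a SKOS-XL label of the same kind and form."""
    cs = concepts_in(triples, scheme)
    literal_form = {s: o for s, p, o in triples if p == "skosxl:literalForm"}
    structured = {(s, p, literal_form.get(o))
                  for s, p, o in triples if p in XL.values()}
    return {(s, p, o) for s, p, o in triples
            if s in cs and p in XL and (s, XL[p], o) not in structured}
```

Under the proposed guarantee, a scheme for which has_structured_labels 
returns True should yield an empty set from incomplete_labels; anything it 
returns is a label published only in "simple" form.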
> 
> Assuming subsumption/redundancy (that "structured" includes "simple") 
> and that the client can use SPARQL, "duplicate matching" can be done 
> easily in SPARQL, e.g. something like this:
> 
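> prefix skos: <http://www.w3.org/2004/02/skos/core#>
> prefix skosxl: <http://www.w3.org/2008/05/skos-xl#>
> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> 
> select ?lab ?prop ?propLabel ?metadata {
>    <concept> skos:prefLabel ?lab.
>    optional {
>      <concept> skosxl:prefLabel ?label.
>      # bind the literal form to ?lab so it matches the plain label
>      ?label skosxl:literalForm ?lab; ?prop ?metadata
>      filter (?prop != skosxl:literalForm)
>      # need lang preferencing here!
>      optional {?prop (rdfs:label|skos:prefLabel) ?propLabel}
>    }
> }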
> 
> 
> prefix skos: <http://www.w3.org/2004/02/skos/core#>
> prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> 
> select ?def ?prop ?propLabel ?metadata {
>    <concept> skos:definition ?def.
>    optional {
>      <concept> skos:definition ?definition.
>      ?definition rdf:value ?def; ?prop ?metadata
>      filter (?prop != rdf:value)
>      # need lang preferencing here!
>      optional {?prop (rdfs:label|skos:prefLabel) ?propLabel}
>    }
> }
> 
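
The same "duplicate matching" idea can be sketched on the client side for 
consumers that only have the raw triples. The code below is an assumption- 
laden illustration, not Skosmos logic: triples are Python tuples, Lit is a 
made-up literal wrapper, and merge_values is a hypothetical helper that 
works for labels (value property skosxl:literalForm) as well as for notes 
(value property rdf:value).

```python
from typing import Dict, List, NamedTuple, Set, Tuple, Union


class Lit(NamedTuple):
    """Illustrative literal: text plus optional language tag."""
    text: str
    lang: str = ""


Node = Union[str, Lit]          # IRIs and blank nodes are plain strings
Triple = Tuple[str, str, Node]


def merge_values(triples: Set[Triple], concept: str, plain_prop: str,
                 structured_prop: str,
                 value_prop: str) -> Dict[Lit, List[Tuple[str, Node]]]:
    """Return each plain string once, with metadata from its structured twin."""
    by_subject: Dict[str, List[Tuple[str, Node]]] = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    own = by_subject.get(concept, [])

    # Start from the plain literals; metadata stays empty if nothing matches.
    merged: Dict[Lit, List[Tuple[str, Node]]] = {
        o: [] for p, o in own if p == plain_prop and isinstance(o, Lit)
    }
    for p, node in own:
        # Only resource-valued statements can carry metadata.
        if p != structured_prop or isinstance(node, Lit):
            continue
        props = by_subject.get(node, [])
        for q, value in props:
            if q == value_prop and value in merged:
                # Attach every other property of the structured node.
                merged[value].extend(
                    (r, o) for r, o in props if r != value_prop)
    return merged
```

For labels this would be called as merge_values(data, concept, 
"skos:prefLabel", "skosxl:prefLabel", "skosxl:literalForm"), and for 
definitions as merge_values(data, concept, "skos:definition", 
"skos:definition", "rdf:value"). Like the queries above, it relies on the 
structured form repeating exactly the same string as the plain one.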
> 
> Are there any takers to formalize SKOS profiles?
> -- 
> Vladimir Alexiev, PhD, PMP
> Chief Data Architect
> Sirma AI, trading as Ontotext: https://www.ontotext.com 
> <https://www.ontotext.com/>, LinkedIn 
> <https://www.linkedin.com/company-beta/208070>,Twitter 
> <https://twitter.com/ontotext>,Rate GraphDB 
> <http://www.capterra.com/database-management-software/reviews/157533/Graph%20DB/Ontotext/new>
> Email: vladimir.alexiev@ontotext.com 
> <mailto:vladimir.alexiev@ontotext.com>, skype:valexiev1
> Mobile: +359 888 568 132, SMS: 359888568132@sms.mtel.net 
> <mailto:359888568132@sms.mtel.net>
> Calendar: 
> https://www.google.com/calendar/embed?src=vladimir.alexiev@ontotext.com
> Publications and CV: https://github.com/VladimirAlexiev/my

-- 
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi
Received on Wednesday, 1 April 2020 09:29:16 UTC
