- From: Andy Seaborne <andy.seaborne@topquadrant.com>
- Date: Mon, 12 Sep 2016 11:07:56 +0100
- To: public-data-shapes-wg@w3.org
On 12/09/16 00:30, Holger Knublauch wrote:
> Taking this and Andy's input into consideration, maybe sh:langShape is
> an overkill and all we really need is a new parameter such as
> sh:languageIn which takes a node and, if it has a language tag, verifies
> that it matches one of the provided languages following the SPARQL
> langMatches semantics. For example:
>
> ex:MyShape
>     a sh:Shape ;
>     sh:property [
>         sh:predicate skos:prefLabel ;
>         sh:or ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString
> ] ) ;
>         sh:langMatches ( "en" "fr" "de" ) .

A note: this is a slightly different operation from sparql:langMatches,
which takes a language tag and a language range, not a literal and a
language range. Some people prefer that local names are not reused to
mean slightly different things where possible.

>     ] .
>
> langMatches could be for just a single language, but having a list is
> shorter for this (apparently) common case in multi-lingual countries
> such as Belgium. I didn't know the RFC supports wildcards - this should
> hopefully flexible enough to cover all given use cases, but others may
> need to confirm.
>
> Regards,
> Holger
>
> PS: Andy, I prefer sh:datatype rdf:langString because it would be one
> thing less to check (by form builders etc), and furthermore I believe
> the semantics of sh:langMatches needs to be that it only does something
> if the literal really has a language tag. Otherwise it would be harder
> to express mixed cases of either string or langString (which I believe
> is quite common).

Consider

    sh:property [
        sh:predicate skos:prefLabel ;
        sh:langMatches ( "en" "fr" "de" )
    ] .

with data:

    <uri> skos:prefLabel 123 .

This is a violation if sh:langMatches requires a language tag, but it
passes if sh:langMatches only triggers when a language tag is present.
I find the latter a strange reading; it is not the natural
interpretation of the shape.
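To make the difference between the two readings concrete, here is a minimal Python sketch. It is an illustration only, not part of any SHACL draft or implementation: the function names are hypothetical, a value is reduced to its language tag (None when it has none, as for the integer 123), and matching is assumed to follow the basic filtering scheme of RFC 4647 that SPARQL langMatches uses.

```python
from typing import Optional, Sequence


def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic filtering in the style of RFC 4647 (case-insensitive prefix match)."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        # The wildcard matches any non-empty language tag.
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


def conforms_strict(lang: Optional[str], ranges: Sequence[str]) -> bool:
    """Reading 1: the value must carry a language tag that matches a range."""
    return lang is not None and any(lang_matches(lang, r) for r in ranges)


def conforms_lenient(lang: Optional[str], ranges: Sequence[str]) -> bool:
    """Reading 2: the check only applies when a language tag is present."""
    return lang is None or any(lang_matches(lang, r) for r in ranges)


ranges = ["en", "fr", "de"]

# The integer 123 has no language tag.
no_tag_strict = conforms_strict(None, ranges)    # violation under reading 1
no_tag_lenient = conforms_lenient(None, ranges)  # passes under reading 2

# "en-GB" falls under the range "en"; "es" matches none of the ranges.
en_gb_strict = conforms_strict("en-GB", ranges)
es_lenient = conforms_lenient("es", ranges)
```

Under the strict reading the value 123 is rejected; under the lenient reading it is accepted, which is the behaviour questioned above.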

String or language match would be:

    sh:or ( [ sh:datatype xsd:string ] [ sh:langMatches ( "en" "fr" "de" ) ] ) ;

There is no need to test for [ sh:datatype rdf:langString ] as well: it
is implicit in having any language tag, so it is already covered when
sh:langMatches requires the language tag.

For error checking: this data:

    "abcde"^^rdf:langString

is malformed and not in the value space of rdf:langString; it is like
writing

    "abcde"^^xsd:integer

It does have the datatype, but it does not represent a legal value.

Another way: make the language match "" mean xsd:string (cf. XML, where
xml:lang="" means no language tag, although with slightly different
implications).

    sh:property [
        sh:or ( [ sh:datatype xsd:string ] [ sh:langMatches ( "en" "fr" "de" ) ] ) ;
    ] .

vs

    sh:property [
        sh:predicate skos:prefLabel ;
        sh:langMatches ( "" "en" "fr" "de" )
    ] .

    Andy

>
>
> On 9/09/2016 23:02, Dimitris Kontokostas wrote:
>> What Holger proposes is flexible and we have the option to reuse some
>> existing constructs but I have some concerns about this design
>>
>> the reason is that we currently have focus node constraints and
>> property (path) constraints
>> with this approach we create a new construct only for languages that
>> is not clear what it is and how it operates e.g.
>> - if there are any differences in the meaning of e.g. sh:in when it
>> is used in a language context and when not
>> - how sh:langShape inter-operates with the extension mechanism and
>> - what does it mean to have e.g. sh:class in a sh:langShape (does all
>> constraints apply in all places?)
>>
>> I would prefer the creation of a few new constraint components e.g.
>> sh:languageIn that allows us to enable (if we want) the RFCs Andy
>> suggested.
>>
>> Another option would be to generalize the mechanism Holger suggested
>> and provide transformation functions on the focus nodes / values a
>> shapes selects
>> This way we would be able to e.g. create a sets/lists of language
>> tags, unwrap RDF lists, etc and apply the shacl core components on the
>> transformed values
>> However, I think it is a bit late to try something in this direction
>>
>> Best,
>> Dimitris
>>
>> On Fri, Sep 9, 2016 at 2:58 AM, Holger Knublauch
>> <holger@topquadrant.com> wrote:
>>
>>     I was given the task of writing up sh:langShape today. I already
>>     did a few months back:
>>
>>     https://lists.w3.org/Archives/Public/public-data-shapes-wg/2016Mar/0262.html
>>
>>     From the list of requirements at
>>
>>     https://www.w3.org/2014/data-shapes/wiki/Proposals#ISSUE-137_Missing_constraint_for_language_tag
>>
>>       * In SKOS, there can be only one prefLabel per language tag
>>
>>     Already exists: sh:uniqueLang true
>>
>>       * Constrain the valid language tags to a provided set, e.g.
>>         (@en, @de, @fr)
>>
>>     See my email, sh:langShape [ sh:in ( "en" "de" "fr" ) ]
>>
>>       * Require that all literals have/do not have a language tag
>>
>>     Already exists: sh:datatype rdf:langString
>>
>>       * Require that a particular property have a set of literals, one
>>         each language tag, e.g. "there must be 3 instances of
>>         dct:abstract; the values must be literals; there must be one
>>         literal for each valid language code (@en, @de, @fr)"
>>
>>     Can be expressed through a combination of sh:minCount = 3,
>>     sh:maxCount = 3, sh:uniqueLang. (What are "instances of
>>     dct:abstract"?)
>>
>>       * Check that the language tag is 2-letter | 3-letter | does/does
>>         not have hyphens
>>
>>     sh:langShape [ sh:minLength 2 ; sh:maxLength 2 ; or: sh:pattern
>>     "... regex ..." ]
>>
>>       * Check that the 2 or 3-letter tag is valid
>>
>>
>>     Assuming that the list of valid tags is stored somewhere, e.g. in
>>     an rdf:List iso:ValidLanguages:
>>
>>     sh:langShape [ sh:in iso:ValidLanguages ]
>>
>>     I don't think maintaining such a list ourselves is within the
>>     scope of the WG, yet it could be expressed in the Core language.
>>
>>
>>     PROPOSAL: Add sh:langShape as outlined. Meaning: if a value node
>>     has a language tag then the string of the language tag itself
>>     needs to have the given sh:Shape.
>>
>>
>>     Holger
>>
>>
>>
>>
>> --
>> Dimitris Kontokostas
>> Department of Computer Science, University of Leipzig & DBpedia
>> Association
>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>> http://aligned-project.eu
>> Homepage: http://aksw.org/DimitrisKontokostas
>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>
>

Received on Monday, 12 September 2016 10:08:28 UTC