- From: Holger Knublauch <holger@topquadrant.com>
- Date: Thu, 15 Sep 2016 10:25:33 +1000
- To: public-data-shapes-wg@w3.org
As resolved today, I have integrated sh:languageIn into the spec:

http://w3c.github.io/data-shapes/shacl/#LanguageInConstraintComponent

Any glitches in there?

Holger


On 13/09/2016 14:34, Holger Knublauch wrote:
>
>
> On 12/09/2016 20:07, Andy Seaborne wrote:
>>
>>
>> On 12/09/16 00:30, Holger Knublauch wrote:
>>> Taking this and Andy's input into consideration, maybe sh:langShape is
>>> an overkill and all we really need is a new parameter such as
>>> sh:languageIn which takes a node and, if it has a language tag,
>>> verifies that it matches one of the provided languages following the
>>> SPARQL langMatches semantics. For example:
>>>
>>> ex:MyShape
>>>     a sh:Shape ;
>>>     sh:property [
>>>         sh:predicate skos:prefLabel ;
>>>         sh:or ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString ] ) ;
>>>         sh:langMatches ( "en" "fr" "de" ) ;
>>
>> A note: this is a slightly different operation to sparql:langMatches
>> which takes a language tag and a language match, not a literal and
>> language match. Some people prefer that local names are not reused
>> to mean slightly different things where possible.
>
> Oops, yes. I intended to use sh:languageIn but forgot to update the
> example. So here it is again:
>
> ex:MyShape
>     a sh:Shape ;
>     sh:property [
>         sh:predicate skos:prefLabel ;
>         sh:or ( [ sh:datatype xsd:string ] [ sh:datatype rdf:langString ] ) ;
>         sh:languageIn ( "en" "fr" "de" ) ;
>     ] .
>
>
>>
>>>     ] .
>>>
>>> langMatches could be for just a single language, but having a list is
>>> shorter for this (apparently) common case in multi-lingual countries
>>> such as Belgium. I didn't know the RFC supports wildcards - this should
>>> hopefully be flexible enough to cover all given use cases, but others
>>> may need to confirm.
>>>
>>> Regards,
>>> Holger
>>>
>>> PS: Andy, I prefer sh:datatype rdf:langString because it would be one
>>> thing less to check (by form builders etc), and furthermore I believe
>>> the semantics of sh:langMatches needs to be that it only does something
>>> if the literal really has a language tag. Otherwise it would be harder
>>> to express mixed cases of either string or langString (which I believe
>>> is quite common).
>>
>> Consider
>>
>>     sh:property [
>>         sh:predicate skos:prefLabel ;
>>         sh:langMatches ( "en" "fr" "de" ) ;
>>     ] .
>>
>> with data:
>>
>>     <uri> skos:prefLabel 123 .
>>
>> which is a violation when sh:langMatches requires the language tag
>> but passes if sh:langMatches only triggers if there is a language tag
>> at all. I find the latter a strange natural interpretation of the
>> shape.
>>
>> String or language match would be:
>>
>>     sh:or ( [ sh:datatype xsd:string ]
>>             [ sh:langMatches ( "en" "fr" "de" ) ] ) ;
>>
>> There is no need to test for [ sh:datatype rdf:langString ] as well
>> as it is implicit in having any language tag so it happens when
>> sh:langMatches requires the language tag.
>>
>> For error checking:
>>
>> This data:
>>
>>     "abcde"^^rdf:langString
>>
>> is malformed and not in the value-space of rdf:langString; it is like
>> writing
>>
>>     "abcde"^^xsd:integer
>>
>> It does have the datatype - it does not represent a legal value.
>>
>>
>> Another way: make language match "" mean xsd:string. (c.f XML where
>> xml:lang="" means no language tag although with slightly different
>> implications).
>>
>>     sh:property [
>>         sh:or ( [ sh:datatype xsd:string ]
>>                 [ sh:langMatches ( "en" "fr" "de" ) ] ) ;
>>     ] .
>>
>> vs
>>
>>     sh:property [
>>         sh:predicate skos:prefLabel ;
>>         sh:langMatches ( "" "en" "fr" "de" ) ;
>>     ] .
>
> So the change you seem to be advocating is to make sh:languageIn
> produce violations if the value node is not a literal, or a literal
> that does not have any language tag. As you point out, this would lead
> to situations in which the sh:datatype rdf:langString can be omitted
> in an sh:or. The meaning of sh:datatype would not change, and people
> can still state sh:datatype rdf:langString for the (common) case in
> which any language is permitted. I believe I would be OK with that
> interpretation.
>
> Here is a SPARQL ASK validator query that is passing the English and
> Francais cases below:
>
>
> ASK {
>     BIND (lang($value) AS ?valueLang) .
>     FILTER (bound(?valueLang) && EXISTS {
>         GRAPH $shapesGraph {
>             $languageIn (rdf:rest*)/rdf:first ?lang .
>             FILTER (langMatches(?valueLang, ?lang))
>         } } )
> }
>
>
> ex:TestShape
>     rdf:type sh:Shape ;
>     rdfs:label "Test shape" ;
>     sh:languageIn (
>         "en"
>         "fr"
>     ) ;
>     sh:targetNode "English"@en ;
>     sh:targetNode "Francais"@fr ;
>     sh:targetNode rdfs:Resource ; # Fails
>     sh:targetNode "Deutsch"@de ; # Fails
>     sh:targetNode "Plain String" ; # Fails
>     .
>
> Holger
>
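For readers who want to experiment outside a SHACL engine, here is a minimal Python sketch of the stricter reading discussed in this thread: a value node conforms only if it is a literal that carries a language tag matching one of the listed ranges. The names `Literal`, `IRI`, `lang_matches` and `conforms_language_in` are hypothetical and not part of any SHACL library, and `lang_matches` is only a simplified approximation of SPARQL langMatches (RFC 4647 basic filtering), so treat it as an illustration rather than a reference implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal."""
    lexical: str
    lang: Optional[str] = None  # None means "no language tag"


@dataclass(frozen=True)
class IRI:
    """Hypothetical stand-in for an IRI node."""
    value: str


Node = Union[Literal, IRI]


def lang_matches(tag: str, lang_range: str) -> bool:
    """Simplified basic filtering: case-insensitive prefix match on subtags."""
    if not tag:
        return False  # an empty tag never matches, not even "*"
    if lang_range == "*":
        return True
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def conforms_language_in(node: Node, allowed: Sequence[str]) -> bool:
    """Assumed semantics: non-literals and untagged literals are violations."""
    if not isinstance(node, Literal) or not node.lang:
        return False
    return any(lang_matches(node.lang, r) for r in allowed)


allowed = ["en", "fr"]
print(conforms_language_in(Literal("English", "en"), allowed))
print(conforms_language_in(Literal("Deutsch", "de"), allowed))
print(conforms_language_in(Literal("Plain String"), allowed))
print(conforms_language_in(IRI("http://www.w3.org/2000/01/rdf-schema#Resource"), allowed))
```

The four calls mirror the target nodes of ex:TestShape in the quoted message; under the assumptions above, only the "en" and "fr" literals should conform.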
Received on Thursday, 15 September 2016 00:26:04 UTC