
Re: shapes-ISSUE-198 (rdf:langString): rdf:langString not included in datatypes [SHACL Spec]

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Wed, 30 Nov 2016 15:16:10 +0200
Message-ID: <CA+u4+a1-bxMQqNejRm1uSDEQzOiWHNb9Lrx+SQzDONAqcmij-Q@mail.gmail.com>
To: Karen Coyle <kcoyle@kcoyle.net>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
The current definition says:
A validation result must be produced for each value node that is not a
literal, or is a literal with a mismatching datatype, whereby the datatype
of a literal is determined following the datatype function of SPARQL 1.1.

The word "datatype" there links to the SPARQL datatype function, which has
the following definition:

17.4.2.7 datatype

Returns the datatype IRI of a literal.

If the literal is a typed literal, return the datatype IRI.
If the literal is a simple literal, return xsd:string
If the literal is literal with a language tag, return rdf:langString

So sh:datatype already covers rdf:langString, even without the
clarification added by Holger.
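
To make the argument concrete, here is a minimal Python sketch of the three
cases of the SPARQL datatype function quoted above, together with the
"mismatching datatype" check from the current definition. The Literal class
and the datatype_violations helper are hypothetical names invented for this
illustration; they are not taken from the SHACL draft or from any library,
and the sketch leaves out the separate ill-typed literal check.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal (illustration only)."""
    lexical: str
    datatype: Optional[str] = None  # explicit datatype IRI, if any
    lang: Optional[str] = None      # language tag, if any


def datatype(lit: Literal) -> str:
    """Mirror the three cases of SPARQL 1.1 section 17.4.2.7."""
    if lit.lang:
        # Literal with a language tag.
        return RDF_LANGSTRING
    if lit.datatype is not None:
        # Typed literal: return its datatype IRI.
        return lit.datatype
    # Simple literal.
    return XSD_STRING


def datatype_violations(value_nodes, expected_iri: str) -> list:
    """Return the value nodes that would each produce a validation result:
    nodes that are not literals, or literals whose datatype IRI differs."""
    return [
        node for node in value_nodes
        if not isinstance(node, Literal) or datatype(node) != expected_iri
    ]


# Plain strings stand in for IRI value nodes in this sketch.
nodes = [Literal("chat", lang="fr"), Literal("cat"), "http://example.org/cat"]
print(datatype_violations(nodes, RDF_LANGSTRING))
```

Under these assumptions, only the language-tagged literal passes a check
against rdf:langString; the simple literal and the IRI are both reported.
Nothing beyond comparing IRIs is needed for that.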

On Wed, Nov 30, 2016 at 3:02 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:

> What I see in the SPARQL document [1] is:
>
> 17.4.2.6 lang
>  simple literal  LANG (literal ltrl)
> Returns the language tag of ltrl, if it has one. It returns "" if ltrl has
> no language tag. Note that the RDF data model does not include literals
> with an empty language tag.
>
> SPARQL uses the function "lang" not rdf:langString, and it is not a
> datatype, it's a separate function (will look for that in the doc). So
> something needs to be said here.
>
> kc
>
> On 11/29/16 11:14 PM, Dimitris Kontokostas wrote:
>
>> Hi Karen, the current wording is in line with the recent discussions we
>> had (and with Ted).
>> I see the sentence "(This also implies that using rdf:langString as value
>> of sh:datatype can be used to test if value nodes have a language tag.)"
>> as redundant; it can be removed, as it is covered by the SPARQL and RDF
>> definitions.
>>
>> On Tue, Nov 22, 2016 at 12:44 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>
>>
>>
>>     On 11/21/16 1:59 PM, Holger Knublauch wrote:
>>
>>
>>
>>         On 22/11/2016 1:43, Karen Coyle wrote:
>>
>>
>>
>>             On 11/19/16 2:57 PM, Holger Knublauch wrote:
>>
>>
>>
>>                 On 20/11/2016 3:16, Karen Coyle wrote:
>>
>>
>>
>>                     On 11/17/16 10:50 PM, Holger Knublauch wrote:
>>
>>                         Hi Karen,
>>
>>                         - RDF 1.1 *does* mention rdf:langString (see the
>>                         NOTE in
>>                         https://www.w3.org/TR/rdf11-concepts/#section-Datatypes)
>>
>>
>>                     Yes, and it says there:
>>                     "Language-tagged strings have the datatype IRI
>>                     http://www.w3.org/1999/02/22-rdf-syntax-ns#langString.
>>                     No datatype is
>>                     formally defined for this IRI because the definition
>>                     of datatypes does
>>                     not accommodate language tags in the lexical space.
>>                     The value space
>>                     associated with this datatype IRI is the set of all
>>                     pairs of strings
>>                     and language tags."
>>
>>                     So it treats it as an exception, and says that it is
>>                     not defined as a
>>                     datatype.
>>
>>                     SKOS also describes language strings as an
>>                     exception, of sorts:
>>                     "Formally, a lexical label is an RDF plain literal
>>                     [RDF-CONCEPTS]. An
>>                     RDF plain literal is composed of a lexical form,
>>                     which is a string of
>>                     UNICODE characters, and an optional language tag,
>>                     which is a string of
>>                     characters conforming to the syntax defined by
>>                     [BCP47]."
>>
>>                     This says to me that RDF plain literals are NOT
>>                     included in datatypes.
>>                     xsd has xsd:string but that is not the same as the
>>                     rdf literal.
>>
>>                     I can't say that this is crystal clear to me, but
>>                     language strings
>>                     will be very important so I want it to be clear how
>>                     they are handled
>>                     in SHACL.
>>
>>
>>                 I have added a clarification to highlight the
>>                 sh:datatype rdf:langString
>>                 trick:
>>
>>                 https://github.com/w3c/data-shapes/commit/94e68840b9d11e6ce0abdc79e296b607f8c024be
>>
>>
>>
>>                 HTH
>>                 Holger
>>
>>
>>             Digging into datatypes, I'm given to understand that RDF
>>             allows arbitrary datatypes to be defined and used.
>>
>>             "The datatype abstraction used in RDF is compatible with XML
>>             Schema
>>             [XMLSCHEMA11-2]. Any datatype definition that conforms to this
>>             abstraction may be used in RDF, even if not defined in terms
>>             of XML
>>             Schema."
>>
>>             I assume SHACL will be limited to the datatypes in
>>             XMLSchema, which
>>             are the datatypes listed as "XML Schema Built-in Types".
>>
>>
>>         No, sh:datatype works for every datatype, including user-defined
>>         ones.
>>         We compare the IRIs.
>>
>>
>>     OK, I see that it says: "A literal matches a datatype if the
>>     literal's datatype has the same IRI and, for the datatypes supported
>>     by SPARQL 1.1, is not an ill-typed literal." So the datatypes should
>>     be defined as the ones supported by SPARQL and the comparison is the
>>     one provided by SPARQL. That it is the SPARQL-defined datatypes
>>     (even though possibly the same as the RDF ones) should be the
>>     definition in the terminology section.
>>
>>     I think we should talk about this in the group and see if this meets
>>     people's needs. I believe that Ted had strong feelings about this
>>     one. Also, it does seem like a good idea to begin reviewing document
>>     changes. We have now identified two that may have been different to
>>     what was actually said at the meeting. Not that either is wrong, but
>>     it is useful to see the detail of how those decisions can be
>>     implemented.
>>
>>     kc
>>
>>
>>
>>         Holger
>>
>>
>>             The spec should state that it is only those built-in
>>             datatypes that
>>             can be validated. The clause ", and datatype" needs to be
>>             removed from
>>             the terminology section, and the datatype constraint should
>>             say:
>>
>>             "The values of sh:datatype must be the IRIs of datatypes
>>             from the list
>>             of XML Schema built-in types in RDF Concepts 1.1, plus the
>>             rdf:langString, which can be used to test if value nodes
>>             that are
>>             strings also have a language tag."
>>
>>             kc
>>
>>
>>
>>                     We do have a use case (U21) that requires that SHACL
>>                     can be used to
>>                     validate SKOS vocabularies. I will try to find
>>                     someone from the SKOS
>>                     community who has more knowledge of this.
>>
>>                     kc
>>
>>
>>
>>
>>                         - I see no need to explicitly enumerate all
>>                         datatypes, because RDF 1.1
>>                         itself allows arbitrary IRIs to be used,
>>                         including user-defined
>>                         datatypes. I don't see why rdf:langString would
>>                         be special.
>>
>>                         - I noticed however that with our recent edit to
>>                         the semantics of
>>                         sh:datatype we have lost an important detail,
>>                         namely that the
>>                         definition
>>                         of what is the datatype of a literal must follow
>>                         the semantics of the
>>                         datatype operator in SPARQL [1]. I have added
>>                         this clarification:
>>
>>                         https://github.com/w3c/data-shapes/commit/eb8eca7d23a91ab884949bc337b5e1a0cee2f747
>>
>>
>>
>>
>>                         If you follow the SPARQL 1.1 link below, you
>>                         will see that this
>>                         explicitly mentions rdf:langString, so I think
>>                         we are covered.
>>
>>                         Please let me know if this addresses your issue.
>>
>>                         Thanks,
>>                         Holger
>>
>>                         [1]
>>                         https://www.w3.org/TR/sparql11-query/#func-datatype
>>
>>
>>                         On 18/11/2016 8:34, RDF Data Shapes Working
>>                         Group Issue Tracker wrote:
>>
>>                             shapes-ISSUE-198 (rdf:langString):
>>                             rdf:langString not included in
>>                             datatypes [SHACL Spec]
>>
>>                             http://www.w3.org/2014/data-shapes/track/issues/198
>>
>>                             Raised by: Karen Coyle
>>                             On product: SHACL Spec
>>
>>                             From email of 31 October 2016:[2]
>>
>>                                     *Karen*
>>                                     This checks the ^^xsd:X literals.
>>                                     sh:nodeKind checks for IRI,
>>                                     bnode,
>>                                     or literal. There's one more type in
>>                                     RDF 1.1 [1] which is the
>>                                     "language-tagged string". We have
>>                                     sh:uniqueLang and sh:languageIn,
>>                                     but
>>                                     is there also a need to check that a
>>                                     literal is language-tagged?
>>
>>                                 *Holger*
>>                                 Being language-tagged is already checked
>>                                 via sh:datatype
>>                                 rdf:langString.
>>                                 So I think that's handled OK.
>>
>>                             OK, but the terminology entry for "datatype"
>>                             cites RDF 1.1 concepts,
>>                             and
>>                             rdf:langString doesn't appear in that
>>                             document. It is defined in RDF
>>                             Schema 1.1, though.[1] Does that mean it
>>                             should be listed
>>                             specifically
>>                             with RDFS as its reference?
>>
>>                             kc
>>                             [1]
>>                             https://www.w3.org/TR/2014/REC-rdf-schema-20140225/#ch_langstring
>>                             [2]
>>                             https://lists.w3.org/Archives/Public/public-data-shapes-wg/2016Nov/0001.html
>>
>>
>>
>>
>>                             ***Proposal***
>>
>>                             Modify definition of datatypes in SHACL to
>>                             include rdf:langString
>>                             from
>>                             RDF schema. Also, is rdfs:Literal also needed?
>>
>>     --
>>     Karen Coyle
>>     kcoyle@kcoyle.net http://kcoyle.net
>>     m: 1-510-435-8234
>>     skype: kcoylenet/+1-510-984-3600
>>
>>
>>
>>
>> --
>> Dimitris Kontokostas
>> Department of Computer Science, University of Leipzig & DBpedia
>> Association
>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>> http://aligned-project.eu
>> Homepage: http://aksw.org/DimitrisKontokostas
>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>
>>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet/+1-510-984-3600
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://rdfunit.aksw.org,
http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: AKSW/KILT http://aksw.org/Groups/KILT
Received on Wednesday, 30 November 2016 13:17:15 UTC
