- From: Markus Lanthaler <markus.lanthaler@gmx.net>
- Date: Mon, 18 Nov 2013 15:32:34 +0100
- To: "'Richard Cyganiak'" <richard@cyganiak.de>
- Cc: "'RDF Working Group WG'" <public-rdf-wg@w3.org>
On Thursday, November 14, 2013 10:46 PM, Richard Cyganiak wrote:
> Markus,
>
> The section in the way it's currently written is the result of long and
> protracted arguments. If you think you can improve the wording, it
> would be helpful if you could make a concrete proposal.

Fair enough. What about replacing it with:

--------------%<-----------------------
Literals are used for values such as strings, numbers, and dates.

A literal in an RDF graph consists of two or three elements:

 - a lexical form, being a Unicode [UNICODE] string, which SHOULD be in
   Normal Form C [NFC],
 - a datatype IRI, being an IRI identifying a datatype that determines
   how the lexical form maps to a literal value, and
 - if and only if the datatype IRI is rdf:langString, optionally a
   non-empty language tag as defined by [BCP47]. The language tag MUST
   be well-formed according to section 2.2.9 of [BCP47].

A literal is a language-tagged string if the third element is present.
Lexical representations of language tags MAY be converted to lower case.
The value space of language tags is always in lower case.

Please note that concrete syntaxes MAY support simple literals consisting
of only a lexical form without any datatype IRI or language tag. Simple
literals are syntactic sugar for abstract syntax literals with the
datatype IRI xsd:string. Similarly, most concrete syntaxes represent
language-tagged strings without the datatype IRI because it always equals
rdf:langString.

The literal value associated with a literal is:

 1. If the literal is a language-tagged string, then the literal value
    is a pair consisting of its lexical form and its language tag, in
    that order.
 2. If the literal's datatype IRI is in the set of recognized datatype
    IRIs, let d be the referent of the datatype IRI.
    a) If the literal's lexical form is in the lexical space of d, then
       the literal value is the result of applying the lexical-to-value
       mapping of d to the lexical form.
    b) Otherwise, the literal is ill-typed and no literal value can be
       associated with the literal. Such a case produces a semantic
       inconsistency but is not syntactically ill-formed.
       Implementations MUST accept ill-typed literals and produce RDF
       graphs from them. Implementations MAY produce warnings when
       encountering ill-typed literals.
 3. If the literal's datatype IRI is not recognized by an
    implementation, then the literal value is not defined by this
    specification.

Literal term equality: Two literals are term-equal (the same RDF literal)
if and only if the two lexical forms, the two datatype IRIs, and the two
language tags (if any) compare equal, character by character. Thus, two
literals can have the same value without being the same RDF term. For
example:

    "1"^^xs:integer
    "01"^^xs:integer

denote the same value, but are not the same literal RDF terms and are not
term-equal because their lexical forms differ.
--------------%<-----------------------

Hopefully this makes everything a bit easier to understand and more
consistent. I tried to change as little as possible. The only notable
changes are that I removed "A badly formed language tag MUST be treated
as a syntax error." as I don't believe this belongs in Concepts, and it
also duplicates the other normative statement "[a] language tag MUST be
well-formed". I also removed "Multiple literals may have the same lexical
form" as it doesn't add anything and falls naturally out of the
definition of literal term equality.

I'm not sure about statement 3) above:

    If the literal's datatype IRI **is not recognized by an
    implementation**, then the literal value is not defined by this
    specification.

Wouldn't it be better to say "... is not in the set of recognized
datatype IRIs" with "recognized datatype" being linked to
http://www.w3.org/TR/rdf11-concepts/#dfn-recognized-datatype-iris

> Concrete syntaxes need to say that the datatype of a literal is
> implicitly rdf:langString if a language tag is present, and that it is
> implicitly xsd:string if neither datatype nor language string are
> present.

Right, but apart from Turtle and JSON-LD none of the syntaxes does so.
This needs to be fixed.

> I agree that the regex is entirely counterproductive and should be
> removed.

OK, removed in the proposal above.

--
Markus Lanthaler
@markuslanthaler
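
To make the proposed wording easier to check, the literal value procedure (steps 1 to 3) and the term-equality rule from the proposal above can be sketched in Python. This is only an illustration under stated assumptions, not part of the proposal: the names `Literal`, `literal_value`, `IllTypedLiteral` and `RECOGNIZED` are hypothetical, the set of recognized datatypes is limited to two, and the xsd:integer lexical check is a simplified pattern.

```python
import re
from dataclasses import dataclass
from typing import Optional

RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class IllTypedLiteral(ValueError):
    """Lexical form is outside the lexical space of a recognized datatype."""


@dataclass(frozen=True)
class Literal:
    # The generated __eq__ compares all three fields as plain strings,
    # which mirrors the character-by-character term equality in the proposal.
    lexical_form: str
    datatype_iri: str
    language_tag: Optional[str] = None


def _integer(lexical: str) -> int:
    # Simplified lexical space for xsd:integer: optional sign, ASCII digits.
    if not re.fullmatch(r"[+-]?[0-9]+", lexical):
        raise IllTypedLiteral(lexical)
    return int(lexical)


# Hypothetical set of recognized datatype IRIs with lexical-to-value mappings.
RECOGNIZED = {XSD_STRING: str, XSD_INTEGER: _integer}


def literal_value(lit: Literal):
    # Step 1: language-tagged string -> (lexical form, lower-cased tag).
    if lit.language_tag is not None:
        return (lit.lexical_form, lit.language_tag.lower())
    # Step 2: recognized datatype -> apply its mapping (may raise, step 2b).
    mapping = RECOGNIZED.get(lit.datatype_iri)
    if mapping is not None:
        return mapping(lit.lexical_form)
    # Step 3: unrecognized datatype -> value not defined; None as a stand-in.
    return None


one = Literal("1", XSD_INTEGER)
padded = Literal("01", XSD_INTEGER)
print(one == padded)                                # False: not term-equal
print(literal_value(one) == literal_value(padded))  # True: same value
```

Note that an ill-typed literal such as `Literal("abc", XSD_INTEGER)` can still be constructed and stored in a graph; only asking for its value raises, which matches the requirement that implementations accept ill-typed literals.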
Received on Monday, 18 November 2013 14:33:09 UTC