- From: David Booth <david@dbooth.org>
- Date: Mon, 29 Dec 2014 15:08:59 -0500
- To: Pat Hayes <phayes@ihmc.us>
- CC: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>, "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>
On 12/29/2014 02:38 PM, Pat Hayes wrote:
>
> On Dec 29, 2014, at 12:52 PM, David Booth <david@dbooth.org> wrote:
>
>> P.S. Or to put it differently, it would be harmful if anyone
>> interpreted the existing ambiguity to be intentional.
>
> Well, there is no actual ambiguity. In RDF 1.1, the datatype of plain
> literals (without a language tag) is xsd:string, unambiguously. That
> type URI appears explicitly in the RDF 1.1 abstract (graph) syntax,
> unambiguously. But the RDF specs do not define all possible surface
> syntaxes for RDF, and they explicitly allow a surface syntax to omit
> the xsd:string typing URI as a form of syntactic sugar, since it is
> implied in all cases, so its omission does not introduce any
> ambiguity.

Sure, but the question was about the Canonical N-Triples
*serialization* -- not the RDF 1.1 abstract graph. As Stian
Soiland-Reyes pointed out, the definition of Canonical N-Triples is
currently ambiguous about whether xsd:string literals should be
serialized like "foo" or like
"foo"^^<http://www.w3.org/2001/XMLSchema#string> .

As both Andy and Gregg recalled, it appears that this detail was
discussed (in favor of "foo") but was omitted from the spec.

David Booth

>
> Pat
>
>>
>> On 12/29/2014 01:36 PM, David Booth wrote:
>>> FWIW, it certainly seems to me like this detail was omitted
>>> unintentionally and would be helpful to include in the errata.
>>>
>>> David Booth
>>>
>>> On 12/29/2014 12:50 PM, Stian Soiland-Reyes wrote:
>>>> OK, thank you all for recollecting! So I'll settle for the
>>>> "naked" literal in output of an xsd:string.
>>>>
>>>> Should this go into an errata or is it too much of a change?
>>>>
>>>> On 29 Dec 2014 07:41, "Andy Seaborne" <andy@apache.org> wrote:
>>>>
>>>> On 29/12/14 06:31, Pat Hayes wrote:
>>>>
>>>> On Dec 28, 2014, at 6:10 PM, Gregg Kellogg
>>>> <gregg@greggkellogg.com> wrote:
>>>>
>>>> On Dec 28, 2014, at 3:32 PM, Pat Hayes <phayes@ihmc.us> wrote:
>>>>
>>>> On Dec 28, 2014, at 5:40 AM, Andy Seaborne <andy@apache.org>
>>>> wrote:
>>>>
>>>> On 28/12/14 05:04, Pat Hayes wrote:
>>>>
>>>> On Dec 27, 2014, at 9:24 PM, Stian Soiland-Reyes
>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>
>>>> No, for once I am not coming from OWL :)
>>>>
>>>> I'm just writing a simple n-triples serializer, and I am not
>>>> sure if I should simply always include the type if there is no
>>>> @lang (e.g. ^^xsd:string)
>>>>
>>>> It was certainly the intention of the RDF 1.1 WG that every
>>>> literal should have a type. We even provided a special 'type'
>>>> for the @lang case, to preserve this intention. It seems to me
>>>> that one should not ever go wrong by including the
>>>> ^^xsd:string, which was semantically correct even in original
>>>> RDF, whereas really plain plain literals now have the shadow of
>>>> deprecation hanging over them, at the very least.
>>>>
>>>> Hope this helps.
>>>>
>>>> Pat Hayes
>>>>
>>>> And for serialization, the WG intention IIRC was that all
>>>> ^^xsd:strings should be written without the ^^xsd:string in all
>>>> formats where possible.
>>>>
>>>> Really? I have no recollection of that, but I may have missed
>>>> some discussions. Can you find this in the minutes or emails
>>>> anywhere?
>>>>
>>>> I share Andy's recollection
>>>>
>>>> OK, two is enough :-) I bow to your superior recollection, and
>>>> withdraw my implicit advice to use explicit xsd:string typing.
>>>> Apologies to all concerned.
>>>>
>>>> I went looking (OK, a bit of looking) the first time but
>>>> couldn't find spec text except the MAY. This discussion was
>>>> over an extended period.
>>>>
>>>> The examples for Turtle are without xsd:string (except to show
>>>> they are the same).
>>>>
>>>> From memory, the line of argument was that simple literals were
>>>> more common than explicit ^^xsd:string though the community of
>>>> use is going to be a major factor.
>>>>
>>>> Like Gregg, Jena outputs without explicit datatype as the best
>>>> choice overall.
>>>>
>>>> Andy
>>>>
>>>> Pat
>>>>
>>>> , and that is how my serializer behaves. Shame that the
>>>> spec-text doesn't capture that.
>>>>
>>>> Gregg
>>>>
>>>> It looks nicer.
>>>>
>>>> Maybe, but it also can produce uncertainty, as for example:
>>>>
>>>> "Before rdf 1.1 the norm tended to be to NOT express xsd:string
>>>> unless it really was a character-by-character string (e.g. a
>>>> genome identifier), and not when it was human text (but in
>>>> unknown or mixed language)."
>>>>
>>>> Even in RDF 1.0, plain literals were specified to be
>>>> semantically identical to xsd:string-typed literals, but this
>>>> was buried in the semantics document which nobody read, and
>>>> because the syntactic distinction was available, people assumed
>>>> it meant something. As long as a syntax offers both choices,
>>>> this misreading process will continue to operate, even now RDF
>>>> 1.1 has said explicitly that plain literals are only syntactic
>>>> sugar for the typed version.
>>>>
>>>> http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal
>>>> only says "MAY" -- that is mainly so as not to suggest much RDF
>>>> 1.0 data output by pre-existing software is suddenly
>>>> invalidated, which it isn't.
>>>>
>>>> Certainly, plain literal surface syntax is not *invalidated* by
>>>> RDF 1.1. Sorry if I gave that impression.
>>>>
>>>> Pat
>>>>
>>>> Andy
>>>>
>>>> ..Or if I should have a special case to output anything with
>>>> type xsd:string as a classic "plain literal", e.g. no @ or ^^.
>>>>
>>>> Surely just one of these should be in the canonical version?
>>>> My gut says to always include the type for non-lang, but the
>>>> spec is ambiguous on this - if xsd:string is implied, should I
>>>> then prefer to generate this implied version?
>>>>
>>>> Before rdf 1.1 the norm tended to be to NOT express xsd:string
>>>> unless it really was a character-by-character string (e.g. a
>>>> genome identifier), and not when it was human text (but in
>>>> unknown or mixed language).
>>>>
>>>> As we SHOULD be generating the Canonical N-Triples, then it
>>>> would be good to know if there already is a silent de facto
>>>> agreement that is just not expressed in the spec.
>>>>
>>>> You might know the code base -
>>>>
>>>> https://github.com/stain/commons-rdf/blob/tests/src/test/java/com/github/commonsrdf/dummyimpl/LiteralImpl.java#L99
>>>>
>>>> On 27 Dec 2014 17:14, "Peter Ansell" <ansell.peter@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Stian,
>>>>
>>>> RDF-1.1 does not have the concept of plain literals [1]. Hence,
>>>> it is difficult to map the OWL-WG-derived rdf:PlainLiteral set
>>>> to RDF-1.1, if that is where you are coming at the issue from
>>>> [2].
>>>>
>>>> Cheers,
>>>>
>>>> Peter
>>>>
>>>> [1] http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/#section-Graph-Literal
>>>> [2] https://github.com/owlcs/owlapi/issues/172
>>>>
>>>> On 27 December 2014 at 16:37, Stian Soiland-Reyes
>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>
>>>> In http://www.w3.org/TR/n-triples/#canonical-ntriples I read:
>>>>
>>>>   Canonical N-Triples has the following additional constraints
>>>>   on layout:
>>>>
>>>>   The whitespace following subject, predicate, and object MUST
>>>>   be a single space, (U+0020). All other locations that allow
>>>>   whitespace MUST be empty.
>>>>   There MUST be no comments.
>>>>   HEX MUST use only uppercase letters ([A-F]).
>>>>   Characters MUST NOT be represented by UCHAR.
>>>>   Within STRING_LITERAL_QUOTE, only the characters U+0022,
>>>>   U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR MUST
>>>>   NOT be used for characters that are allowed directly in
>>>>   STRING_LITERAL_QUOTE.
>>>>
>>>> and in http://www.w3.org/TR/n-triples/#sec-parsing-terms
>>>>
>>>>   If neither a language tag nor a datatype IRI is provided, the
>>>>   literal has a datatype of xsd:string.
>>>>
>>>> and in http://www.w3.org/TR/n-triples/#sec-literals
>>>>
>>>>   If there is no datatype IRI and no language tag it is a simple
>>>>   literal and the datatype is
>>>>   http://www.w3.org/2001/XMLSchema#string.
>>>>
>>>>   Example 3
>>>>
>>>>   <http://example.org/show/218> <http://www.w3.org/2000/01/rdf-schema#label> "That Seventies Show"^^<http://www.w3.org/2001/XMLSchema#string> . # literal with XML Schema string datatype
>>>>   <http://example.org/show/218> <http://www.w3.org/2000/01/rdf-schema#label> "That Seventies Show" . # same as above
>>>>
>>>> So I am not any wiser with regards to how to serialize plain
>>>> literals in RDF 1.1 Canonical N-Triples..
>>>>
>>>> Are both of the two examples allowed in Canonical N-Triples?
>>>> (it seems so by the spec.. :-( ).
>>>>
>>>> Which variant should I generate?
>>>>
>>>> --
>>>> Stian Soiland-Reyes, myGrid team
>>>> School of Computer Science
>>>> The University of Manchester
>>>> http://soiland-reyes.com/stian/work/
>>>> http://orcid.org/0000-0001-9842-9718
>>>>
>>>> ------------------------------------------------------------
>>>> IHMC                                     (850)434 8903 home
>>>> 40 South Alcaniz St.                     (850)202 4416 office
>>>> Pensacola                                (850)202 4440 fax
>>>> FL 32502                                 (850)291 0667 mobile (preferred)
>>>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
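
For anyone writing a serializer along the lines Stian describes, here is a minimal sketch of the behaviour the thread converges on: emit xsd:string literals in the "naked" form and escape only the four characters the Canonical N-Triples text lists for ECHAR. The function name serialize_literal and its parameters are hypothetical, chosen for illustration, and the choice to drop ^^xsd:string reflects the recollected WG intention discussed above rather than normative spec text.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# The only characters written as ECHAR escapes inside STRING_LITERAL_QUOTE:
# U+0022 (quote), U+005C (backslash), U+000A (LF), U+000D (CR).
_ECHAR = {'"': '\\"', "\\": "\\\\", "\n": "\\n", "\r": "\\r"}


def serialize_literal(lexical, datatype=XSD_STRING, lang=None):
    """Return an N-Triples literal term (hypothetical helper, not a library API).

    Assumption: xsd:string literals are written without the ^^ suffix,
    following the outcome of this thread rather than explicit spec wording.
    """
    quoted = '"' + "".join(_ECHAR.get(ch, ch) for ch in lexical) + '"'
    if lang is not None:
        # Language-tagged string: the tag replaces any datatype suffix.
        return quoted + "@" + lang
    if datatype == XSD_STRING:
        # "Naked" form: the xsd:string datatype is implied.
        return quoted
    return quoted + "^^<" + datatype + ">"


if __name__ == "__main__":
    print(serialize_literal("That Seventies Show"))
    print(serialize_literal("1", "http://www.w3.org/2001/XMLSchema#integer"))
```

Both an omitted datatype and an explicit xsd:string datatype give the same output here, which is the point of picking a single form for canonical output.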
Received on Monday, 29 December 2014 20:09:28 UTC