- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Mon, 29 Dec 2014 11:50:23 -0600
- To: Andy Seaborne <andy@apache.org>
- Cc: Gregg Kellogg <gregg@greggkellogg.com>, Pat Hayes <phayes@ihmc.us>, "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>
- Message-ID: <CAPRnXtnZqdRszoiik2p42gMwyVRSY2wLYH0_TqyVQRELUmeaFA@mail.gmail.com>
OK, thank you all for recollecting! So I'll settle for the "naked" literal in output of an xsd:string. Should this go into an erratum or is it too much of a change?

On 29 Dec 2014 07:41, "Andy Seaborne" <andy@apache.org> wrote:
> On 29/12/14 06:31, Pat Hayes wrote:
>
>>
>> On Dec 28, 2014, at 6:10 PM, Gregg Kellogg <gregg@greggkellogg.com> wrote:
>>
>> On Dec 28, 2014, at 3:32 PM, Pat Hayes <phayes@ihmc.us> wrote:
>>>
>>>>
>>>>
>>>> On Dec 28, 2014, at 5:40 AM, Andy Seaborne <andy@apache.org> wrote:
>>>>>
>>>>> On 28/12/14 05:04, Pat Hayes wrote:
>>>>>>
>>>>>> On Dec 27, 2014, at 9:24 PM, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>>>>
>>>>>>> No, for once I am not coming from OWL :)
>>>>>>>
>>>>>>> I'm just writing a simple n-triples serializer, and I am not sure if
>>>>>>> I should simply always include the type if there is no @lang (e.g.
>>>>>>> ^^xsd:string)
>>>>>>>
>>>>>>
>>>>>> It was certainly the intention of the RDF 1.1 WG that every literal
>>>>>> should have a type. We even provided a special 'type' for the @lang case,
>>>>>> to preserve this intention. It seems to me that one should not ever go
>>>>>> wrong by including the ^^xsd:string, which was semantically correct even in
>>>>>> original RDF, whereas really plain plain literals now have the shadow of
>>>>>> deprecation hanging over them, at the very least.
>>>>>>
>>>>>> Hope this helps.
>>>>>>
>>>>>> Pat Hayes
>>>>>>
>>>>>
>>>>> And for serialization, the WG intention IIRC was that all
>>>>> ^^xsd:strings should be written without the ^^xsd:string in all formats
>>>>> where possible.
>>>>>
>>>>
>>>> Really? I have no recollection of that, but I may have missed some
>>>> discussions. Can you find this in the minutes or emails anywhere?
>>>>
>>>
>>> I share Andy's recollection
>>>
>>
>> OK, two is enough :-) I bow to your superior recollection, and withdraw
>> my implicit advice to use explicit xsd:string typing. Apologies to all
>> concerned.
>>
>
> I went looking (OK, a bit of looking) the first time but couldn't find
> spec text except the MAY. This discussion was over an extended period.
>
> The examples for Turtle are without xsd:string (except to show they are
> the same).
>
> From memory, the line of argument was that simple literals were more
> common than explicit ^^xsd:string though the community of use is going to
> be a major factor.
>
> Like Gregg, Jena outputs without explicit datatype as the best choice
> overall.
>
> Andy
>
>
>> Pat
>>
>> , and that is how my serializer behaves.
>>> Shame that the spec-text doesn't capture that.
>>>
>>> Gregg
>>>
>>> It looks nicer.
>>>>>
>>>>
>>>> Maybe, but it also can produce uncertainty, as for example:
>>>>
>>>> "Before rdf 1.1 the norm tended to be to NOT express xsd:string unless
>>>> it really was a character-by-character string (e.g. a genome identifier),
>>>> and not when it was human text (but in unknown or mixed language)."
>>>>
>>>> Even in RDF 1.0, plain literals were specified to be semantically
>>>> identical to xsd:string-typed literals, but this was buried in the
>>>> semantics document which nobody read, and because the syntactic
>>>> distinction was available, people assumed it meant something. As long as a
>>>> syntax offers both choices, this misreading process will continue to
>>>> operate, even now RDF 1.1 has said explicitly that plain literals are only
>>>> syntactic sugar for the typed version.
>>>>
>>>> http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal only says
>>>>> "MAY" -- that is mainly so as not to suggest much RDF 1.0 data output by
>>>>> pre-existing software is suddenly invalidated, which it isn't.
>>>>>
>>>>
>>>> Certainly, plain literal surface syntax is not *invalidated* by RDF
>>>> 1.1. Sorry if I gave that impression.
>>>>
>>>> Pat
>>>>
>>>>
>>>>
>>>>> Andy
>>>>>
>>>>>
>>>>>
>>>>>> ..Or if I should have a special case to output anything with type
>>>>>>> xsd:string as a classic "plain literal", e.g.
no @ or ^^.
>>>>>>>
>>>>>>> Surely just one of these should be in the canonical version? My
>>>>>>> gut says to always include the type for non-lang, but the spec is ambiguous
>>>>>>> on this - if xsd:string is implied, should I then prefer to generate this
>>>>>>> implied version?
>>>>>>>
>>>>>>> Before rdf 1.1 the norm tended to be to NOT express xsd:string
>>>>>>> unless it really was a character-by-character string (e.g. a genome
>>>>>>> identifier), and not when it was human text (but in unknown or mixed
>>>>>>> language).
>>>>>>>
>>>>>>> As we SHOULD be generating the Canonical N-Triples, then it would be
>>>>>>> good to know if there already is a silent de facto agreement that is just
>>>>>>> not expressed in the spec.
>>>>>>>
>>>>>>> You might know the code base -
>>>>>>> https://github.com/stain/commons-rdf/blob/tests/src/test/java/com/github/commonsrdf/dummyimpl/LiteralImpl.java#L99
>>>>>>>
>>>>>>> On 27 Dec 2014 17:14, "Peter Ansell" <ansell.peter@gmail.com> wrote:
>>>>>>> Hi Stian,
>>>>>>>
>>>>>>> RDF-1.1 does not have the concept of plain literals [1]. Hence, it is
>>>>>>> difficult to map the OWL-WG-derived rdf:PlainLiteral set to RDF-1.1,
>>>>>>> if that is where you are coming at the issue from [2].
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>> [1] http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/#section-Graph-Literal
>>>>>>> [2] https://github.com/owlcs/owlapi/issues/172
>>>>>>>
>>>>>>> On 27 December 2014 at 16:37, Stian Soiland-Reyes
>>>>>>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>>>>>
>>>>>>>> In http://www.w3.org/TR/n-triples/#canonical-ntriples I read:
>>>>>>>>
>>>>>>>> Canonical N-Triples has the following additional constraints on
>>>>>>>>> layout:
>>>>>>>>>
>>>>>>>>> The whitespace following subject, predicate, and object MUST be
>>>>>>>>> a single space, (U+0020). All other locations that allow whitespace MUST be
>>>>>>>>> empty.
>>>>>>>>> There MUST be no comments.
>>>>>>>>> HEX MUST use only uppercase letters ([A-F]).
>>>>>>>>> Characters MUST NOT be represented by UCHAR.
>>>>>>>>> Within STRING_LITERAL_QUOTE, only the characters U+0022,
>>>>>>>>> U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR MUST NOT be used for
>>>>>>>>> characters that are allowed directly in STRING_LITERAL_QUOTE.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> and in http://www.w3.org/TR/n-triples/#sec-parsing-terms
>>>>>>>>
>>>>>>>> If neither a language tag nor a datatype IRI is provided, the
>>>>>>>>> literal has a datatype of xsd:string.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> and in http://www.w3.org/TR/n-triples/#sec-literals
>>>>>>>>
>>>>>>>> If there is no datatype IRI and no language tag it is a simple
>>>>>>>>> literal and the datatype is http://www.w3.org/2001/XMLSchema#string.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Example 3
>>>>>>>>> <http://example.org/show/218> <http://www.w3.org/2000/01/rdf-schema#label> "That Seventies Show"^^<http://www.w3.org/2001/XMLSchema#string> . # literal with XML Schema string datatype
>>>>>>>>> <http://example.org/show/218> <http://www.w3.org/2000/01/rdf-schema#label> "That Seventies Show" . # same as above
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> So I am not any wiser with regards to how to serialize plain
>>>>>>>> literals
>>>>>>>> in RDF 1.1 Canonical N-Triples..
>>>>>>>>
>>>>>>>>
>>>>>>>> Are both of the two examples allowed in Canonical N-Triples? (it
>>>>>>>> seems
>>>>>>>> so by the spec.. :-( ).
>>>>>>>>
>>>>>>>> Which variant should I generate?
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Stian Soiland-Reyes, myGrid team
>>>>>>>> School of Computer Science
>>>>>>>> The University of Manchester
>>>>>>>> http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718
>>>>>>>>
>>>>>>>
>>>>>> ------------------------------------------------------------
>>>>>> IHMC (850)434 8903 home
>>>>>> 40 South Alcaniz St.
(850)202 4416 office
>>>>>> Pensacola (850)202 4440 fax
>>>>>> FL 32502 (850)291 0667 mobile (preferred)
>>>>>> phayes@ihmc.us http://www.ihmc.us/users/phayes
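
For anyone following the thread, the outcome (write xsd:string literals "naked", keep language tags and other datatypes explicit) can be sketched roughly as below. This is only an illustrative Python sketch, not the commons-rdf or Jena code: the names `literal_to_ntriples`, `escape_lexical` and `XSD_STRING` are made up for the example, and the escaping covers just the four ECHAR characters listed in the Canonical N-Triples text quoted in the thread.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# The four characters that the quoted Canonical N-Triples constraints say
# are written with ECHAR inside STRING_LITERAL_QUOTE.
_ECHAR = {
    '"': '\\"',    # U+0022
    "\\": "\\\\",  # U+005C
    "\n": "\\n",   # U+000A
    "\r": "\\r",   # U+000D
}


def escape_lexical(lexical):
    """Escape only the four ECHAR characters; everything else is written as-is."""
    return "".join(_ECHAR.get(ch, ch) for ch in lexical)


def literal_to_ntriples(lexical, datatype=XSD_STRING, lang=None):
    """Serialize one literal (hypothetical helper, assumptions as stated above)."""
    quoted = '"' + escape_lexical(lexical) + '"'
    if lang is not None:
        # Language-tagged literal: the tag is written, the datatype is not.
        return quoted + "@" + lang
    if datatype == XSD_STRING:
        # The choice settled on in this thread: no ^^xsd:string in the output.
        return quoted
    return quoted + "^^<" + datatype + ">"


print(literal_to_ntriples("That Seventies Show"))
print(literal_to_ntriples("1", "http://www.w3.org/2001/XMLSchema#integer"))
```

With this sketch both forms from Example 3 parse to the same literal, and the serializer always emits the shorter one, which matches what Andy and Gregg describe for their implementations.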
Received on Monday, 29 December 2014 17:50:52 UTC