- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Tue, 07 May 2013 10:09:36 +0200
- To: andy.seaborne@epimorphics.com
- CC: public-rdf-wg@w3.org
On 07/05/2013 09:26, Andy Seaborne wrote:
>
>
> On 07/05/13 04:27, Pat Hayes wrote:
>>
>> On May 6, 2013, at 2:15 PM, Antoine Zimmermann wrote:
>>
>>> On 06/05/2013 19:47, Pat Hayes wrote:
>>>> (I think we may have decided this already, but can't find the
>>>> decision.)
>>>>
>>>> If some RDF has a language-tagged literal with a bad language tag
>>>> (not conforming to section 2.2.9 of BCP 47), is that
>>>>
>>>> 1. an RDF syntax error
>
> What's an "RDF syntax error"? Concrete or Abstract Syntax?

I assume that what Pat is interested in at the moment is the abstract
syntax. For his editing of RDF Semantics, it's the only thing that
matters.

> Real data may well have legal, correctly canonicalised language tags
> ... which are then not legal RDF due to case.
>
> @en-US

This is not a problem. Language tags do not have an @ either. The
concrete syntax can allow many things as long as it's clear how it maps
to a valid RDF graph. Reasoning from a particular concrete syntax in
this way would lead you to conclude that an integer does not need a
datatype IRI, because you can write it without one in Turtle.

>>>> 2. syntactically legal but inconsistent (because the literal has
>>>> no legal value)
>
>>>> 3. legal and consistent (because even a bad
>>>> language tag is still an RDF language tag) ?
>>>
>>> RDF concepts says that a language-tagged string has a lexical form
>>> (a Unicode string), a datatype IRI (rdf:langString) and a language
>>> tag (a non-empty language tag as defined by [BCP47]. The language
>>> tag must be well-formed according to section 2.2.9 of [BCP47], and
>>> must be normalized to lowercase).
>
> The lower case requirement is in the abstract syntax.

Right.

> Many processors implement RFC3066 as does the Turtle grammar.

That's all right. The lowercase requirement is simply there to define
the identity of a language tag. If you write @en-US or @en-us in
Turtle, you are using the same language tag.
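To make the identity point concrete, here is a minimal Python sketch. It is not taken from any spec or implementation: the helper names are made up for illustration, and the regular expression is only my reading of the LANGTAG token shape in the Turtle grammar (it does not check BCP 47 well-formedness). It drops the '@', which exists only in the concrete syntax, lowercases the rest, and compares the results.

```python
import re

# Assumed shape of a Turtle language-tag token: '@', letters, then
# '-'-separated alphanumeric subtags. Token shape only, not BCP 47.
LANGTAG_TOKEN = re.compile(r"@[a-zA-Z]+(-[a-zA-Z0-9]+)*")


def abstract_language_tag(token: str) -> str:
    """Map a concrete '@...' token to a lowercase tag (hypothetical helper)."""
    if LANGTAG_TOKEN.fullmatch(token) is None:
        raise ValueError(f"not a language-tag token: {token!r}")
    # The '@' belongs to the concrete syntax only; case is normalized away.
    return token[1:].lower()


def same_language_tag(a: str, b: str) -> bool:
    """Two tokens denote the same tag if their normalized forms are equal."""
    return abstract_language_tag(a) == abstract_language_tag(b)


print(same_language_tag("@en-US", "@en-us"))  # True
print(abstract_language_tag("@en-US"))        # en-us
```

A parser could just as well keep the original casing internally and compare case-insensitively; the sketch only shows one way of getting the two spellings to compare equal.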
It does not matter how the parser deals with this, as long as they
compare equal.

>>>
>>> Anything else is not a language-tagged string.
>>> So, it's answer 1.
>
> By that argument "@en-US" is a syntax error yet it is the canonical form.

In the abstract syntax "@en-US" would be plainly wrong because of the @
character. It need not be a syntax error in Turtle, but it is an error
in RDF/XML or JSON-LD. One could imagine a syntax where en-US is a
syntax error.

>
>> Well, that is how I would interpret that MUST as well, but I think
>> that it would be better if it were to say this explicitly, because
>> this being a syntax error requires all conformant RDF parsers to know
>> all about wellformedness of language tags. I actually think this is a
>> very bad decision, if it really is what the WG intended to do. Which
>> is why I wanted to make sure that the text was very clear on exactly
>> what is intended here.
>
> well-formedness of language tags isn't too bad - it's the grammar in
> 2.2.9 although in Turtle, RFC 3066 is used.

If any RFC3066-valid tag can be mapped unambiguously to a BCP47-valid
tag, then there is no contradiction (but maybe a remark on this should
be put somewhere in the Turtle spec).

>
> Concepts says
>
> """
> 5. Otherwise, the literal is ill-typed, and no literal value can be
> associated with the literal. Such a case, while in error, is not
> syntactically ill-formed.
> """

Language-tagged strings cannot be ill-typed since they do not have a
lexical space, and they are interpreted in their own special way.

AZ.

>
> +1 to 3.
>
> Andy
>
>>
>> Pat
>>
>>> There has been discussion about it, and I think this was what we
>>> came to agree on, but I don't remember if it has been reflected in
>>> a WG resolution.
>>>
>>>
>>> AZ
>>>
>>>
>>>>
>>>> Pat
>>>>
>>>> PS. If we have to decide this, I vote for 3 as being less work
>>>> to implement, and on the grounds that RDF's job isn't to check on
>>>> bad data.
>>>>
>>>> ------------------------------------------------------------
>>>> IHMC                     (850)434 8903 or (650)494 3973
>>>> 40 South Alcaniz St.     (850)202 4416   office
>>>> Pensacola                (850)202 4440   fax
>>>> FL 32502                 (850)291 0667   mobile
>>>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
>>>>

--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 7 May 2013 08:10:11 UTC