Re: tightening up the Turtle grammar

On 26/03/13 21:01, Eric Prud'hommeaux wrote:
> The Turtle spec says that parsing the PNAME_NS and PNAME_LN terminals
> produces an IRI as defined in RDF Concepts.
>    http://www.w3.org/TR/turtle/#handle-IRI
>    http://www.w3.org/TR/turtle/#handle-PNAME_LN
>    http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#dfn-iri
> RDF Concepts says that IRI is "a Unicode string [UNICODE] that
> conforms to the syntax defined in RFC 3987 [RFC3987]." In sum, we
> provide a pretty liberal grammar and then point to a hilariously
> complex grammar, but don't expect anyone to enforce it.

Don't we? :-)

> Comments c23 "IRIREF production less restrictive than RFC3987" and c26
> "PN_CHARS_BASE outside of IRI range" indicate some frustration with our
> grammar which permits characters which aren't allowed anywhere in IRIs.
>
>    <http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c23>
>    <http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c26>
>
> One approach would be to trim the bogus chars off of PN_CHARS_BASE and
> include a note below the grammar which points directly at 3987 and
> states that the IRIs constructed by either IRIREF or PNAME_LN are 3987
> IRIs. This would supplement the note about valid literal ranges
> proposed to address c27.
>
>    <http://www.w3.org/2011/rdf-wg/wiki/Turtle_Candidate_Recommendation_Comments#c27>
>    <http://www.w3.org/mid/20130324145153.GN14139@w3.org>
>
> I have spoken to those acting as W3C director. They consider this to
> be a clarification and nothing that would require another LC.

The PN_CHARS_BASE rule is the same as the XML rule for NameStartChar 
without the ':' and '_' ('_' comes back in via PN_CHARS_U).
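
To make the c26 complaint concrete, here is a rough Python sketch that subtracts the characters RFC 3987 allows in an IRI (ASCII letters plus `ucschar`) from the PN_CHARS_BASE ranges. The ranges are transcribed by hand from the Turtle grammar and from RFC 3987; the names `subtract`, `IRI_ALLOWED` and `gaps` are made up for illustration and are not part of any spec or library.

```python
# PN_CHARS_BASE ranges, transcribed from the Turtle grammar (inclusive).
PN_CHARS_BASE = [
    (0x41, 0x5A), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# RFC 3987 ucschar, plus the ASCII letters (assumption: only the letter
# subset of the ASCII IRI characters matters here, since PN_CHARS_BASE
# has no other ASCII characters).
IRI_ALLOWED = (
    [(0x41, 0x5A), (0x61, 0x7A)]
    + [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    + [(p << 16, (p << 16) | 0xFFFD) for p in range(0x1, 0xE)]
    + [(0xE1000, 0xEFFFD)]
)


def subtract(ranges, allowed):
    """Return the parts of `ranges` not covered by `allowed`."""
    allowed = sorted(allowed)
    gaps = []
    for lo, hi in ranges:
        cur = lo  # first code point not yet accounted for
        for a_lo, a_hi in allowed:
            if a_hi < cur:
                continue  # allowed range lies entirely before cur
            if a_lo > hi:
                break  # allowed range lies entirely after this range
            if a_lo > cur:
                gaps.append((cur, a_lo - 1))
            cur = a_hi + 1
            if cur > hi:
                break
        if cur <= hi:
            gaps.append((cur, hi))
    return gaps


gaps = subtract(PN_CHARS_BASE, IRI_ALLOWED)
for lo, hi in gaps:
    print(f"U+{lo:04X}..U+{hi:04X}")
```

If the transcription is right, what is left over is the tail of the BMP (U+FFF0 to U+FFFD), the last two code points of each supplementary plane covered by the grammar, and the start of plane 14. Those are the candidates for trimming.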

If we alter PN_CHARS_BASE, won't there be things that can be written in 
RDF/XML but not in the Turtle grammar?  Sure - they may lead to an 
illegal IRI, but that means we already depend on IRI checking to catch 
them.  If that checking is "not enforced", we have IRI strings via 
RDF/XML that can't be written in a similar way in Turtle.

(I'm not averse to a change - including filing a SPARQL erratum - but we 
do have to fit everything together.  SPARQL 1.0 took the character 
ranges because of RDF/XML.)

	Andy

Received on Wednesday, 27 March 2013 08:41:17 UTC