
Re: agenda 12 Feb telecon

From: Eric Prud'hommeaux <eric@w3.org>
Date: Wed, 12 Feb 2014 06:10:20 -0500
To: Guus Schreiber <guus.schreiber@vu.nl>
Cc: RDF WG <public-rdf-wg@w3.org>
Message-ID: <20140212111017.GA21726@w3.org>
* Guus Schreiber <guus.schreiber@vu.nl> [2014-02-11 20:07+0100]
> We will have our penultimate telecon tomorrow. Agenda is at:
> 
>     https://www.w3.org/2011/rdf-wg/wiki/Meetings:Telecon2014.02.12
> 
> Editors: pls check the draft static REC versions [1].

Inspecting Turtle's BNF, I noticed that while addressing DBooth's request for comments describing the numeric characters, I accidentally moved an "EXPONENT)" from the end of DOUBLE to the end of STRING_LITERAL_SINGLE_QUOTE:
[[
[21] 	DOUBLE 	::= 	[+-]? ([0-9]+ '.' [0-9]* EXPONENT | '.' [0-9]+ EXPONENT | [0-9]+
[23] 	STRING_LITERAL_SINGLE_QUOTE 	::= 	"'" ([^#x27#x5C#xA#xD] | ECHAR | UCHAR)* "'" EXPONENT) /* #x27=' #x5C=\ #xA=new line #xD=carriage return */
]] — <https://dvcs.w3.org/hg/rdf/raw-file/c3830fb585f1/rdf-turtle/turtle-bnf.html#grammar-production-DOUBLE>

I think this is pretty non-controversial, so I've fixed it in both the editors' draft and the REC draft in <https://dvcs.w3.org/hg/rdf/rev/53130fe8be3b>.

More interestingly, I noticed that we deviated from SPARQL's definition of strings 20 months ago when a re-gen of the HTML grammar stripped some ()s, going from:

[157s] STRING_LITERAL_LONG1 ::= "'''" (("'" | "''")? ([^'\] | ECHAR | UCHAR))* "'''"
[158s] STRING_LITERAL_LONG2 ::= '"""' (('"' | '""')? ([^"\] | ECHAR | UCHAR))* '"""'
to:
[25]   STRING_LITERAL_LONG1 ::= "'''" (("'" | "''")?  [^'\] | ECHAR | UCHAR) * "'''"
[26]   STRING_LITERAL_LONG2 ::= '"""' (('"' | '""')?  [^"\] | ECHAR | UCHAR) * '"""'
— <https://dvcs.w3.org/hg/rdf/raw-file/b40e79fe8bbc/rdf-turtle/turtle-bnf.html>

In the former language, <s> <p> """ "\u0061 """ . is legal. In the latter it is not, because an embedded quote cannot be followed by ECHAR (e.g. \") or UCHAR (e.g. \u0061); it must be followed by a plain character. Unfortunately, this change predates TriG, so the issue exists there as well.

I looked for tests with long (triple-quoted) strings with one or two quotes followed by a backslash. We have none, but SPARQL does:
data-r2/syntax-sparql1/syntax-lit-17.rq:3:SELECT * WHERE { :x :p '''Long''\\Literal with '\\ single quotes ''' }
data-r2/syntax-sparql1/syntax-lit-20.rq:3:SELECT * WHERE { :x :p """Long""\\Literal with "\\ single quotes""" }

The closest we have is
LITERAL_LONG2_with_1_squote.ttl: <http://a.example/s> <http://a.example/p> """x""y""" .
but the nested ""s can be parsed under either grammar by taking the longer of the two alternatives in ('"' | '""'), so it does not exercise the difference.
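
To make the difference concrete, here is a rough sketch that transcribes both versions of STRING_LITERAL_LONG2 into Python regular expressions and runs them over the strings mentioned above. The names (SPARQL_LONG2, REGEN_LONG2, accepts, SAMPLES) are made up for illustration, and the regexes are my hand translation of the productions, assuming Turtle's ECHAR and UCHAR definitions; this is not how any actual parser is generated.

```python
import re

# Terminals shared by both versions (assumed: Turtle's ECHAR and UCHAR).
ECHAR = r"""\\[tbnrf"'\\]"""
UCHAR = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"
ESC = f"(?:{ECHAR}|{UCHAR})"
CHAR = r'[^"\\]'          # any character except a double quote or backslash
QUOTES = r'(?:"|"")?'     # optional one or two embedded quotes

# SPARQL [158s]: the optional quotes may precede a plain char, ECHAR, or UCHAR.
SPARQL_LONG2 = re.compile(f'"""(?:{QUOTES}(?:{CHAR}|{ESC}))*"""')

# Regenerated [26]: without the inner ()s, the quotes bind to the plain char only.
REGEN_LONG2 = re.compile(f'"""(?:{QUOTES}{CHAR}|{ESC})*"""')


def accepts(pattern, text):
    """Return True if the whole text is one long-string token under pattern."""
    return pattern.fullmatch(text) is not None


# Hypothetical labels; the strings themselves are taken from this message.
SAMPLES = {
    "quote then UCHAR": r'""" "\u0061 """',
    "syntax-lit-20": r'"""Long""\\Literal with "\\ single quotes"""',
    "closest Turtle test": '"""x""y"""',
}

for label, text in SAMPLES.items():
    print(label, accepts(SPARQL_LONG2, text), accepts(REGEN_LONG2, text))
```

Under this transcription, the first two samples are accepted only by the SPARQL form, while """x""y""" is accepted by both, which is consistent with none of our tests catching the change.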

What to do:

I propose the bold step of restoring the SPARQL grammar, noting that it doesn't change any of our test results.


> Guus
> 
> [1] https://www.w3.org/2011/rdf-wg/wiki/Main_Page#REC_drafts
> 

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.
Received on Wednesday, 12 February 2014 11:10:51 UTC
