Re: Turtle quotation idea unicode

On 25 Feb 2012, at 22:28, David Robillard wrote:

> On Fri, 2012-02-24 at 16:11 +0100, Henry Story wrote:
>> I noticed that there are two long quotation mechanisms that do the same thing: """ and '''
>> 
>> Now that unicode tooling is widespread - it was certainly not so 15 years ago - why not make
>> one of those be a pure unicode string? This is something I was thinking would be useful in Scala
>> for regexps for example, where having to decode the number of backslashes makes a nice notation
>> unreadable. I'll suggest it in Scala too. But it seems that this would be useful here too.
> 
> I am confused.  Turtle documents are by definition always UTF-8.  What
> do you propose would be the difference between """ and '''?

STRING_LITERAL1 ::= "'" ( ( [^'\\\n\r] ) | ECHAR | UCHAR )* "'" 
STRING_LITERAL2 ::= '"' ( ( [^\"\\\n\r] ) | ECHAR | UCHAR )* '"' 
ECHAR ::= "\\" [tbnrf\\\"'] 
UCHAR ::= ( "\\u" HEX HEX HEX HEX ) 
        | ( "\\U" HEX HEX HEX HEX HEX HEX HEX HEX ) 
 
Both quoting styles process the same ECHAR and UCHAR escapes. So my proposal would have been something like this:

[89s] STRING_LITERAL_LONG1 ::= "'''" ( ( "'" | "''" )? ( [^'] ) )* "'''" 
[90s] STRING_LITERAL_LONG2 ::= '"""' ( ( '"' | '""' )? ( [^\"\\] | ECHAR | UCHAR ) )* '"""' 

i.e., one of them takes just raw Unicode characters, with no ECHAR or UCHAR escape processing.
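
To make the difference concrete, here is a rough Python sketch of how a lexer might treat the two long forms under this proposal. It is only an illustration: the function name lex_long_string and the regular expressions are my own transcription of the productions above, not taken from any existing Turtle parser.

```python
import re

# Proposed raw form [89s]: no ECHAR/UCHAR, so the content is taken verbatim.
RAW_LONG = re.compile(r"'''((?:(?:'|'')?[^'])*)'''")

# Escaped form [90s]: only ECHAR and UCHAR may follow a backslash.
ESC_LONG = re.compile(
    r'"""((?:(?:"|"")?'
    r'(?:[^"\\]|\\[tbnrf\\"\']|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}))*)"""'
)

# One escape sequence: UCHAR (\uXXXX or \UXXXXXXXX) or ECHAR.
ESCAPE = re.compile(r'\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|[tbnrf\\"\'])')

ECHARS = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
          "\\": "\\", '"': '"', "'": "'"}


def _decode(match):
    """Turn one matched escape sequence into the character it denotes."""
    body = match.group(1)
    if body[0] in "uU":
        return chr(int(body[1:], 16))
    return ECHARS[body]


def lex_long_string(token):
    """Return the string value of a long literal (hypothetical helper)."""
    m = RAW_LONG.fullmatch(token)
    if m:
        return m.group(1)  # raw: backslashes stay exactly as written
    m = ESC_LONG.fullmatch(token)
    if m:
        return ESCAPE.sub(_decode, m.group(1))
    raise ValueError("not a long string literal")
```

With this, the regexp \d+\.\d* can be written as is between ''' quotes, whereas between """ quotes every backslash has to be doubled to get the same value.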

One could use other symbols, of course:

STRING_LITERAL_ULong ::= "❝" ( ( [^❞] ) )* "❞"

where ❝ is \u275d and ❞ is \u275e

> 
> I do however agree that adding ' and ''' with no distinct purpose other
> than to complicate parsers and bloat the grammar was a poor idea and a
> waste of potential.  It came from SPARQL and we are probably stuck with
> it for that reason.


> -dr

Social Web Architect
http://bblfish.net/

Received on Saturday, 25 February 2012 21:43:21 UTC