Re: one more comment on STRING_LITERAL2 Re: review comments of N-Triples in the Turtle document from Zhe Wu on 2012-03-20 (public-rdf-wg@w3.org from March 2012)

From: Zhe Wu <alan.wu@oracle.com>
Date: Tue, 20 Mar 2012 13:34:42 -0700
To: public-rdf-wg@w3.org
Message-ID: <4F68E9E2.3030200@oracle.com>

Hi Andy,

>
>>
>> One small comment. The STRING_LITERAL2 is defined as follows.
>>
>> STRING_LITERAL2   ::= '"' ( ( [^\"\\\n\r] ) | ECHAR | UCHAR )* '"'
>>
>> If I read it correctly, this allows a single quote, among many other
>> things, to be used (as is) inside a pair of double quotes.
>> A user can also put a character of ASCII code 0x12 inside a pair of
>> double quotes.
>
>
> Did you mean 0x22, a double quote?  0x12 is a control character.
>

I actually meant 0x12.

>>
>> Maybe we want to restrict it a little bit?
>
> I think the confusion is editorial not a technical change.  It does not mean to exclude \" (2 characters).
>
> [^\"\\\n\r] should be [^"\#xA#xD]
>
> BNF does not have its own escape character rules.  You need to write the hex for NL and CR.
>
> It actually says you can't put the letters 'n' and 'r' in directly and it excludes \ 5 separate times.  \n is not NL.  It's a '\' and an 'n'.
>

I see. It is actually \ five times :)

I thought that expression was meant to be a regular expression. If interpreted as a regular
expression, it would exclude double quote, backslash, NL, and CR.
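
To make the regular-expression reading concrete, here is a minimal sketch in Python. It assumes Python's `re` syntax as the regex dialect (the thread does not name one), and the sample characters are chosen only for illustration.

```python
import re

# The bracket expression from STRING_LITERAL2, read as a regular expression
# rather than as W3C EBNF. Python's `re` dialect is an assumption here.
char_class = re.compile(r'[^\"\\\n\r]')

# Characters the class rejects under the regex reading.
excluded = [ch for ch in ['"', '\\', '\n', '\r']
            if char_class.fullmatch(ch) is None]

# Characters the class still accepts: a single quote, the plain letters
# 'n' and 'r', and the control character 0x12.
allowed = [ch for ch in ["'", 'n', 'r', '\x12']
           if char_class.fullmatch(ch) is not None]

print([hex(ord(ch)) for ch in excluded])
print([hex(ord(ch)) for ch in allowed])
```

Under the EBNF reading Andy describes, the same text would instead exclude the letters 'n' and 'r', which is why the explicit #xA and #xD forms are needed.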

My question is the same though. Even with your proposed change, we will still
allow a single quote to be used as is in a literal, right?  According to the definition
of ECHAR, we also allow \'.
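
A rough sketch of that point, again in Python: the character class uses explicit code points as in the proposed fix, while the `ECHAR` and `UCHAR` patterns are assumptions modelled on the Turtle grammar (they are not quoted in this thread), and `is_string_literal2` is a hypothetical helper name.

```python
import re

# Assumed definitions, not quoted in this thread; modelled on the Turtle grammar.
ECHAR = r'\\[tbnrf"\'\\]'
UCHAR = r'\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}'

# Character class with explicit code points for NL and CR, as in the proposed fix:
# anything except double quote, backslash, #xA, and #xD.
PLAIN = r'[^"\\\x0A\x0D]'

STRING_LITERAL2 = re.compile(r'"(?:' + PLAIN + '|' + ECHAR + '|' + UCHAR + r')*"')

def is_string_literal2(text):
    """Hypothetical helper: True if the whole text matches STRING_LITERAL2."""
    return STRING_LITERAL2.fullmatch(text) is not None

print(is_string_literal2('"it\'s"'))      # bare single quote inside double quotes
print(is_string_literal2('"it\\\'s"'))    # escaped single quote via ECHAR
print(is_string_literal2('"a\x12b"'))     # control character 0x12 as is
print(is_string_literal2('"a"b"'))        # unescaped double quote in the middle
```

If these assumed definitions are right, both the bare and the escaped single quote are accepted, and so is the 0x12 control character.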

Thanks,

Zhe

>> I am wondering what you
>> think of using a table, similar to the table in 3.2 in the old test spec?
>> We can add a new column for the new N-Triples encoding. That way, users
>> can see the difference side by side.
>>
>> I am not convinced that having text/turtle, and application/ntriples on
>> top of the existing
>> text/plain for the old style encoding is a good thing. What new
>> features are we achieving?
>>
>> Thanks,
>>
>> Zhe
>>
>>
>> On 3/20/2012 11:23 AM, Gavin Carothers wrote:
>>> On Tue, Mar 20, 2012 at 11:05 AM, Zhe Wu <alan.wu@oracle.com> wrote:
>>>> Hi Gavin,
>>>>
>>>> Please see my comments inline.
>>>>
>>>>
>>>>>> - Replace
>>>>>>           "N-Triples may also be provided as text/plain. When used in this
>>>>>> way N-Triples must
>>>>>>           use the escaped form of any character outside US-ASCII"
>>>>>>    with
>>>>>>           "When encoded using US-ASCII as specified in section 3 [REF1],
>>>>>> N-Triples should
>>>>>>            be provided as text/plain."
>>>>> This isn't exactly true. There is nothing wrong with encoding an
>>>>> N-Triples file using US-ASCII and serving as application/ntriples. The
>>>>> relationship goes the other direction. If you want to provide
>>>>> text/plain N-Triples you MUST use US-ASCII. If you want to provide
>>>>> US-ASCII you can use either text/plain, text/turtle, or
>>>>> application/ntriples.
>>>>>
>>>> I guess my question really is what do we gain from encoding using US-ASCII
>>>> and serving
>>>> as application/ntriples?
>>> The same bytes can be served as application/ntriples, text/turtle, and
>>> text/plain and have exactly the same meaning. This is a good thing,
>>> UTF-8 is awesome like that.
>>>
>>>>
>>>>
>>>>>> - Add the following to the end of "See N-Triples Media Type for the media
>>>>>> type registration form."
>>>>>>
>>>>>>    For maximum backward compatibility, users or applications may want to
>>>>>> choose US-ASCII
>>>>>>    encoding to serialize N-Triples.
>>>>> I don't think we should recommend providing any format in US-ASCII over
>>>>> UTF-8.
>>>>>
>>>> I don't think that sentence truly recommends US-ASCII over UTF-8.  It is
>>>> important, in my opinion,
>>>> for us to point out non-trivial consequences caused by the changes we
>>>> propose.
>>>>
>>>> Assume a user serializes using UTF-8 encoding for non-ASCII characters and
>>>> the
>>>> new \ encoding for ', \b, and \f. Such a serialization will not work
>>>> with some of the existing tools, rapper 2-1.9.0 for example.
>>>>
>>>> The proposed new sentence simply makes clear one important consequence.
>>> Okay, I think I agree. I'm not sure of the exact phrasing, but expanding the
>>> differences section seems like a good idea.
>>>
>>> Thanks very much for the feedback. I'll see if I can get some or all
>>> of it into the document before the next meeting.
>>>
>>> --Gavin
>>>
>>
>
Received on Tuesday, 20 March 2012 20:35:13 UTC