Re: one more comment on STRING_LITERAL2 Re: review comments of N-Triples in the Turtle document from Andy Seaborne on 2012-03-20 (public-rdf-wg@w3.org from March 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Tue, 20 Mar 2012 19:47:20 +0000
To: public-rdf-wg@w3.org
Message-ID: <4F68DEC8.5000607@epimorphics.com>

On 20/03/12 18:53, Zhe Wu wrote:
> Hi Gavin,
>
> Thanks very much for your quick response!
>
> One small comment. The STRING_LITERAL2 is defined as follows.
>
> STRING_LITERAL2   ::= '"' ( ( [^\"\\\n\r] ) | ECHAR | UCHAR )* '"'
>
> If I read it correctly, this allows a single quote, among many other
> things, to be used (as is) inside a pair of double quotes.
> A user can also put a character of ASCII code 0x12 inside a pair of
> double quotes.

Did you mean 0x22, a double quote?  0x12 is a control character.

>
> Maybe we want to restrict it a little bit?

I think the confusion is editorial, not a technical change.  The rule is
not meant to exclude \" (2 characters).

[^\"\\\n\r] should be [^"\#xA#xD]

BNF does not have its own escape character rules.  You need to write
the hex for NL and CR.

It actually says you can't put the letters 'n' and 'r' in directly and 
it excludes \ 5 separate times.  \n is not NL.  It's '\' and a 'n'.
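
To make the difference concrete, here is a rough Python sketch (not part
of the grammar or any spec text; the names AS_WRITTEN and body_ok are
made up for illustration).  It treats the class from the draft as plain
characters, the way EBNF reads it, and compares that with the four
characters the rule is presumably meant to exclude.  It ignores the
ECHAR and UCHAR alternatives and only looks at the character class.

```python
# The characters between "[^" and "]" in the draft, taken literally.
# EBNF has no backslash escapes, so each character stands for itself.
AS_WRITTEN = r'\"\\\n\r'

# Literal reading: the set of characters the class excludes as written.
literal_excluded = set(AS_WRITTEN)

# Assumed intent: double quote, backslash, LF (#xA) and CR (#xD).
intended_excluded = {'"', '\\', '\n', '\r'}


def body_ok(body, excluded):
    """True if every character of body is outside the excluded set."""
    return all(ch not in excluded for ch in body)


if __name__ == "__main__":
    print("backslashes in the class:", AS_WRITTEN.count("\\"))
    print("excluded as written:", sorted(literal_excluded))
    print("excluded as intended:", sorted(intended_excluded))
    for sample in ["turn", "a\nb"]:
        print(repr(sample),
              "as written:", body_ok(sample, literal_excluded),
              "intended:", body_ok(sample, intended_excluded))
```

Under the literal reading a string body such as "turn" is rejected
because it contains 'r' and 'n', while a raw line feed gets through; the
intended class does the opposite.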

> I am wondering what do you
> think of using a table, similar to the table in 3.2 in the old test spec?
> We can add a new column for the new N-Triple encoding. That way, users
> can see the difference side by side.
>
> I am not convinced that having text/turtle, and application/ntriples on
> top of the existing
> text/plain for the old style encoding is a good thing. What new
> features are we achieving?
>
> Thanks,
>
> Zhe
>
>
> On 3/20/2012 11:23 AM, Gavin Carothers wrote:
>> On Tue, Mar 20, 2012 at 11:05 AM, Zhe Wu<alan.wu@oracle.com>  wrote:
>>> Hi Gavin,
>>>
>>> Please see my comments inline.
>>>
>>>
>>>>> - Replace
>>>>>           "N-Triples may also be provided as text/plain. When used in this
>>>>> way N-Triples must
>>>>>           use the escaped form of any character outside US-ASCII"
>>>>>    with
>>>>>           "When encoded using US-ASCII as specified in section 3 [REF1],
>>>>> N-Triples should
>>>>>            be provided as text/plain."
>>>> This isn't exactly true. There is nothing wrong with encoding an
>>>> N-Triples file using US-ASCII and serving as application/ntriples. The
>>>> relationship goes the other direction. If you want to provide
>>>> text/plain N-Triples you MUST use US-ASCII. If you want to provide
>>>> US-ASCII you can use either text/plain, text/turtle, or
>>>> application/ntriples.
>>>>
>>> I guess my question really is what do we gain from encoding using US-ASCII
>>> and serving
>>> as application/ntriples?
>> The same bytes can be served as application/ntriples, text/turtle, and
>> text/plain and have exactly the same meaning. This is a good thing,
>> UTF-8 is awesome like that.
>>
>>>
>>>
>>>>> - Add the following to the end of "See N-Triples Media Type for the media
>>>>> type registration form."
>>>>>
>>>>>    For maximum backward compatibility, users or applications may want to
>>>>> choose US-ASCII
>>>>>    encoding to serialize N-Triples.
>>>> I don't think we should recommend providing any format in US-ASCII over
>>>> UTF-8.
>>>>
>>> I don't think that sentence truly recommends US-ASCII over UTF-8.  It is
>>> important, in my opinion,
>>> for us to point out non-trivial consequences caused by the changes we
>>> propose.
>>>
>>> Assume a user serializes using UTF-8 encoding for non ASCII characters and
>>> the
>>> new \ encoding for ', \b, and \f. Such a serialization will not work
>>> with some of the existing tools, rapper 2-1.9.0 for example.
>>>
>>> The proposed new sentence simply makes clear one important consequence.
>> Okay, I think I agree not sure on the exact phrasing but expanding the
>> differences section seems like a good idea.
>>
>> Thanks very much for the feedback, I'll see if I can get some or all
>> of it in to the document before the next meeting.
>>
>> --Gavin
>>
>
Received on Tuesday, 20 March 2012 19:47:53 UTC