Re: review comments of N-Triples in the Turtle document from Gavin Carothers on 2012-03-20 (public-rdf-wg@w3.org from March 2012)

From: Gavin Carothers <gavin@carothers.name>
Date: Tue, 20 Mar 2012 08:04:01 -0700
To: Zhe Wu <alan.wu@oracle.com>
Cc: public-rdf-wg@w3.org
Message-ID: <CAPqY83yO3YFS86paL42p_bvqXevGsRZ7NURuGf6M-KVBcnijSQ@mail.gmail.com>

On Mon, Mar 19, 2012 at 9:25 PM, Zhe Wu <alan.wu@oracle.com> wrote:
> Hi,
>
> Here it comes. Overall, it is very well written. Kudos to the editors.
>
> Thanks,
>
> Zhe
>
> ------------------------------------------------------
>
> - "Default encoding is UTF-8 rather than US-ASCII only"
>   This reads a bit strange because we say at the beginning that "The content
> encoding of
>   N-Triples is always UTF-8"
>
>   Suggested change: "Character encoding is UTF-8 rather than US-ASCII only"
>
>  (Note: thanks to Dan from Oracle who pointed this out.)

Sounds fine.

>
>
> - Replace
>          "N-Triples may also be provided as text/plain. When used in this
> way N-Triples must
>          use the escaped form of any character outside US-ASCII"
>   with
>          "When encoded using US-ASCII as specified in section 3 [REF1],
> N-Triples should
>           be provided as text/plain."

This isn't exactly true. There is nothing wrong with encoding an
N-Triples file using US-ASCII and serving it as application/ntriples.
The requirement runs in the other direction. If you want to provide
text/plain N-Triples you MUST use US-ASCII. If you want to provide
US-ASCII you can use any of text/plain, text/turtle, or
application/ntriples.
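
A rough sketch of that relationship, in case it helps with the wording.
The function names are made up for illustration, and the escaping helper
assumes that non-ASCII characters only occur inside IRIs and literals,
where the \uXXXX and \UXXXXXXXX forms are allowed:

```python
def allowed_media_types(data):
    """Media types this N-Triples payload (bytes) could be served as."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        # Non-ASCII bytes present, so text/plain is ruled out.
        return ["text/turtle", "application/ntriples"]
    # Pure US-ASCII: any of the three works.
    return ["text/plain", "text/turtle", "application/ntriples"]


def escape_non_ascii(text):
    """Rewrite characters outside US-ASCII using \\u / \\U escapes."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)
        else:
            out.append("\\U%08X" % cp)
    return "".join(out)
```

Running escape_non_ascii over a document before encoding it gives a
payload for which allowed_media_types includes text/plain.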

>
>
> - Add the following to the end of "See N-Triples Media Type for the media
> type registration form."
>
>   For maximum backward compatibility, users or applications may want to
> choose US-ASCII
>   encoding to serialize N-Triples.

I don't think we should recommend providing any format in US-ASCII over UTF-8.

>
>
> - The language in [REF1] does not cover [A-Z] while the new grammar supports
> upper case.
>    Is this necessary?

Yes. Script subtags are mixed case and region subtags are uppercase.
Some text explaining normalization for the test case format vs. dump
format could be added, or we can simply reference the older N-Triples
definition in the test cases document [REF1].

For example: az-Latn-IR
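
To make the difference concrete, here is a small sketch comparing the
two language tag patterns against the az-Latn-IR tag. The regular
expressions are my reading of the two grammars and the names are only
for illustration; lowercasing is one possible normalization, not
something the draft currently specifies:

```python
import re

# Lower case only, as I read the older test cases grammar [REF1].
OLD_LANGTAG = re.compile(r"[a-z]+(-[a-z0-9]+)*")
# Upper and lower case, as I read the new grammar.
NEW_LANGTAG = re.compile(r"[a-zA-Z]+(-[a-zA-Z0-9]+)*")


def matches(pattern, tag):
    """True if the whole tag matches the given pattern."""
    return pattern.fullmatch(tag) is not None


tag = "az-Latn-IR"
print(matches(OLD_LANGTAG, tag))          # False
print(matches(NEW_LANGTAG, tag))          # True
print(matches(OLD_LANGTAG, tag.lower()))  # True
```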

>
>
> - For some specific characters (within ASCII range), [REF1] uses the
> following encoding:
>   \t   \n   \r   \"   \\
>
>    The new grammar seems to cover more. In particular,  \b  \f  \'  are
> added.
>    A consequence of this change is that a previously illegal syntax, like
> the
>    triple below, is now legal.
>
>    <urn:s> <http://abc.com/p> "aa\'b" .
>
>   It is important to have some text clearly explaining this new behavior.

Sure.
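
As a starting point for that text, a sketch of how the two escape sets
differ when unescaping a literal body. The table and function names are
hypothetical, and only the single-character escapes are handled here
(no \u or \U forms):

```python
# Single-character escapes listed for [REF1].
OLD_ESCAPES = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}
# The new grammar adds \b, \f and \'.
NEW_ESCAPES = {**OLD_ESCAPES, "b": "\b", "f": "\f", "'": "'"}


def unescape(body, table):
    """Unescape a literal body, rejecting escapes missing from table."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body) or body[i + 1] not in table:
                raise ValueError("illegal escape at offset %d" % i)
            out.append(table[body[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With NEW_ESCAPES the body aa\'b from the triple above unescapes to
aa'b, while with OLD_ESCAPES the same input raises ValueError.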

>
>
> [REF1] http://www.w3.org/TR/rdf-testcases/#ntriples
>
>
>

Received on Tuesday, 20 March 2012 15:04:39 UTC