
RE: Final ping on turtle i18n issues

From: Phillips, Addison <addison@lab126.com>
Date: Mon, 10 Dec 2012 08:41:47 -0800
To: Sandro Hawke <sandro@w3.org>, "www-international@w3.org" <www-international@w3.org>
CC: "Eric Prud'hommeaux" <eric@w3.org>, RDF WG <team-rdf-chairs@w3.org>
Message-ID: <131F80DEA635F044946897AFDA9AC34773A90CC2BE@EX-SEA31-D.ant.amazon.com>

Hello Sandro,

Thanks for this ping. I regret that I didn't see all of these messages at the time. We responded to a few of them, but the bulk of your responses appears to have arrived while I was on vacation, so this is, alas, my first look at them. The complete list of our comments is at [1].

In reviewing your responses, I have the following comments, which are based on the assumption that your current editor's draft is:

   http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html 


I18N-ISSUE-178: I'm *not* satisfied with the resolution of this issue.

In Section 6.5 you added a direct link to the IANA Language Subtag Registry, calling the contents "registered language tags". This is incorrect: the registry contains language subtags, which are used to form language tags. No reference is made to BCP 47. TURTLE effectively does not specify any standard for language tags and does not require validity or even "well-formedness" of language tags. Please address this issue by adding an explicit reference to BCP 47 (preferably a normative reference).
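
To make "well-formed" concrete, here is a rough Python sketch. It assumes a simplified reading of the `langtag` and `privateuse` ABNF in RFC 5646 (the current RFC behind BCP 47), leaves out the grandfathered tags, and does not check validity against the registry. The names WELL_FORMED and is_well_formed are hypothetical, chosen only for this illustration.

```python
import re

# Simplified sketch of the RFC 5646 ABNF (grandfathered tags omitted).
# This checks well-formedness only, NOT validity against the registry.
WELL_FORMED = re.compile(
    r"""
    (?:
        (?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4}|[a-z]{5,8})  # language (+ extlang)
        (?:-[a-z]{4})?                                        # script
        (?:-(?:[a-z]{2}|[0-9]{3}))?                           # region
        (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*              # variants
        (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*                  # extensions
        (?:-x(?:-[a-z0-9]{1,8})+)?                            # private use
      |
        x(?:-[a-z0-9]{1,8})+                                  # private-use-only tag
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_well_formed(tag: str) -> bool:
    """Return True if the tag fits the (simplified) BCP 47 syntax."""
    return WELL_FORMED.fullmatch(tag) is not None
```

Under this sketch, "en-US", "zh-Hant-TW", and "de-1996" are well-formed, while "en--US" and "e" are not. Note that "prefix" and "base" are also well-formed, since they have the shape of a primary language subtag, even though neither is registered. That gap is why BCP 47 distinguishes well-formedness from validity.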

The link to the language subtag registry, please note, occurs in this sentence:

   The strings @prefix and @base match the pattern for LANGTAG, though neither "prefix" nor "base" are registered language tags.

Shouldn't the EBNF include the magic strings "prefix" and "base" in the LANGTAG production? While that production "permits" these strings, it is customary to call out reserved values.
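
To illustrate the overlap, a minimal Python sketch follows. It assumes the LANGTAG production has the form '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*, which is how I read the draft. The names LANGTAG, RESERVED, matches_langtag, and is_reserved_keyword are hypothetical and exist only for this example.

```python
import re

# Assumed shape of the Turtle LANGTAG production:
#   '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
LANGTAG = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")

# Hypothetical set of directive keywords that collide with the pattern.
RESERVED = {"prefix", "base"}


def matches_langtag(token: str) -> bool:
    """True if the whole token matches the assumed LANGTAG pattern."""
    return LANGTAG.fullmatch(token) is not None


def is_reserved_keyword(token: str) -> bool:
    """True if the token looks like a LANGTAG but is really a directive."""
    return matches_langtag(token) and token[1:] in RESERVED
```

With these definitions, matches_langtag("@prefix") and matches_langtag("@base") are both true, which is the ambiguity the sentence above describes. Calling out the reserved values explicitly, as is_reserved_keyword does, is one way to make the distinction visible to implementers.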

I18N-ISSUE-180:  you have added the Unicode references requested, but these take the wrong form. For example, at #turtle-literals in your draft, you say:

      Literals delimited by ' (U+27), may not contain the characters ', LF (U+0A), or CR (U+0D).

Unicode code point references in U+ notation should always use at least four hex digits (U+0027, U+000D, etc.). Please search your document for "U+" and fix each one (this is a quick job).
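
The fix is mechanical. As a small sketch in Python (the function names u_plus and normalize_u_plus are hypothetical), the first helper formats a character in the conventional notation and the second pads any short U+ reference found in a piece of text:

```python
import re

# Matches "U+" followed by one or more hex digits.
U_PLUS = re.compile(r"U\+([0-9A-Fa-f]+)")


def u_plus(ch: str) -> str:
    """Format one character as U+XXXX, zero-padded to at least four digits."""
    return f"U+{ord(ch):04X}"


def normalize_u_plus(text: str) -> str:
    """Rewrite every U+ reference in text to use at least four hex digits."""
    return U_PLUS.sub(lambda m: f"U+{int(m.group(1), 16):04X}", text)
```

For example, normalize_u_plus applied to the sentence quoted above turns U+27, U+0A, and U+0D into U+0027, U+000A, and U+000D. Code points beyond the Basic Multilingual Plane keep their five or six digits.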

I18N-ISSUE-183: After your proposal, there were several comments from I18N and from other people. You didn't change the example so that the data types would be a good example. I think we're not satisfied with this one, although I would class this as an editorial comment and recognize that the point of the example is to show the different data types, not to model any real-life data. Still... bad examples like this are the bane of my existence.

I18N-ISSUE-184: See issue 180 (above). The format of your U+ syntax is invalid.

I18N-ISSUE-187: We're okay with \u and \U syntaxes, but you didn't address part of our comment, which is that the \u syntax doesn't address surrogate pair handling. You might do so by saying "Unicode character" instead of "Unicode codepoint" (not all code points represent characters).
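
To show what is at stake, here is a hedged Python sketch of one possible policy: decode \uXXXX and \UXXXXXXXX escapes, but reject any escape that names a surrogate code point (U+D800 through U+DFFF), since those are not characters. Rejecting is my assumption here, not something the draft says, and the names ESCAPE, is_surrogate, and decode_escapes are hypothetical.

```python
import re

# \u followed by 4 hex digits, or \U followed by 8 hex digits.
ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def is_surrogate(cp: int) -> bool:
    """Surrogate code points are reserved for UTF-16 and are not characters."""
    return 0xD800 <= cp <= 0xDFFF


def decode_escapes(text: str) -> str:
    """Replace numeric escapes with characters, rejecting surrogates.

    Assumed policy for illustration: an escape naming a surrogate is an error.
    """
    def repl(match: re.Match) -> str:
        cp = int(match.group(1) or match.group(2), 16)
        if is_surrogate(cp):
            raise ValueError(f"U+{cp:04X} is a surrogate code point, not a character")
        if cp > 0x10FFFF:
            raise ValueError(f"{cp:#x} is outside the Unicode code space")
        return chr(cp)

    return ESCAPE.sub(repl, text)
```

Under this policy, a supplementary character must be written with the \U form (for example \U0001F600) rather than as the pair \uD83D\uDE00, which would raise an error. A specification could instead choose to combine such pairs, but it needs to say which.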

I18N-ISSUE-188: I'm satisfied by Eric's response.

I18N-ISSUE-189: We requested that you incorporate the obs-language-tag production directly or by reference. I'm satisfied with your reasons for not modifying the EBNF, but not with the text that describes the handling of language tags (see issue 178 above).

I18N-ISSUE-190: Eric explained that PN_CHARS_BASE was derived from [2] and "presumably leveraged the wisdom that went into XML identifiers". However, the XML reference is incomplete in "erasing" combining marks (which was, in fact, the purpose) and it was created a Long Time Ago. The additional complexity that having this production introduces to TURTLE is probably unnecessary. However, I have no objection to keeping things as they are, as EBNF contains no ready means of doing anything better and it doesn't hurt anything to keep some combining marks from being used badly.

===

We have a small timing problem, given that our next WG teleconference is scheduled for 10 a.m. EST. Internationalization Working Group members should comment on-list if they feel my responses above are not consistent with working group consensus. If your WG has concerns with any of the above, it seems that our teleconferences are at the same time. Perhaps we can resolve things "live"?

Thanks,

Addison

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.



[1] http://www.w3.org/International/track/products/34 
[2] http://www.w3.org/TR/2008/REC-xml-20081126/#NT-NameStartChar 

> -----Original Message-----
> From: Sandro Hawke [mailto:sandro@w3.org]
> Sent: Friday, December 07, 2012 8:45 AM
> To: www-international@w3.org
> Cc: Eric Prud'hommeaux; RDF WG
> Subject: Final ping on turtle i18n issues
> 
> We still haven't heard back from you on our proposals [1] for addressing your
> review comments [2] on the LC WD of Turtle.  It's been almost two months
> since our responses to your comments, and we need to move
> forward.   If we don't hear from you by 10am ET on Wed 12 Dec, we'll
> assume our responses are satisfactory.  If they're not, please let us know ASAP,
> since we'd like to resolve things and move forward at our 12 Dec meeting.
> 
>        -- Sandro
> 
> [1] everything by Gavin Carothers or Eric Prud'hommeaux in
> http://lists.w3.org/Archives/Public/www-international/2012OctDec/author.html

> [2] http://www.w3.org/International/track/products/34

Received on Monday, 10 December 2012 16:42:39 GMT
