
Re: Oracle's stand regarding N-TRIPLES

From: Jeremy Carroll <jeremy@topquadrant.com>
Date: Tue, 23 Aug 2011 09:18:44 -0700
Message-ID: <4E53D2E4.2050907@topquadrant.com>
To: public-rdf-wg@w3.org

On 8/19/2011 6:34 PM, Zhe Wu wrote:
> I don't see how adding UTF8 encoding can make N-TRIPLES much more useful.

Dear Zhe

The simple answer is that several groups of experts on making the
internet work worldwide have considered the general problem for many
years and come up with an answer that almost everyone seems happy enough
with.

Please have your manager and your AC rep read
http://www.w3.org/TR/charmod/#sec-Background

and RFC 2277


_*Charmod*_
The choice of Unicode was motivated by the fact that Unicode:

  * is the only universal character repertoire available,
  * provides a way of referencing characters independent of the encoding
    of the text,
  * is being updated/completed carefully,
  * is widely accepted and implemented by industry.

Characters outside the US-ASCII [ISO/IEC 646]
<http://www.w3.org/TR/charmod/#iso646> [MIME-charset]
<http://www.w3.org/TR/charmod/#MIME-charset> repertoire are being used
in more and more places.

With the international Internet follows an absolute requirement to 
interchange data in a multiplicity of languages, which in turn utilize a 
bewildering number of characters.

_*RFC 2277*_

Internationalization is for humans. This means that protocols are not
subject to internationalization; text strings are. Where protocol
elements look like text tokens, such as in many IETF application layer
protocols, protocols MUST specify which parts are protocol and which are
text. [WR 2.2.1.1]

Names are a problem, because people feel strongly about them, many of
them are mostly for local usage, and all of them tend to leak out of the
local context at times. RFC 1958 [RFC 1958] recommends US-ASCII for all
globally visible names.

This document does not mandate a policy on name internationalization,
but requires that all protocols describe whether names are
internationalized or US-ASCII.

***

Jeremy's note: in RDF the names are explicitly IRIs, i.e. internationalized.

_*RFC 2277*_
Protocols MUST be able to use the UTF-8 charset
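
To make the practical difference concrete, here is a minimal sketch of
the same triple written two ways: with non-ASCII characters escaped as
\uXXXX / \UXXXXXXXX (as the ASCII-only N-Triples of the 2004 RDF Test
Cases requires), and as plain UTF-8. The helper name escape_ascii, the
example.org IRI and the label are invented for illustration, and the
sketch deliberately ignores the escaping of quotes, backslashes and
control characters.

```python
def escape_ascii(text: str) -> str:
    """Replace every non-ASCII character with a \\uXXXX or \\UXXXXXXXX escape.

    Simplification (assumption): quotes, backslashes and control
    characters are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)     # Basic Multilingual Plane
        else:
            out.append("\\U%08X" % cp)     # supplementary planes
    return "".join(out)


# Hypothetical triple, invented for this example.
TEMPLATE = '<http://example.org/tokyo> <http://www.w3.org/2000/01/rdf-schema#label> "%s" .'
label = "東京"

triple_utf8 = TEMPLATE % label                 # readable by a human as-is
triple_ascii = TEMPLATE % escape_ascii(label)  # pure ASCII, label is opaque

if __name__ == "__main__":
    print(triple_utf8)
    print(triple_ascii)
    # The UTF-8 form is serialized to bytes with the UTF-8 charset.
    print(len(triple_utf8.encode("utf-8")), "bytes as UTF-8")
    print(len(triple_ascii.encode("ascii")), "bytes as escaped ASCII")
```

Both lines carry the same information to a parser; the difference is
that the UTF-8 form stays legible to the people who read and write the
language in question.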


****



Zhe - I currently believe Oracle is threatening a formal objection if this
WG follows mandated practice from IETF and W3C policy documents.
Is this the intent?

Jeremy
Received on Tuesday, 23 August 2011 16:19:03 GMT
