Re: rdfms-literals-as-resources

From: Aaron Swartz <aswartz@upclink.com>
Date: Sun, 5 Aug 2001 12:07:57 -0500
Message-Id: <200108051710.f75HAcR24297@theinfo.org>
Cc: www-rdf-comments@w3.org, www-i18n-comments@w3.org
To: Devon Smith <devon@taller.pscl.cwru.edu>

On Wednesday, August 1, 2001, at 03:41 PM, Devon Smith wrote:

> Another concern is how strings encoded
> in UTF-8, UTF-16 and other non-ascii, non-latin encodings would be
> dealt with.

I'd suggest that we use what the RDF spec says as a starting 
point, namely:

"""Note: Although non-ASCII characters in URIs are not allowed 
by [URI], [XML] specifies a convention to avoid unnecessary 
incompatibilities in extended URI syntax. Implementors of RDF 
are encouraged to avoid further incompatibility and use the XML 
convention for system identifiers. Namely, that a non-ASCII 
character in a URI be represented in UTF-8 as one or more bytes, 
and then these bytes be escaped with the URI escaping mechanism 
(i.e., by converting each byte to %HH, where HH is the 
hexadecimal notation of the byte value)."""

The i18n Working Draft makes a similar suggestion: http://www.w3.org/TR/charmod/#sec-URIs
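
As a rough illustration of the escaping convention quoted above, here 
is a minimal Python sketch. The function name escape_non_ascii and the 
example.org URI are my own placeholders, not anything defined by the 
RDF or XML specs, and the sketch assumes the URI has already been 
decoded into a sequence of characters:

```python
def escape_non_ascii(uri: str) -> str:
    """Percent-escape each non-ASCII character in `uri` via its UTF-8 bytes.

    Hypothetical helper for illustration only; ASCII characters are left
    untouched, including any that are already part of a %HH escape.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            # ASCII characters pass through unchanged.
            out.append(ch)
        else:
            # Represent the character in UTF-8 as one or more bytes,
            # then escape each byte as %HH (uppercase hex).
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


# U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8.
print(escape_non_ascii("http://example.org/caf\u00e9"))
# prints http://example.org/caf%C3%A9
```

Note that this only covers the non-ASCII case the spec text describes; 
it does not escape reserved ASCII characters such as spaces.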

I'm not sure what to do about UTF-16 characters, so I'm CCing 
the i18n list for suggestions.

--
       "Aaron Swartz"      |           Blogspace
  <mailto:me@aaronsw.com>  |  <http://blogspace.com/about/>
<http://www.aaronsw.com/> |     weaving the two-way web
Received on Sunday, 5 August 2001 13:08:06 GMT
