Re: rdfms-literals-as-resources

On Wednesday, August 1, 2001, at 03:41 PM, Devon Smith wrote:

> Another concern is how strings encoded
> in UTF-8, UTF-16 and other non-ascii, non-latin encodings would be
> dealt with.

I'd suggest that we use what the RDF spec says as a starting 
point, namely:

"""Note: Although non-ASCII characters in URIs are not allowed 
by [URI], [XML] specifies a convention to avoid unnecessary 
incompatibilities in extended URI syntax. Implementors of RDF 
are encouraged to avoid further incompatibility and use the XML 
convention for system identifiers. Namely, that a non-ASCII 
character in a URI be represented in UTF-8 as one or more bytes, 
and then these bytes be escaped with the URI escaping mechanism 
(i.e., by converting each byte to %HH, where HH is the 
hexadecimal notation of the byte value)."""

The i18n WD suggests similarly: http://www.w3.org/TR/charmod/#sec-URIs
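
To make the convention concrete, here's a rough sketch in Python of 
the escaping the spec describes. The function name is my own 
invention (it isn't from either spec), and it assumes the string 
has already been decoded into Unicode characters:

```python
def escape_non_ascii(uri: str) -> str:
    """Escape each non-ASCII character as its UTF-8 bytes, one %HH per byte.

    Hypothetical helper for illustration; ASCII characters are left alone.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            # ASCII characters pass through unchanged.
            out.append(ch)
        else:
            # Encode the character as UTF-8, then escape every byte as %HH.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
    return "".join(out)


print(escape_non_ascii("http://example.org/caf\u00e9"))
# prints http://example.org/caf%C3%A9
```

Note that this only covers the escaping step. It says nothing about 
how text that arrives in UTF-16 or another encoding should be 
handled before it gets this far.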

I'm not sure what to do about UTF-16 characters, so I'm CCing 
the i18n list for suggestions.

--
       "Aaron Swartz"      |           Blogspace
  <mailto:me@aaronsw.com>  |  <http://blogspace.com/about/>
<http://www.aaronsw.com/> |     weaving the two-way web

Received on Sunday, 5 August 2001 13:08:06 UTC