Re: Allowing \u escaped surrogate pairs

Yes, that's part of the weirdness of the request.  Turtle documents are 
encoded in UTF-8.  RDF literals are Unicode character strings.  UTF-16 doesn't 
appear at all.  The request is to require Turtle processors, which may not use 
UTF-16 anywhere, to recognize UTF-16 surrogate pairs and turn them into 
Unicode characters.

So the requirement is not that Turtle processors use UTF-16, just that they 
recognize UTF-16 surrogate pairs as an alternative escape for Unicode 
characters that are not in the BMP.  That is weird because Turtle already has 
a better escape mechanism for these characters: the eight-digit \U form.
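
For concreteness, here is a rough sketch in Python of what recognizing a pair 
would involve.  It is not taken from any Turtle processor; the function name 
combine_surrogates is made up for illustration.  It only shows the standard 
UTF-16 arithmetic that maps a high/low surrogate pair to a single code point, 
so that \uD83D\uDE00 would denote the same character as \U0001F600.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point.

    Hypothetical helper for illustration; not part of any Turtle library.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits; the result is offset past the BMP.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The pair \uD83D\uDE00 and the existing escape \U0001F600 name one character.
code_point = combine_surrogates(0xD83D, 0xDE00)
same_character = chr(code_point) == "\U0001F600"
```

Note that nothing here needs the processor to store or emit UTF-16; the pair 
is treated purely as a second spelling of a code point above U+FFFF.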

peter

On 4/30/26 5:18 AM, James Anderson wrote:
> the notion of surrogate pairs exists for utf-16 only.
> 
>> On 30. Apr 2026, at 00:56, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>
>> I don't believe that this is at all what is being decided or even discussed.
>>
>> peter
>>
>> On 4/29/26 5:24 PM, James Anderson wrote:
>>> is the working group's intent to retain the restriction, that documents be encoded as utf-8 or to relax that restriction to permit utf-16?
>>> ---
>>> james anderson | james@dydra.com | https://dydra.com
>>
>>
> 
> ---
> james anderson | james@dydra.com | https://dydra.com
> 
> 
> 

Received on Thursday, 30 April 2026 10:20:03 UTC