- From: James Anderson <anderson.james.1955@gmail.com>
- Date: Thu, 30 Apr 2026 12:26:11 +0200
- To: RDF-star Working Group <public-rdf-star-wg@w3.org>
i try to keep my model of program behaviours simple.
surrogate pairs appear as an element of the utf-16 encoding only.

"just to recognize UTF-16 surrogate pairs as an alternative escape for Unicode characters that are not in the BMP"

is to support an aspect of utf-16 which distinguishes it from the other encodings, which is to permit utf-16.
utf-8 provides a different encoding for those which surrogate pairs would encode.
that is what the recommendation should require.

best regards, from berlin,

> On 30. Apr 2026, at 12:19, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>
> Yes, that's part of the weirdness of the request. Turtle is UTF-8. RDF literals are Unicode character strings. UTF-16 doesn't appear at all. The request is to require Turtle processors, which may not use UTF-16 anywhere, to recognize UTF-16 surrogate pairs and turn them into Unicode characters.
>
> So it's not a requirement for Turtle processors to use UTF-16, just to recognize UTF-16 surrogate pairs as an alternative escape for Unicode characters that are not in the BMP. Which is weird because Turtle already has a better escape mechanism for these characters.
>
> peter
>
> On 4/30/26 5:18 AM, James Anderson wrote:
>> the notion of surrogate pairs exists for utf-16 only.
>>> On 30. Apr 2026, at 00:56, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>>
>>> I don't believe that this is at all what is being decided or even discussed.
>>>
>>> peter
>>>
>>> On 4/29/26 5:24 PM, James Anderson wrote:
>>>> is the working group's intent to retain the restriction, that documents be encoded as utf-8 or to relax that restriction to permit utf-16?
>>>> ---
>>>> james anderson | james@dydra.com | https://dydra.com
>>>
>>>
>> ---
>> james anderson | james@dydra.com | https://dydra.com
>
>
---
james anderson | james@dydra.com | https://dydra.com
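
for illustration only, a minimal python sketch of the distinction drawn above: the same non-BMP character is carried as a surrogate pair in utf-16 but as a single four-byte sequence in utf-8. the character U+1F600 and the helper name combine_surrogates are assumptions chosen for the example; neither appears in the thread or in any specification text.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Map a UTF-16 high/low surrogate pair to the code point it encodes.

    Hypothetical helper for illustration; not taken from any Turtle processor.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


ch = "\U0001F600"  # an arbitrary character outside the BMP

# UTF-16 (big-endian, no BOM) needs two 16-bit code units: a surrogate pair.
utf16 = ch.encode("utf-16-be")
high = int.from_bytes(utf16[:2], "big")
low = int.from_bytes(utf16[2:], "big")

# UTF-8 encodes the same code point directly, with no surrogates involved.
utf8 = ch.encode("utf-8")

# Recombining the pair yields the original code point.
code_point = combine_surrogates(high, low)

print(f"utf-16 code units: {high:04X} {low:04X}")
print(f"utf-8 bytes:       {utf8.hex(' ').upper()}")
print(f"code point:        U+{code_point:X}")
```

a processor that reads only utf-8 never encounters the surrogate values in its input; they would arise only if an escape syntax spelled them out explicitly.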
Received on Thursday, 30 April 2026 10:26:30 UTC