- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Tue, 28 Apr 2026 13:03:04 -0400
- To: "ddooss@wp.pl" <ddooss@wp.pl>, "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>
I'm for not allowing surrogates at all, keeping the situation unchanged. My view is that software that allowed surrogates was non-compliant and should remain non-compliant.

Adding a test for "correct" surrogate pairs is optional.

peter

On 4/28/26 9:58 AM, ddooss@wp.pl wrote:
> Hi all,
>
> It seems to preserve the RDF 1.2 model - strings still denote Unicode scalar
> values - while allowing the common UTF-16-style escape form for non-BMP
> characters, e.g. \uD83C\uDCA1, when the surrogate pair is well-formed.
>
> So my mild preference would be:
>
> accept a valid high-surrogate + low-surrogate pair and interpret it as the
> corresponding scalar value;
>
> reject lone surrogates, reversed pairs, or malformed surrogate sequences.
>
> That said, I would also be fine with option 1, since it is simpler, stricter,
> and seems closer to the conservative reading of the current text. Option 2
> only seems preferable to me if we want to avoid rejecting data that is
> probably intended to represent a valid Unicode character.
>
> Best,
>
> Dominik
>
> On 28 April 2026 at 14:16, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>
> > [I'm deliberately not putting this in the issue, because I want the issue to
> > look clean.]
> >
> > As far as I can tell, surrogates are not allowed at all in RDF 1.1 Turtle.
> > The reason is that numeric escape sequences represent Unicode code points that
> > are Unicode characters. This appears to be only stated in Section 6.4.
> >
> > So "\uD83C\uDCA1" is not valid in RDF 1.1 Turtle.
> >
> > Again as far as I can tell, RDF 1.2 Turtle liberalizes RDF 1.1 Turtle because
> > it allows any non-surrogate Unicode code point for numeric escape sequences,
> > not just Unicode characters.
> >
> > So "\uFFFE" is valid in RDF 1.2 Turtle, but not valid in RDF 1.1 Turtle.
> >
> > Does anyone disagree with my conclusions?
> >
> > peter
> >
> > On 4/28/26 4:26 AM, Andy Seaborne wrote:
> >
> > > As promised at the last telecon, I put together a position for responding to
> > > the i18n wide review comment [1]
> > >
> > > https://github.com/w3c/rdf-turtle/issues/138
> > >
> > > Summary: support valid surrogate pairs written as \u escape sequences.
> > >
> > > Andy
> > >
> > > [1] https://github.com/w3c/rdf-turtle/issues/131
> > > https://github.com/w3c/rdf-trig/issues/60
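
For illustration only, the two positions in this thread (reject every surrogate escape, or accept a well-formed high + low pair and map it to its scalar value) could be sketched in Python roughly as below. The function name decode_numeric_escapes and the allow_pairs flag are hypothetical and do not come from any RDF specification or implementation. The sketch handles only \uXXXX and \UXXXXXXXX escapes and ignores every other Turtle escape, including an escaped backslash.

```python
import re

# Matches one numeric escape at a time: \uXXXX (group 1) or \UXXXXXXXX (group 2).
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def _is_high(cp):
    return 0xD800 <= cp <= 0xDBFF


def _is_low(cp):
    return 0xDC00 <= cp <= 0xDFFF


def decode_numeric_escapes(text, allow_pairs=False):
    """Decode \\u and \\U escapes in text (hypothetical helper).

    allow_pairs=False: any surrogate code point is an error.
    allow_pairs=True:  a \\u high surrogate immediately followed by a
                       \\u low surrogate is combined into one scalar value;
                       lone, reversed, or malformed surrogates are errors.
    """
    out = []
    pos = 0
    while pos < len(text):
        m = _ESCAPE.match(text, pos)
        if m is None:
            # Not a numeric escape: copy the character through unchanged.
            out.append(text[pos])
            pos += 1
            continue
        cp = int(m.group(1) or m.group(2), 16)
        pos = m.end()
        if cp > 0x10FFFF:
            raise ValueError("escape is beyond the Unicode code space")
        if _is_high(cp) or _is_low(cp):
            if not allow_pairs:
                raise ValueError("surrogate escapes are not allowed")
            nxt = _ESCAPE.match(text, pos)
            well_formed = (
                _is_high(cp)
                and m.group(1) is not None          # pair must use the \u form
                and nxt is not None
                and nxt.group(1) is not None
                and _is_low(int(nxt.group(1), 16))
            )
            if not well_formed:
                raise ValueError("lone, reversed, or malformed surrogate")
            low = int(nxt.group(1), 16)
            # Standard UTF-16 pair-to-scalar arithmetic.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)
            pos = nxt.end()
        out.append(chr(cp))
    return "".join(out)
```

With allow_pairs=True, the input \uD83C\uDCA1 decodes to the single scalar value U+1F0A1; with the default allow_pairs=False the same input raises ValueError, while \uFFFE decodes in both modes because it is not a surrogate.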
Received on Tuesday, 28 April 2026 17:03:09 UTC