Re: Allowing \u escaped surrogate pairs

Hi all,

It seems to preserve the RDF 1.2 model - strings still denote Unicode scalar values - while allowing the common UTF-16-style escape form for non-BMP characters, e.g. \uD83C\uDCA1, when the surrogate pair is well-formed.

So my mild preference would be: accept a valid high-surrogate + low-surrogate pair and interpret it as the corresponding scalar value; reject lone surrogates, reversed pairs, or malformed surrogate sequences.

That said, I would also be fine with option 1, since it is simpler, stricter, and seems closer to the conservative reading of the current text. Option 2 only seems preferable to me if we want to avoid rejecting data that is probably intended to represent a valid Unicode character.

Best,
Dominik
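
To make the pair-handling rule concrete, here is a minimal Python sketch of how a parser might decode numeric escapes under that rule. It is only an illustration, not text from any spec or existing implementation: the function name `decode_numeric_escapes` is hypothetical, and it assumes that only the four-digit `\u` form may take part in a surrogate pair, that the low surrogate must follow immediately, and that other escapes (including `\\`) are handled elsewhere.

```python
import re

# Matches \uXXXX or \UXXXXXXXX. Other escapes (\n, \\, ...) are out of scope
# for this sketch and would need handling in a real tokenizer.
_NUMERIC_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def _is_high(cp):
    return 0xD800 <= cp <= 0xDBFF


def _is_low(cp):
    return 0xDC00 <= cp <= 0xDFFF


def decode_numeric_escapes(text):
    """Hypothetical helper: decode \\u and \\U escapes, accepting only
    well-formed \\u surrogate pairs and rejecting every other surrogate."""
    out = []
    pos = 0
    while pos < len(text):
        m = _NUMERIC_ESCAPE.match(text, pos)
        if m is None:
            out.append(text[pos])
            pos += 1
            continue
        short_form = m.group(1) is not None
        cp = int(m.group(1) or m.group(2), 16)
        pos = m.end()
        if _is_high(cp) and short_form:
            # Assumption: the low surrogate must follow immediately, as \uXXXX.
            nxt = _NUMERIC_ESCAPE.match(text, pos)
            if nxt is None or nxt.group(1) is None:
                raise ValueError("lone high surrogate")
            low = int(nxt.group(1), 16)
            if not _is_low(low):
                raise ValueError("high surrogate not followed by low surrogate")
            # Standard UTF-16 combination into a single scalar value.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)
            pos = nxt.end()
        elif _is_high(cp) or _is_low(cp):
            # Lone low surrogate, reversed pair, or a surrogate written as \U.
            raise ValueError("surrogate outside a well-formed \\u pair")
        if cp > 0x10FFFF:
            raise ValueError("code point beyond U+10FFFF")
        out.append(chr(cp))
    return "".join(out)
```

With this sketch, `decode_numeric_escapes(r"\uD83C\uDCA1")` yields the single character U+1F0A1, while `r"\uD83C"` alone and the reversed `r"\uDCA1\uD83C"` both raise `ValueError`. Under option 1 the pairing branch would simply be dropped, so that every surrogate escape is rejected.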
       
        
         
On 28 April 2026 at 14:16, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
         
      
         [I'm deliberately not putting this in the issue, because I want the issue to  
 look clean.] 
  
 As far as I can tell, surrogates are not allowed at all in RDF 1.1 Turtle.  
 The reason is that numeric escape sequences represent Unicode code points that  
 are Unicode characters.  This appears to be only stated in Section 6.4. 
  
 So "\uD83C\uDCA1" is not valid in RDF 1.1 Turtle. 
  
 Again as far as I can tell, RDF 1.2 Turtle liberalizes RDF 1.1 Turtle because  
 it allows any non-surrogate Unicode code point for numeric escape sequences,  
 not just Unicode characters. 
  
 So "\uFFFE" is valid in RDF 1.2 Turtle, but not valid in RDF 1.1 Turtle. 
  
 Does anyone disagree with my conclusions? 
  
 peter 
  
  
  
  
 On 4/28/26 4:26 AM, Andy Seaborne wrote: 
 
 As promised at the last telecon, I put together a position for responding to  
 the i18n wide review comment [1] 
  
 https://github.com/w3c/rdf-turtle/issues/138
  
 Summary: support valid surrogate pairs written as \u escape sequences. 
  
      Andy 
  
 [1]  https://github.com/w3c/rdf-turtle/issues/131
      https://github.com/w3c/rdf-trig/issues/60

Received on Tuesday, 28 April 2026 13:58:29 UTC