Re: test-29: special characters in Turtle IRIs

On Mon, 2012-03-05 at 10:37 -0500, Alex Hall wrote:
> Numeric Unicode escape sequences (\uxxxx) and percent-encoding serve
> two different purposes.
> 
> Percent-encoding sequences (%xx) are part of the IRI/URI specs, and
> allow you to encode characters, e.g. into the path section of an IRI,
> that would otherwise be illegal in that position.
[...]
> these are not processed as part of Turtle parsing
[...]
> Unicode escapes are allowed in IRIs and strings, primarily to allow
> Turtle authors to write Unicode characters in other
> languages/alphabets where they don't have good keyboard or font
> support.
[...]
> Unicode escapes are processed as part of Turtle parsing, so the
> resulting IRI or string contains the escaped character, not the \uxxxx
> sequence.

I see.  This makes sense, thank you.
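
For the archive, here is how I now read the distinction, as a rough
Python sketch. The function name unescape_turtle_iri and the example.org
IRIs are made up for illustration. This is not taken from any Turtle
parser, and it covers only the \uxxxx and \Uxxxxxxxx forms:

```python
import re

# Hypothetical helper (an assumption for illustration, not a real parser API).
# Matches the numeric escapes \uXXXX (4 hex digits) and \UXXXXXXXX (8 hex digits).
_UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_turtle_iri(raw):
    """Decode numeric Unicode escapes; leave %xx sequences untouched."""
    def repl(match):
        # Exactly one of the two groups is set, depending on which form matched.
        hex_digits = match.group(1) or match.group(2)
        # chr() raises ValueError for values above 0x10FFFF.
        return chr(int(hex_digits, 16))
    return _UCHAR.sub(repl, raw)


# The \u escape is resolved while parsing, so the IRI holds the character itself.
escaped = unescape_turtle_iri(r"http://example.org/caf\u00E9")

# The percent-encoded form is part of the IRI and passes through unchanged.
percent = unescape_turtle_iri("http://example.org/caf%C3%A9")
```

So, if I have this right, the two inputs above end up as two different
IRI strings after parsing: one contains the character, the other still
contains the literal %C3%A9.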

> We recognize that the description of character escapes in Turtle has
> been confusing, and the editor has been working on new text to clarify
> the various types of escapes.

Yes, a blurb in the text highlighting the above points would be helpful.

-dr

Received on Monday, 5 March 2012 21:45:59 UTC