Which characters are allowed in IRIREF in Turtle 2013?

From: Dave Beckett <dave@dajobe.org>
Date: Mon, 4 Mar 2013 09:40:20 -0800 (PST)
To: public-rdf-comments@w3.org
Message-ID: <alpine.DEB.2.02.1303040938260.5167@xyzzy.dajobe.org>

http://www.w3.org/TR/2013/CR-turtle-20130219/#grammar-production-IRIREF

What characters (Unicode code points) are allowed in an IRIREF in Turtle?

The IRIREF grammar rule is:   '<' ([^#x00-#x20<>\"{}|^`\] | UCHAR)* '>'

This implies that, for example, U+007F is allowed, since it is not in the
excluded range.  The IRI specification, RFC 3987, has a more restricted
range; taking the same example, U+007F is not allowed there.
There are many other Unicode code points that RFC 3987 does not allow.

See the RFC 3987 rule 'ipchar' and its expansion to 'ucschar'.
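
To make the mismatch concrete, here is a rough Python sketch that compares
the two character sets. It is illustrative only, under these assumptions:
it looks only at the negated character class of the Turtle rule (not
UCHAR escapes), it treats any ASCII character that can appear somewhere in
an RFC 3987 IRI as allowed regardless of position, and it leaves out
'iprivate', which RFC 3987 permits only in the query part. The names
turtle_allows and rfc3987_allows are made up for this example.

```python
# Code points excluded by the Turtle CR character class
# [^#x00-#x20<>\"{}|^`\]
TURTLE_EXCLUDED = set(range(0x00, 0x21)) | {ord(c) for c in '<>"{}|^`\\'}

# ASCII characters that can occur somewhere in an RFC 3987 IRI.
RFC3987_ASCII = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    "-._~"         # rest of 'unreserved'
    ":/?#[]@"      # gen-delims
    "!$&'()*+,;="  # sub-delims
    "%"            # introduces pct-encoded
)

# RFC 3987 'ucschar' ranges: BMP ranges, planes 1-13, then part of plane 14.
UCSCHAR_RANGES = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    + [(plane << 16, (plane << 16) | 0xFFFD) for plane in range(1, 14)]
    + [(0xE1000, 0xEFFFD)]
)


def turtle_allows(cp: int) -> bool:
    """True if the code point matches the Turtle CR character class."""
    return cp not in TURTLE_EXCLUDED


def rfc3987_allows(cp: int) -> bool:
    """True if the code point may appear literally in an IRI (approximation)."""
    if cp < 0x80:
        return chr(cp) in RFC3987_ASCII
    return any(lo <= cp <= hi for lo, hi in UCSCHAR_RANGES)


# Code points below U+0100 that Turtle accepts but RFC 3987 does not.
differences = [
    cp for cp in range(0x100) if turtle_allows(cp) and not rfc3987_allows(cp)
]
print(", ".join(f"U+{cp:04X}" for cp in differences))
```

Under these assumptions the mismatches below U+0100 are U+007F through
U+009F; widening the range also turns up code points such as U+FFFE.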

This rule should probably be completed so that it either lists all the
allowed characters or lists all the excluded ones (if the [^...] form remains).

Dave
Received on Monday, 4 March 2013 17:40:48 UTC