[Turtle] Please show/explain hexadecimal-encoded characters in comments from David Booth on 2013-05-08 (public-rdf-comments@w3.org from May 2013)

From: David Booth <david@dbooth.org>
Date: Wed, 08 May 2013 14:45:11 -0400
To: public-rdf-comments <public-rdf-comments@w3.org>
Message-ID: <518A9D37.2020706@dbooth.org>

Regarding
http://www.w3.org/TR/2013/CR-turtle-20130219/

As an RDF author I frequently refer to the EBNF syntax rules in section 
6.5 to check a detail of the Turtle syntax, such as figuring out whether 
a particular character is permitted in a local name.  For the most part 
the rules are easy to read.  But several of the rules specify Unicode 
characters using hexadecimal, such as:

[161s] 	WS 	::= 	#x20 | #x9 | #xD | #xA
[163s] 	PN_CHARS_BASE 	::= 	[A-Z] | [a-z] | [#x00C0-#x00D6] | 
[#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | 
[#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | 
[#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[166s] 	PN_CHARS 	::= 	PN_CHARS_U | '-' | [0-9] | #x00B7 | 
[#x0300-#x036F] | [#x203F-#x2040]

Hexadecimal notation is clearly needed in the production rules, so I 
certainly don't object to its use.  But as a reader, it drives me 
bananas trying to figure out which characters those hexadecimal codes 
denote -- searching the web, etc.

Please add some simple comments to the production rules, indicating what 
the hexadecimal-encoded characters are, so that readers don't have to go 
searching to figure it out.  Something like the following would be a big 
help:

/* See @@ add link to Unicode table @@ */
/*  #x20 = SPACE, #x9 = TAB, #xD = Carriage return, #xA = Line feed */
[161s] 	WS 	::= 	#x20 | #x9 | #xD | #xA

/* #x00B7 = Middle dot, #x0300 = ??? (couldn't find that one) */
[166s] 	PN_CHARS 	::= 	PN_CHARS_U | '-' | [0-9] | #x00B7 | 
[#x0300-#x036F] | [#x203F-#x2040]
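
Comments of this kind could probably be generated rather than looked up 
by hand.  Below is a minimal sketch in Python, assuming the standard 
unicodedata module is an acceptable source for the names.  The helper 
names describe and comment_for and the CONTROL_LABELS table are 
hypothetical, made up for illustration, and padding every code to four 
hex digits is just one possible formatting choice.

```python
import unicodedata

# Unicode gives control characters no formal name, so unicodedata.name()
# cannot name them. These labels are an assumption that follows the
# wording of the WS comment proposed above.
CONTROL_LABELS = {0x9: "TAB", 0xA: "Line feed", 0xD: "Carriage return"}


def describe(code_point: int) -> str:
    """Return one 'code = name' pair, e.g. '#x00B7 = MIDDLE DOT'."""
    label = CONTROL_LABELS.get(code_point)
    if label is None:
        # Fall back to a placeholder for unassigned or unnamed code points.
        label = unicodedata.name(chr(code_point), "(unnamed)")
    return f"#x{code_point:04X} = {label}"


def comment_for(code_points) -> str:
    """Build an EBNF-style comment listing each code point with its name."""
    return "/* " + ", ".join(describe(cp) for cp in code_points) + " */"


if __name__ == "__main__":
    print(comment_for([0x20, 0x9, 0xD, 0xA]))   # the WS rule
    print(comment_for([0x00B7, 0x0300]))        # part of the PN_CHARS rule
```

With this sketch, describe(0x0300) yields "#x0300 = COMBINING GRAVE 
ACCENT".  For a range such as [#x0300-#x036F], naming only the two 
endpoints is likely enough for a reader to recognize the block.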

Thanks,
David

Received on Wednesday, 8 May 2013 18:45:39 UTC