Re: [Turtle] Please show/explain hexadecimal-encoded characters in comments

From: Eric Prud'hommeaux <eric@w3.org>
Date: Wed, 6 Nov 2013 12:45:05 -0500
To: David Booth <david@dbooth.org>
Cc: public-rdf-comments <public-rdf-comments@w3.org>
Message-ID: <20131106174504.GF25913@w3.org>
* David Booth <david@dbooth.org> [2013-11-06 10:51-0500]
> On 11/02/2013 07:46 PM, Eric Prud'hommeaux wrote:
> >* David Booth <david@dbooth.org> [2013-05-08 14:45-0400]
> >>Regarding
> >>http://www.w3.org/TR/2013/CR-turtle-20130219/
> >>
> >>As an RDF author I frequently refer to the EBNF syntax rules in
> >>section 6.5 to check a detail of the Turtle syntax, such as figuring
> >>out whether a particular character is permitted in a local name.
> >>For the most part the rules are easy to read.  But several of the
> >>rules specify unicode characters using hexadecimal, such as:
> >>
> >>[161s] 	WS 	::= 	#x20 | #x9 | #xD | #xA
> >>[163s] 	PN_CHARS_BASE 	::= 	[A-Z] | [a-z] | [#x00C0-#x00D6] |
> >>[#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] |
> >>[#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] |
> >>[#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] |
> >>[#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
> >>[166s] 	PN_CHARS 	::= 	PN_CHARS_U | '-' | [0-9] | #x00B7 |
> >>[#x0300-#x036F] | [#x203F-#x2040]
> >>
> >>Clearly it is necessary for clarity to use the hexadecimal notation
> >>in the production rules, so I certainly don't object to their use.
> >>But as a reader, it drives me bananas trying to figure out what
> >>those hexadecimal characters are -- searching the web, etc.
> >>
> >>Please add some simple comments to the production rules, indicating
> >>what the hexadecimal-encoded characters are, so that readers don't
> >>have to go searching to figure it out.  Something like the following
> >>would be a big help:
> >>
> >>/* See @@ add link to unicode table @@ */
> >>/*  #x20 = SPACE, #x9 = TAB, #xD = Carriage return, #xA = Line feed */
> >>[161s] 	WS 	::= 	#x20 | #x9 | #xD | #xA
> >>
> >>/* #x00B7 = Middle dot, #x0300 = ??? (couldn't find that one) */
> >>[166s] 	PN_CHARS 	::= 	PN_CHARS_U | '-' | [0-9] | #x00B7 |
> >>[#x0300-#x036F] | [#x203F-#x2040]
> >
> >To really help people wondering if some character is permitted in some
> >terminal, such a listing would have to include the character ranges,
> >not just the boundaries. As a compromise, I provisionally included
> >comments for ascii characters, specifically:
> >
> >[18] 	IRIREF 	::= 	'<' ([^#x00-#x20<>\"{}|^`\] | UCHAR)* '>' /* #x00=NULL #01-#x1F=control codes #x20=space */
> >[22] 	STRING_LITERAL_QUOTE 	::= 	'"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"' /* #x22=" #x5C=\ #xA=new line #xD=carriage return */
> >[23] 	STRING_LITERAL_SINGLE_QUOTE 	::= 	"'" ([^#x27#x5C#xA#xD] | ECHAR | UCHAR)* "'" EXPONENT) /* #x27=' #x5C=\ #xA=new line #xD=carriage return */
> 
> Excellent!  That's a big help.  But there seems to be a rendering
> problem, because when I view the spec draft
> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#sec-grammar
> in my browser (Firefox 25.0), all the "x"s have disappeared from the
> hex character codes in the comments, so "#x20" has become "#20", for
> example:
> [[
> [22] 	STRING_LITERAL_QUOTE 	::= 	'"' ([^#x22#x5C#xA#xD] | ECHAR |
> UCHAR)* '"' /* #22=" #5C=\ #A=new line #D=carriage return */
> [23] 	STRING_LITERAL_SINGLE_QUOTE 	::= 	"'" ([^#x27#x5C#xA#xD] |
> ECHAR | UCHAR)* "'" EXPONENT) /* #27=' #5C=\ #A=new line #D=carriage
> return */
> ]]
> 
> Maybe this is a respec problem?

It was a Mercurial-being-its-usual-obstructionist-self problem. I had
originally tried without the 'x's 'cause the lines were wide, then changed
my mind. Someone pushed a commit in the meantime, my repo ended up with
two heads, `hg glog` didn't reveal what I should merge against, and I
eventually gave up, frustrated. On your prompting, I wiped out my repo
and diffed in my changes for the 100th time.

Does this get me a "[RESOLVED]"?
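
For readers who hit a code point that the grammar comments don't cover,
the name can usually be looked up locally rather than by searching the
web. The following is only a rough sketch, not part of the spec or of
any Turtle tooling: `describe` is a hypothetical helper name, and it
assumes Python's standard `unicodedata` module and the `#xNNNN` notation
used in the EBNF.

```python
import unicodedata


def describe(hex_code: str) -> str:
    """Return the Unicode name for an EBNF-style code point such as '#x00B7'.

    `describe` is a hypothetical helper for illustration only.
    """
    # Strip the '#x' prefix used by the grammar notation, if present.
    digits = hex_code[2:] if hex_code.startswith("#x") else hex_code
    ch = chr(int(digits, 16))
    # Control characters (e.g. #x9, #xA, #xD) have no formal Unicode name,
    # so fall back to reporting the general category instead of raising.
    fallback = f"<no name; category {unicodedata.category(ch)}>"
    return unicodedata.name(ch, fallback)


if __name__ == "__main__":
    for code in ["#x20", "#x9", "#x00B7", "#x0300", "#x036F", "#x203F", "#x2040"]:
        print(code, "=", describe(code))
```

Control characters fall back to their general category because they have
no formal names, which is one reason hand-written comments such as
"#xA=new line" are still useful in the grammar.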


> David
> 
> 
> >[161s] 	WS 	::= 	#x20 | #x9 | #xD | #xA /* #x20=space #x9=character tabulation #xD=carriage return #xA=new line */
> >
> >and will ask if the WG considers this extra text to be a net help.
> >
> >If this comment addresses your comment, please reply with the subject
> >prefixed by "[RESOLVED]".
> >
> >
> >>Thanks,
> >>David
> >>
> >
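
The point above about ranges versus boundaries can be made concrete:
answering "is this character permitted?" means testing membership in
every range, not reading off the endpoints. A minimal sketch, assuming
the PN_CHARS_BASE ranges exactly as quoted from [163s] earlier in this
thread; the names `PN_CHARS_BASE_RANGES` and `in_pn_chars_base` are made
up for illustration.

```python
# Ranges transcribed from production [163s] PN_CHARS_BASE as quoted above.
# Each pair is an inclusive (low, high) range of code points.
PN_CHARS_BASE_RANGES = [
    (0x41, 0x5A),        # [A-Z]
    (0x61, 0x7A),        # [a-z]
    (0x00C0, 0x00D6),
    (0x00D8, 0x00F6),
    (0x00F8, 0x02FF),
    (0x0370, 0x037D),
    (0x037F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def in_pn_chars_base(ch: str) -> bool:
    """Return True if the single character `ch` falls in any listed range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in PN_CHARS_BASE_RANGES)


if __name__ == "__main__":
    for ch in ["A", "\u00e9", "\u00d7", "\u00b7", "-"]:
        print(f"U+{ord(ch):04X}", in_pn_chars_base(ch))
```

This checks PN_CHARS_BASE only. Characters such as '-', digits and
#x00B7 are admitted by PN_CHARS instead, so a full local-name check
would need the other productions as well.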

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.
Received on Wednesday, 6 November 2013 17:45:37 UTC
