- From: Ernest Cline <ernestcline@mindspring.com>
- Date: Thu, 13 May 2004 10:30:28 -0400
- To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>, www-html@w3.org
> [Original Message]
> From: Jukka K. Korpela <jkorpela@cs.tut.fi>
>
> On Thu, 13 May 2004, Ernest Cline wrote:
>
> > In addition XML 1.1 only allows
> > NEL, the rest of the C1's must be present thru the presence
> > of character references only, so any future XML spec will
> > handle the C1's in a manner you think is appropriate.
>
> <snip>
>
> Anyway, at http://www.w3.org/TR/xml11/#charsets
> I read that
>
> [2] Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
> /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
> */
>
> which means a change: all Ascii controls except the nul character #x0
> are allowed. (I wonder why nul is forbidden. It should be especially
> harmless, ignorable.) And "XML processors MUST accept any character in
> the range specified for Char".
>
> Some characters are listed as "discouraged", but as far as I can see,
> XML 1.1 very much _allows_ the entire C1 Controls range.

Although the spec does not make it clear at that point, rule [2a]

[2a] RestrictedChar ::= [#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]

lists those characters that are allowed only as character references
and not as actual characters in the file. So, for example, if you wanted
to encode a form feed in the content of some element, you could do it
in XML 1.1, but you would have to use either &#xC; or &#12; and not
an actual form feed character. The same applies to all of the
C1 controls except NEL.
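
To make the distinction concrete, here is a rough Python sketch of how a serializer might apply rules [2] and [2a]. The ranges are copied from the two productions quoted above; the function names (is_char, is_restricted_char, escape_restricted) are made up for illustration and do not come from the spec or from any XML library. It deals only with the RestrictedChar question and leaves the usual escaping of & and < aside.

```python
# Ranges copied from productions [2] Char and [2a] RestrictedChar of XML 1.1.
CHAR_RANGES = [(0x1, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]
RESTRICTED_RANGES = [(0x1, 0x8), (0xB, 0xC), (0xE, 0x1F),
                     (0x7F, 0x84), (0x86, 0x9F)]


def _in_ranges(cp, ranges):
    """Return True if code point cp falls inside any (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def is_char(cp):
    """True if cp matches production [2] Char (hypothetical helper name)."""
    return _in_ranges(cp, CHAR_RANGES)


def is_restricted_char(cp):
    """True if cp matches production [2a] RestrictedChar."""
    return _in_ranges(cp, RESTRICTED_RANGES)


def escape_restricted(text):
    """Replace each RestrictedChar with a hexadecimal character reference.

    Characters outside Char (such as U+0000) cannot be represented at all,
    so they raise ValueError. Markup characters like '&' and '<' are not
    handled here; this sketch covers the control characters only.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if not is_char(cp):
            raise ValueError("U+%04X is not an XML 1.1 Char" % cp)
        if is_restricted_char(cp):
            out.append("&#x%X;" % cp)  # allowed only as a reference
        else:
            out.append(ch)             # allowed as a literal character
    return "".join(out)


if __name__ == "__main__":
    # Form feed (U+000C) must become a reference; NEL (U+0085) may stay literal.
    print(escape_restricted("page one\x0cpage two"))
```

Under these assumptions a form feed comes out as &#xC;, while tab, line feed, carriage return and NEL pass through unchanged, which is the asymmetry among the controls described above.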
Received on Thursday, 13 May 2004 10:30:27 UTC