- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 22 Jun 2007 07:55:06 +0000 (UTC)
- To: Mike Brown <mike@skew.org>
- Cc: public-html@w3.org
On Fri, 22 Jun 2007, Mike Brown wrote:
> > >
> > > HTML 5 seems to now allow the entire U+0001..U+001F range, whereas
> > > HTML 4.x, 3.2, and I think 2.0, as defined by their "document
> > > character set" and SGML profile, have long forbidden all of that
> > > range except for tab, LF, CR, and, inexplicably, FF.
> > >
> > > Why is HTML 5 different, and what are the expectations for the
> > > processing of the now-allowed BEL, BS, VT, DEL, and so on? If it was
> > > deliberate, why not put a note of explanation in the spec?
> >
> > It was deliberate only insofar as I didn't come across any reason to
> > disallow them. The expectations for their processing are unaffected by
> > whether they are allowed or not.
> >
> > What would the note explain?
>
> The note would explain why you feel it's important to include those
> codes in HTML 5

I don't feel it's important either way. I don't really have an opinion.

> and the fact that there are no expectations of how they're interpreted;

Well, there are expectations, they're the same expectations as for any
other character.

> they're just no longer disallowed. Perhaps I'm just spoiled by the HTML
> 4 spec which mentions things like that.

Generally the spec doesn't have notes for changes from previous versions,
there are just so many of them. However, we should indeed note it; Anne,
would this be something for the "changes since HTML4" doc?

> I'm guessing those control codes were previously disallowed out of a
> fear that there may have been some concern, at the time, for
> console-based browsers: you don't want such a browser to blindly pass
> control codes to the user's terminal.

Right, but disallowing them doesn't affect this at all. I mean, even if
they're disallowed, people might still include them. So you still have to
handle them whether they're allowed or not (that's what I meant when I
said that "the expectations for their processing are unaffected by whether
they are allowed or not").

> [...] why let the language permit it?

Why disallow it? I don't know, I don't really have a good reason one way
or the other.

> If you do allow all of U+0001..U+001F then you might as well allow
> U+0080..U+009F range as well, no?

Sure, they're allowed too.

> Do you have any plans to acknowledge the Windows-1252 confusion for NCRs
> in that range, such as € being treated as Euro by many (most?)
> browsers?

That's already covered in the spec.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 22 June 2007 07:59:26 UTC
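
The two technical points in this message can be illustrated with a rough Python sketch. It is not taken from the thread or from any spec text; the function names are hypothetical, and the "HTML 4 set" is simply the one described in the quoted question (tab, LF, CR, FF). The first part shows why a consumer has to deal with stray C0 controls whether or not the language permits them; the second shows the Windows-1252 reading of numeric character references in the 0x80..0x9F range, alongside the behaviour of the standard library's `html.unescape`.

```python
import html

# C0 controls the quoted message says HTML 4 permitted: tab, LF, FF, CR.
HTML4_ALLOWED_C0 = {0x09, 0x0A, 0x0C, 0x0D}


def _is_other_c0(ch):
    """True for U+0001..U+001F characters outside the set above."""
    cp = ord(ch)
    return 0x01 <= cp <= 0x1F and cp not in HTML4_ALLOWED_C0


def find_other_c0_controls(text):
    """Return (index, code point) pairs for BEL, BS, VT and similar."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if _is_other_c0(ch)]


def escape_for_terminal(text):
    """Make those controls visible instead of passing them to a terminal.

    A console-based renderer needs something like this regardless of
    whether the markup language formally allows the characters.
    """
    return "".join(
        "\\x{:02x}".format(ord(ch)) if _is_other_c0(ch) else ch for ch in text
    )


def remap_c1_reference(code_point):
    """Windows-1252 reading of a numeric reference in 0x80..0x9F.

    Bytes that Windows-1252 leaves undefined (for example 0x81) fall
    back to the code point itself in this sketch.
    """
    if 0x80 <= code_point <= 0x9F:
        try:
            return bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            return chr(code_point)
    return chr(code_point)


if __name__ == "__main__":
    sample = "a\x07b\tc\x08"
    print(find_other_c0_controls(sample))
    print(escape_for_terminal(sample))
    print(hex(ord(remap_c1_reference(0x80))))
    print(hex(ord(html.unescape("&#128;"))))
```

The stored text is left untouched here; only the output path escapes the controls, which matches the point that handling is needed independently of conformance rules.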