- From: Anne van Kesteren <annevk@opera.com>
- Date: Sun, 09 Sep 2007 20:26:08 +0200
- To: "HTML WG" <public-html@w3.org>
On Sun, 09 Sep 2007 18:20:03 +0200, Julian Reschke <julian.reschke@gmx.de> wrote:
> Anne van Kesteren wrote:
>> On Sun, 09 Sep 2007 16:11:34 +0200, Julian Reschke
>> <julian.reschke@gmx.de> wrote:
>>> We really should answer the question we asked before: why would it be
>>> conforming to include those characters in the first place?
>> I can see a good reason to prohibit U+0000 (and that's done), but what
>> is the reason for making these other characters non-conforming? They
>> are not posing any interoperability problem and are also supported by
>> the DOM. I'm not sure why we should limit the HTML serialization here.
>
> So what's the semantics of these characters when they occur inside HTML?
> What is a recipient supposed to do with them, for instance, when they
> appear inside <p> or a <pre> element?

They should do the same as whenever someone inserts them through the DOM. Seems that browsers display some type of placeholder character:

  http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%3Cscript%3Ew(%22%01%22%20%3D%3D%20%22%5C1%22)%3C%2Fscript%3E

It's not entirely clear to me whether that's in scope of HTML though. We just need to define the "byte stream -> tree" mapping. Although maybe it could be part of the rendering chapter, dunno.

-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
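
For readers who want to see what the live DOM viewer link in the message actually tests, the query string can be percent-decoded. The sketch below is an illustration added for this purpose, not something from the original thread; the variable names are arbitrary, and it only relies on Python's standard urllib.parse.unquote. In the decoded markup, %01 becomes a literal U+0001 character inside the first string, while %5C1 becomes the two-character escape \1 in the second, so the script presumably checks whether a raw U+0001 in the HTML source survives parsing and compares equal to the escaped form.

```python
from urllib.parse import unquote

# Query string copied from the live DOM viewer URL in the message.
query = (
    "%3C!DOCTYPE%20html%3E%3Cscript%3Ew(%22%01%22%20%3D%3D%20"
    "%22%5C1%22)%3C%2Fscript%3E"
)

# Percent-decode it to recover the markup the viewer is asked to parse.
markup = unquote(query)

# Collect every C0 control character (U+0000..U+001F) found in the markup,
# recording its position and code point in U+XXXX notation.
controls = [
    (index, f"U+{ord(ch):04X}")
    for index, ch in enumerate(markup)
    if ord(ch) < 0x20
]

print(repr(markup))
print(controls)
```

The list comprehension is only there to make the otherwise invisible control character easy to spot in the output.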
Received on Sunday, 9 September 2007 18:26:24 UTC