- From: Anne van Kesteren <annevk@opera.com>
- Date: Tue, 28 Feb 2012 22:54:51 +0100
- To: "Jeni Tennison" <jeni@jenitennison.com>, "David Carlisle" <davidc@nag.co.uk>
- Cc: "public-xml-er@w3.org Community Group" <public-xml-er@w3.org>
On Tue, 28 Feb 2012 21:09:31 +0100, David Carlisle <davidc@nag.co.uk> wrote:
>> Does that throw everything else in Anne's algorithm out somehow?
>
> Anne?

No, you can change individual character handling in each tokenizer state quite easily. The question is whether divergence from HTML for tokenizing <foo<bar> is desirable. Is it our gut feeling that this is likely better or is there some data to back that up?

In the end we want deterministic error handling. Making as few decisions as possible about how that should go, and deferring to what went before us, seems like a nice way out. There's still plenty of room for that around colon and namespace handling.

So overall I do not feel too strongly about what to do in each tokenizer state, but if we are going to change things around in a way that diverges from HTML we might want to have a system for it (such as data).

-- 
Anne van Kesteren
http://annevankesteren.nl/
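The point about per-state character handling can be illustrated with a minimal sketch. It is not the algorithm under discussion: it is a deliberately simplified two-state tokenizer (no attributes, whitespace, or end-of-file handling), and the names `tokenize` and `lt_starts_new_tag` are hypothetical. By default it mirrors the HTML tag name state, where `<` is simply appended to the tag name. The override shows one possible divergence, assumed here purely for illustration, in which `<` ends the current tag and opens a new one.

```python
def tokenize(text, tag_name_overrides=None):
    """Tiny two-state tokenizer; returns the list of tag names it finds.

    tag_name_overrides maps a character to a handler that replaces the
    default behaviour for that character in the tag name state only.
    """
    overrides = tag_name_overrides or {}
    tags = []
    state = "data"
    name = ""
    for ch in text:
        if state == "data":
            if ch == "<":
                state, name = "tag_name", ""
            # Any other character is plain text and is ignored here.
        else:  # state == "tag_name"
            if ch in overrides:
                state, name = overrides[ch](tags, name)
            elif ch == ">":
                tags.append(name)
                state, name = "data", ""
            else:
                # HTML-like default: every other character, including
                # "<", becomes part of the tag name.
                name += ch
    return tags


def lt_starts_new_tag(tags, name):
    """Hypothetical divergence: "<" closes the current tag, opens another."""
    tags.append(name)
    return "tag_name", ""


html_like = tokenize("<foo<bar>")
divergent = tokenize("<foo<bar>", {"<": lt_starts_new_tag})
print(html_like)  # ['foo<bar']
print(divergent)  # ['foo', 'bar']
```

Only the handling of one character in one state differs between the two runs, which is why such a change is easy to make mechanically. Whether it is desirable is the separate question raised above.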
Received on Tuesday, 28 February 2012 21:55:35 UTC