- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 20 Jun 2007 23:22:30 +0000 (UTC)
On Wed, 30 May 2007, Julian Reschke wrote:
> Anne van Kesteren wrote:
> >
> > Whether or not it should be conforming is a different question. How a
> > document is to be parsed is best agreed upon between browser vendors I
> > think. We already have enough differences as it is.
>
> Again, you're making the assumption that any consumer of HTML content is
> a browser.

On Wed, 30 May 2007, Anne van Kesteren wrote:
>
> I think the primary consumer is. Content is written mostly against
> browsers, not parsing libraries. Parsing libraries should just follow
> the specification (like html5lib tries to do).

On Wed, 30 May 2007, Henri Sivonen wrote:
>
> No, the assumption isn't that any consumer is a browser. The assumption
> is that browsers need to do what they do based on browser-specific
> constraints and the other consumers need to follow what browsers do in
> order to be compatible.

On Wed, 30 May 2007, Julian Reschke wrote:
>
> ...to be compatible with what? The browsers?

Yes, so that all consumers consume the HTML interoperably.

> So let's rephrase this question: will there be a conformance class for
> HTML5 consumers that *only* accept conforming documents? (Keep in mind
> that these consumers may not even have a DOM or a Javascript engine).

Assuming you are referring only to parse errors, and not other kinds of
conformance errors, then yes, the spec already allows you to abort when
you hit a parse error.

On Wed, 30 May 2007, Philip Taylor wrote:
>
> Perhaps it would be better to rephrase as: Will there be a conformance
> class for HTML5 consumers that process conforming documents according
> the spec, but process non-conforming documents in an undefined way?

No. You can reject, and you can do what the spec says. But you can't be
conforming while doing something that contradicts what the spec says.

> At least that's how I interpret the original intent - it means tools in
> systems with guaranteed document conformance (i.e. not taking input from
> the general web) could be simplified while still claiming to be
> conformant and still being interoperable with other such tools.

Yes, you can do that.

> http://www.whatwg.org/specs/web-apps/current-work#non-scripted already
> defines UA conformance when there's no scripting, which seems to cover
> those cases.

Indeed.

On Wed, 30 May 2007, Julian Reschke wrote:
>
> Thinking of which, they may not even want to build a tree of the
> document. So how does the HTML5 parsing model help consumers that just
> want to consume a stream of tokens similarly to a Sax parser?

If you want a generic system that handles all HTML content, you can't do
it with a pure streaming SAX system.

On Wed, 30 May 2007, Henri Sivonen wrote:
>
> I think it could be useful to allow markup editors to coerce
> non-conforming documents into conforming in an implementation-defined
> way because then the editor could limit UI representations to conforming
> cases.

That's allowed (it's out of scope, in fact). It's just a post-process.

> The parsing spec allows a Draconian response to parse errors. Hence, if you
> want SAX events, you have two conforming options:
>
> 1) Build a tree in its entirety first and then emit the events based on
> the tree.
>
> 2) Emit events as the parse progresses and halt on errors that require
> non-streamable recovery.

Indeed.

On Wed, 30 May 2007, Michel Fortin wrote:
>
> Or, assuming the spec changes to no longer move head-elements (like
> <link>) to the head when they're found in body, there is a third option:
>
> 3) Emit events until you reach a point where it may be possible that
> some events should be reordered, in which case you build a local
> DOM-like tree and wait until you can emit all pending events with a
> certainty they don't need to be reordered.

<title> still gets moved back, but yeah, in certain cases this might work.
(You could also do it by having specific events for <title>.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
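
For illustration only, Henri's option 2 (emit events as the parse
progresses and halt on errors that require non-streamable recovery) might
be sketched roughly as below. This is a toy and not the HTML5 parsing
algorithm: the pre-tokenized input format, the names stream_events and
ParseHalted, and the rule that any misnested or unclosed tag counts as a
halting error are all assumptions made up for the example.

```python
class ParseHalted(Exception):
    """Raised when the toy parser meets input it will not recover from."""


def stream_events(tokens):
    """Yield SAX-like events one at a time, halting on the first error.

    `tokens` is a hypothetical pre-tokenized input: an iterable of
    ("start", name), ("end", name) or ("text", data) tuples.
    Assumption: a misnested end tag or an element left open at the end
    is treated as an error needing non-streamable recovery.
    """
    open_elements = []  # names of elements opened but not yet closed
    for kind, value in tokens:
        if kind == "start":
            open_elements.append(value)
        elif kind == "end":
            if not open_elements or open_elements[-1] != value:
                # Draconian response: stop instead of rearranging the tree.
                raise ParseHalted(f"unexpected </{value}>")
            open_elements.pop()
        yield (kind, value)
    if open_elements:
        raise ParseHalted(f"unclosed <{open_elements[-1]}>")


if __name__ == "__main__":
    misnested = [("start", "b"), ("start", "i"), ("end", "b")]
    try:
        for event in stream_events(misnested):
            print(event)
    except ParseHalted as err:
        print("halted:", err)
```

Because stream_events is a generator, the consumer has already received
every event before the error by the time ParseHalted is raised, which is
the trade-off of option 2: nothing is buffered, but nothing can be taken
back either.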
Received on Wednesday, 20 June 2007 16:22:30 UTC