- From: Andrew Fedoniouk <news@terrainformatica.com>
- Date: Wed, 07 Jan 2009 21:04:37 -0800
- To: Ian Hickson <ian@hixie.ch>
- CC: Martin Atkins <mart@degeneration.co.uk>, Jonas Sicking <jonas@sicking.cc>, Julian Reschke <julian.reschke@gmx.de>, public-html@w3.org
Ian Hickson wrote:
> On Wed, 7 Jan 2009, Martin Atkins wrote:
>> 3. Write a generic parser that can be used to parse HTML markup of any
>> version (>= 5) into a DOM.
>
> I don't think we'll ever be able to do this. For example, there is no way
> I could have predicted how we were going to add <ruby> parsing to the spec
> before I added it. This would be possible if we could guarantee that for
> all time, all new inventions would always be done in a regular way, but
> history has shown that we would be naive to assume this.
>

CSS used to have a display-model[1] property:

  display-model: inline-inside | block-inside | table | ruby

While it cannot be used in CSS associated with a particular page, it can be used in so-called default or master style sheets to define the rendering and parsing model of a generic HTML-like grammar.

It is possible to define parsers for HTML versions 3.2, 4 and 5 by using the following properties:

  display-model: inline-inside | block-inside | table | ruby;
  parsing-model: empty | mixed | pre;
  can-contain: <list of element types>;
  cannot-contain: <list of element types>;

These can be used by a generic HTML parser (one accepting a subset of SGML). E.g.

  img {
    display: inline-block;
    parsing-model: empty;
    foreground-image: attr(src);
  }

  select {
    display: inline-block;
    display-model: block-inside;
    parsing-model: closed;
    can-contain: option optgroup;
    /* ... other primordial styles for the element ... */
    background-image: url(system-shape:select);
    ...
  }

  option {
    display: block;
    parsing-model: mixed;
    cannot-contain: *; /* cannot contain any sub elements - only text */
    ...
    color: windowtext;
  }

  option:selected { ... }

etc.

I use a parser that is based on a similar table of declarations, and I have a strong feeling that it is possible to define an HTML5 parser in these terms: that is, to have a table-driven declaration (something close to a DTD) that defines HTML 4, 5, etc.

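To make the table-driven idea concrete, here is a minimal sketch in Python of how a declaration table of this kind could drive a generic tree builder. It is not the parser mentioned above. The RULES table, the function names allowed() and parse(), and the policy of implicitly closing an open element that may not contain the incoming one are all assumptions made for illustration. Attributes are dropped, and only the "empty" parsing model and the two containment properties are honored.

```python
import re

# Hypothetical declaration table mirroring the properties in the message.
# A real table would be loaded from a master style sheet, not hard-coded.
RULES = {
    "img": {"parsing-model": "empty"},
    "select": {"can-contain": {"option", "optgroup"}},
    "optgroup": {"can-contain": {"option"}},
    "option": {"cannot-contain": "*"},  # text only
}

# A start or end tag (attributes are matched but discarded), or a run of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


def allowed(parent, child):
    """Return True if the table lets `parent` contain element `child`."""
    rule = RULES.get(parent, {})
    if rule.get("parsing-model") == "empty":
        return False
    if "can-contain" in rule:
        return child in rule["can-contain"]
    banned = rule.get("cannot-contain", ())
    return banned != "*" and child not in banned


def parse(markup):
    """Build a tree of (name, children) tuples; text nodes are plain strings."""
    root = ("#root", [])
    stack = [root]  # currently open elements, innermost last
    for match in TOKEN.finditer(markup):
        closing, name, text = match.groups()
        if text is not None:
            if text.strip():
                stack[-1][1].append(text.strip())
            continue
        name = name.lower()
        if closing:
            # Close up to the matching open element; ignore stray end tags.
            if name in [n for n, _ in stack[1:]]:
                while stack[-1][0] != name:
                    stack.pop()
                stack.pop()
            continue
        # Implicitly close open elements that may not contain this one.
        while len(stack) > 1 and not allowed(stack[-1][0], name):
            stack.pop()
        node = (name, [])
        stack[-1][1].append(node)
        if RULES.get(name, {}).get("parsing-model") != "empty":
            stack.append(node)  # empty elements are never left open
    return root


tree = parse("<select><option>One<option>Two</select><img src=a.png>tail")
```

With this table, the second <option> closes the first one because of "cannot-contain: *", and <img> is never pushed onto the stack of open elements, so the trailing text becomes its sibling rather than its child. Supporting another element then means adding a row to the table and leaving the parser code unchanged.
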
I would even give up some of HTML5's overly smart error handling in favor of being able to define various validating, processing and data mining tools using generic configurable parsers.

[1] http://www.w3.org/TR/2002/WD-css3-box-20021024/#L706

--
Andrew Fedoniouk.

http://terrainformatica.com
Received on Thursday, 8 January 2009 05:05:09 UTC