Re: An HTML language specification

On Sun, Nov 23, 2008 at 11:47 PM, Ian Hickson <ian@hixie.ch> wrote:
> On Sun, 23 Nov 2008, Jim Jewett wrote:
>> Ian wrote:
>> > For example, a WYSIWYG editor would need to
>> > know both the syntax and vocabulary conformance
>> > requirements, to output valid documents, as well
>> > as the parsing and rendering requirements, to
>> > show the right output.

>> It would only need the parsing requirements if it
>> imported existing non-conformant HTML.

> No, anything that parses HTML, even if it only
> parses compliant HTML, needs the parsing rules,
> since there are certain things that are
> surprising even with conforming HTML (e.g. how
> to determine whether a <script> block is in the
> <head> or the <body> when tags are omitted).

Circling back to this -- it isn't clear why such
omitted tags should be conforming (as opposed
to "accepted and corrected by full parsers").

The editor that doesn't import doesn't have to
worry about omitted tags.

And the editor that doesn't import invalid HTML
doesn't have to worry about keeping a list of active
formatting elements.
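
To make the omitted-tags case concrete, here is a rough sketch of the
<script> placement decision Ian mentions. It is a deliberate
simplification, not the spec's tree-construction algorithm: it assumes
the <html>, <head> and <body> tags are all omitted, looks only at start
tags (ignoring text, comments and end tags), and the names HEAD_ONLY and
locate_scripts are made up for illustration.

```python
# Elements that can stay in <head> when they appear before any body content.
# (Illustrative list; the real rules live in the parsing section of the spec.)
HEAD_ONLY = {"base", "link", "meta", "noscript", "script",
             "style", "template", "title"}


def locate_scripts(start_tags):
    """Say whether each 'script' in start_tags lands in 'head' or 'body'.

    start_tags is the document's start tags in source order, with
    <html>, <head> and <body> assumed to be omitted.
    """
    in_head = True
    placements = []
    for tag in start_tags:
        # The first tag that cannot live in <head> implies </head><body>.
        if in_head and tag not in HEAD_ONLY:
            in_head = False
        if tag == "script":
            placements.append("head" if in_head else "body")
    return placements
```

So for a document whose start tags are title, script, p, script, the
first script is in the head and the second is in the body, even though
nothing in the source says so. An editor that always writes the tags out
never has to make this inference.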

> The definitions of what is valid and what isn't can
> be quite involved, but yes. So?

For historical reasons, they are.  It isn't clear that
they should be for static documents (as opposed to
snapshots of the script-modified DOM).

>> The (error-recovery portion of the) parsing rules would
>> allow it to recover more gracefully and continue to
>> provide additional useful errors on the same run --
>> but they aren't strictly required.

> I might be more sympathetic to your position
> here if we had any validators at all that didn't use
> the error-recovery rules.

They tend to be lightweight debugging tools, rather
than published products.  I'll agree that an internal
testing tool doesn't *need* to be fully conformant,
but I see no reason to make that harder than it needs
to be.
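
As an illustration of the kind of tool I mean, here is a minimal sketch
of a checker with no error recovery: it stops at the first mismatched
tag instead of repairing the tree and carrying on. It assumes every tag
is written out explicitly (no omitted tags), checks nesting only, and
uses Python's standard html.parser as the tokenizer; StrictChecker,
FirstError and check are names invented for this example.

```python
from html.parser import HTMLParser

# Void elements never take an end tag, so they are not pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class FirstError(Exception):
    """Raised at the first nesting problem; there is no recovery."""


class StrictChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open element names

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax is tolerated only on void elements here.
        if tag not in VOID:
            raise FirstError(f"self-closing <{tag}/>")

    def handle_endtag(self, tag):
        # An end tag must match the innermost open element exactly.
        if not self.open_tags or self.open_tags[-1] != tag:
            raise FirstError(f"unexpected </{tag}>")
        self.open_tags.pop()


def check(source):
    """Return None if nesting is well formed, else the first error message."""
    checker = StrictChecker()
    try:
        checker.feed(source)
        checker.close()
    except FirstError as err:
        return str(err)
    if checker.open_tags:
        return f"unclosed <{checker.open_tags[-1]}>"
    return None
```

A tool like this reports one error per run, which is the cost of
skipping the recovery rules, but it is small enough to write in an
afternoon.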

-jJ

Received on Monday, 24 November 2008 22:46:56 UTC