Re: Entities (part of detailed review) from Thomas Broyer on 2007-08-02 (public-html@w3.org from August 2007)

From: Thomas Broyer <t.broyer@gmail.com>
Date: Thu, 2 Aug 2007 15:53:51 +0200
To: public-html@w3.org, "Ian Hickson" <ian@hixie.ch>
Message-ID: <a9699fd20708020653w7ea22c32ydd002834a0be1f44@mail.gmail.com>

2007/8/1, Thomas Broyer:
> >
> > Because of this, I think that there should be a note clarifying why some
> > entities are presented twice in the table and pointing to an appropriate
> > part of the parsing algorithm, probably 8.2.3.1 Tokenising entities
> > (http://www.w3.org/html/wg/html5/#tokenising).
>
> Can someone remind me why this hasn't been done with a third "Is
> semicolon required" column?

Proposed text for the first three paragraphs of the "anything else"
case in section 8.2.3.1, Tokenising entities:
"Consume the maximum number of characters possible, with the consumed
characters case-sensitively matching one of the identifiers in the
first column of the entities[#entities0] table and either the next
input character being a U+003B SEMICOLON (;) or the third column of
the entities[#entities0] table indicating a recoverable missing
semicolon for the matched entity name.

If no match can be made, then this is a parse error. No characters are
consumed, and nothing is returned.

If the next input character is a U+003B SEMICOLON (;), consume it;
otherwise, this is a parse error."
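
To make the intent of the proposed wording concrete, here is a rough
sketch of how a tokeniser might apply it. This is only an illustration
under assumptions of mine: the ENTITIES dictionary, its handful of
entries, the semicolon_optional flag values, and the consume_entity
function are all hypothetical and are not taken from the spec or from
any implementation. The input position is assumed to be just after
the "&".

```python
# Hypothetical three-column table: name -> (character, semicolon_optional).
# The entries and flag values are illustrative only, not the spec's table.
ENTITIES = {
    "amp": ("&", True),
    "lt": ("<", True),
    "not": ("\u00ac", True),
    "notin": ("\u2209", False),
    "hellip": ("\u2026", False),
}


def consume_entity(text, pos=0):
    """Try to consume an entity name starting at text[pos].

    Returns (character, new_pos, parse_error). When no match can be
    made, returns (None, pos, True): nothing is consumed and the
    caller is told about the parse error.
    """
    longest = max(len(name) for name in ENTITIES)
    # Try the longest candidate first so that the maximum number of
    # characters is consumed.
    for length in range(min(longest, len(text) - pos), 0, -1):
        entry = ENTITIES.get(text[pos:pos + length])  # case-sensitive
        if entry is None:
            continue
        char, semicolon_optional = entry
        end = pos + length
        if text.startswith(";", end):
            # Name followed by a semicolon: consume both, no error.
            return char, end + 1, False
        if semicolon_optional:
            # Third column allows recovery: accept, but flag the error.
            return char, end, True
        # Otherwise this name does not qualify; try a shorter one.
    return None, pos, True


print(consume_entity("notin;"))   # full name with semicolon
print(consume_entity("notin x"))  # falls back to "not", with a parse error
print(consume_entity("hellip "))  # no match under the assumed table
```

With the assumed flags, "notin" without a semicolon does not match, so
the search falls back to the shorter "not", which is the kind of case
the third column is meant to express without listing a name twice.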

-- 
Thomas Broyer

Received on Thursday, 2 August 2007 13:54:13 UTC