- From: Randy Waki <rwaki@sunscreen.whizbang.com>
- Date: Sat, 20 Nov 1999 17:10:20 -0700
- To: "Dave Raggett" <dsr@w3.org>
- Cc: <html-tidy@w3.org>
On Sat, 20 Nov 1999, Dave Raggett wrote:

> SGML/XML says:
>
>   good <!---->
>   bad  <!----->
>   bad  <!------>
>   bad  <!------->
>   good <!-------->
>
> weird isn't it!
>
> I will adjust the parser to trim trailing hyphens to the
> nearest legal number.

I believe this would be insufficient for XML. XML's comment syntax is a subset of SGML/HTML's. Production 15 in XML 1.0 says:

    Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

and the text says:

    For compatibility, the string "--" (double-hyphen) must not occur within comments.

This means that the characters between the opening <!-- and the closing --> cannot contain two consecutive hyphens. Also they cannot end in a hyphen (as per the BNF even though the text fails to mention it).

So for XML (as opposed to SGML/HTML):

    <!---->        good (empty comment)
    <!----->       bad (trailing hyphen)
    <!------>      bad (consecutive hyphens, trailing hyphen)
    <!------->     bad (consecutive hyphens, trailing hyphen)
    <!-------->    bad (consecutive hyphens, trailing hyphen)
    <!--- -->      good
    <!-- - - -->   good

For XML, Tidy could fix consecutive hyphens by examining the characters between the <!-- and the --> and replacing the first, third, etc. hyphen with a space and also replacing any trailing hyphen with a space. This should preserve much of the visual effect intended by people who use consecutive hyphens as dividers.

If you wanted to avoid a special case for XML, perhaps Tidy could make all comments conform to XML's stricter syntax. (The extra latitude allowed by SGML/HTML is small enough and obscure enough that I wonder if anyone would miss it.)

Randy
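
The XML rule and the suggested repair above can be sketched roughly as follows, in Python. This is only an illustration, not Tidy's code: the function names are made up for this sketch, and "first, third, etc. hyphen" is read here as applying within each run of two or more consecutive hyphens, so an isolated hyphen is left alone unless it is the trailing one. The check covers only the hyphen constraints of production 15, not the Char range.

```python
import re

OPEN, CLOSE = "<!--", "-->"


def comment_body(comment):
    """Return the text between '<!--' and '-->' (hypothetical helper)."""
    too_short = len(comment) < len(OPEN) + len(CLOSE)
    if too_short or not (comment.startswith(OPEN) and comment.endswith(CLOSE)):
        raise ValueError("not delimited by <!-- and -->")
    return comment[len(OPEN):-len(CLOSE)]


def is_xml_comment_ok(comment):
    """Hyphen rules only: no '--' inside the body, no trailing hyphen."""
    body = comment_body(comment)
    return "--" not in body and not body.endswith("-")


def _space_out_run(match):
    # Replace the 1st, 3rd, 5th, ... hyphen of the run with a space.
    run = match.group(0)
    return "".join(" " if i % 2 == 0 else "-" for i in range(len(run)))


def fix_xml_comment(comment):
    """Rewrite a comment so it satisfies the XML hyphen rules.

    Assumption: only runs of two or more hyphens are spaced out.
    The result has the same length as the input.
    """
    body = re.sub(r"-{2,}", _space_out_run, comment_body(comment))
    if body.endswith("-"):
        body = body[:-1] + " "  # trailing hyphen becomes a space
    return OPEN + body + CLOSE


if __name__ == "__main__":
    for c in ["<!---->", "<!----->", "<!-------->", "<!-- a -- b -->"]:
        print(repr(c), is_xml_comment_ok(c), repr(fix_xml_comment(c)))
```

Because each hyphen is replaced by exactly one space, a divider line keeps its width and roughly its look, which is the point of the suggestion above.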
Received on Saturday, 20 November 1999 19:12:15 UTC