- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Wed, 08 Feb 2006 09:32:53 +0100
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: www-validator@w3.org
- Message-Id: <1139387573.6694.122.camel@cumulustier>
On Wednesday, 8 February 2006 at 08:58 +0100, Bjoern Hoehrmann wrote:
> * Dominique Hazael-Massieux wrote:
> >When using the direct input form for validation with an FPI that the
> >system doesn't recognize, the validator defaults to SGML parsing,
> >even when there is an XML declaration at the top of the input. I think
> >the XML declaration should be a good enough hint to switch to
> >XML parsing.
>
>   <?xml version='1.0'?>
>   <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
>   <HTML LANG=de>
>   <HEAD>
>   ...
>
> That's perfectly legal HTML content. The textarea validation essentially
> assumes text/html input and since W3C refuses to define how to tell HTML
> and non-HTML text/html content apart, I'm not sure there is much we can
> do to resolve this, other than not assuming text/html. The question
> would then be what to assume, if anything.

I guess I was suggesting that a better algorithm than always assuming
SGML parsing for direct input would be the following:

* DOCTYPE known -> use the appropriate parsing mode
* DOCTYPE unknown
    -> XML declaration    -> XML validation
    -> no XML declaration -> SGML validation

Of course the XML declaration can legally be interpreted under SGML
validation, but since this is a case where you need more hints rather
than fewer, I think it's fairly safe to default to XML validation when
encountering an XML declaration.

Dom
-- 
Dominique Hazaël-Massieux - http://www.w3.org/People/Dom/
W3C/ERCIM
mailto:dom@w3.org
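For illustration, the decision procedure proposed in this message could be sketched roughly as follows. This is not the validator's actual code: the `KNOWN_DOCTYPES` table, the function name `choose_parse_mode`, and the regular expressions are all hypothetical, and treating input with no DOCTYPE at all as "unknown" is an assumption the message does not address.

```python
import re

# Hypothetical table mapping recognized public identifiers (FPIs) to a
# parsing mode; the validator's real catalogue is not reproduced here.
KNOWN_DOCTYPES = {
    "-//W3C//DTD HTML 4.01//EN": "sgml",
    "-//W3C//DTD XHTML 1.0 Strict//EN": "xml",
}

# An XML declaration must be the very first thing in the input.
XML_DECL = re.compile(r"<\?xml\s")
# Rough extraction of the quoted FPI from a DOCTYPE declaration.
DOCTYPE_FPI = re.compile(
    r"<!DOCTYPE\s+\S+\s+PUBLIC\s+([\"'])(.*?)\1", re.IGNORECASE | re.DOTALL
)


def choose_parse_mode(source, known=KNOWN_DOCTYPES):
    """Return "xml" or "sgml" for direct-input content."""
    match = DOCTYPE_FPI.search(source)
    fpi = match.group(2) if match else None

    # DOCTYPE known -> use the appropriate parsing mode.
    if fpi in known:
        return known[fpi]

    # DOCTYPE unknown (or, by assumption, absent): the XML declaration
    # is the remaining hint. Ignore a leading byte order mark.
    if XML_DECL.match(source.lstrip("\ufeff")):
        return "xml"
    return "sgml"
```

Under this sketch, the quoted HTML 4.01 example still gets SGML parsing despite its XML declaration, because its FPI is recognized; only input with an unrecognized FPI falls through to the XML declaration check.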
Received on Wednesday, 8 February 2006 08:33:47 UTC