Re: any idea why validator(-jp) would repeatedly fetch XHTML DTDs?

Hi Bjoern, thanks for your analysis.

On May 2, 2006, at 10:21, Bjoern Hoehrmann wrote:
> I suspect the modularization requests are thanks to third party DTDs
> (XHTML 1.1 plus target attribute and such) and the popular XHTML 1.1
> Strict document type.

Right, and similarly the infamous "-//W3C//DTD HTML 4.01 Strict//EN", I 
suppose, for /TR/html4/strict.dtd.
This does not yet explain the many requests for /TR/html4/loose.dtd and
the XHTML 1.0 DTDs... Are there really so many documents with typo'd
FPIs out there, besides "HTML 4.01 Strict"?

That sounds weird. I think I'm going to proceed with a little plan of 
mine, and do some stats on the document types of docs that were passed 
to the validator. That may even give us interesting statistics on what 
ratio of documents were broken *after* using the validator.
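
A rough sketch of the kind of tally I have in mind, in Python for
brevity. It assumes the submitted documents are available as strings;
the function names and the regular expression are illustrative only and
are not taken from the validator code, and the pattern only handles
double-quoted PUBLIC identifiers.

```python
import re
from collections import Counter

# Matches <!DOCTYPE root PUBLIC "FPI" ["SI"]; the SI is optional.
# Simplification: only double-quoted identifiers are recognised.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+\S+\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?',
    re.IGNORECASE,
)


def extract_doctype(markup):
    """Return (fpi, si) for the first DOCTYPE declaration, or None."""
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return None
    return match.group(1), match.group(2)


def tally_fpis(documents):
    """Count how often each FPI occurs across an iterable of documents."""
    counts = Counter()
    for doc in documents:
        found = extract_doctype(doc)
        counts[found[0] if found else "(no doctype)"] += 1
    return counts
```

Sorting the result with counts.most_common() would show quickly whether
typo'd FPIs are frequent enough to account for the DTD requests.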

> I am not sure merging is the right process here, but copying the FPI
> to SI maps over should work to some extent. I'm not sure how this would
> interact with the scary doctype detection code in the branches though.

It shouldn't have any impact on detection of whether a doctype was 
found or not, but it may have an impact on the choice of parse modes. 
Educating authors about the problem would be a good alternative,
too. We could:
- improve the text of the "parse mode" warnings (see the validation
results for most of the section at:
http://qa-dev.w3.org/wmvs/HEAD/dev/tests/#doctype_FPI_SI)
- add specific warnings for "-//W3C//DTD HTML 4.01 Strict//EN" and
XHTML 1.1 Strict, since both are common mistakes among markup authors :/
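
To make the second idea concrete, here is a minimal sketch of such a
specific warning, again in Python. The lookup table, the function name
and the message wording are all hypothetical, and the exact spelling of
the "XHTML 1.1 Strict" identifier is my guess at what authors type. The
only facts it relies on are that the real identifiers are
"-//W3C//DTD HTML 4.01//EN" and "-//W3C//DTD XHTML 1.1//EN".

```python
# Hypothetical table: commonly mistyped FPI -> the FPI probably intended.
LIKELY_INTENDED = {
    "-//W3C//DTD HTML 4.01 Strict//EN": "-//W3C//DTD HTML 4.01//EN",
    # Assumed spelling of the typo'd XHTML 1.1 identifier.
    "-//W3C//DTD XHTML 1.1 Strict//EN": "-//W3C//DTD XHTML 1.1//EN",
}


def fpi_warning(fpi):
    """Return a targeted warning for a known typo'd FPI, else None."""
    intended = LIKELY_INTENDED.get(fpi)
    if intended is None:
        return None
    return (
        'The public identifier "%s" is not a known document type. '
        'Did you mean "%s"?' % (fpi, intended)
    )
```

A correct FPI, or one we have no entry for, returns None, so the generic
"parse mode" warning would still apply in those cases.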

Thoughts?
-- 
olivier

Received on Tuesday, 2 May 2006 04:42:57 UTC