- From: Karl Dubost <karl@w3.org>
- Date: Tue, 3 Oct 2006 13:36:03 +0900
- To: Ian Hickson <ian@hixie.ch>
- Cc: www-qa@w3.org
On 30 Sept 2006, at 05:56, Ian Hickson wrote:
> On Fri, 29 Sep 2006, Karl Dubost wrote:
>>
>> It is why I have asked Ian Hickson for more details, because I really
>> think it is as important as the derived statistics which have been
>> published in the [previous survey][1]. When the sample is not given or
>> clearly identified it is really difficult to draw meaningful
>> conclusions.
>
> This is absolutely true. This is why the survey(s) haven't been published
> formally; due to the nature of the way in which the results were obtained,
> I can't write a scientific report.

1. True: "we can't draw meaningful conclusions". It is not suitable
   as a scientific report.

> The data was collected for the purposes
> of helping WHATWG's spec development work

2. Google has created the survey to help WHATWG.

> (I think all specifications
> should be written based on solid research of authoring practices, etc),
> and I consider the data to be suitably representative for that purpose.

3. The survey is "solid research of authoring practices".

> For other purposes, the data probably isn't useful as anything other than
> an idle curiosity, and I would not recommend treating it as anything but
> that.

I have a hard time connecting 1, 2 and 3 in a logical way.

> If you would like a more formal survey of the Web, I recommend
> commissioning your own. :-)

It is a good idea. Maybe I should ask TV Raman at Google whether
Google would agree to help us do that.

>> - DOCTYPE
>
> I'm not sure how you would define this; take this document, for
> instance:
>
>    http://damowmow.com/playground/html-or-xml.html
>
> What's the DOCTYPE?
>
> How about this one:
>
>    http://damowmow.com/playground/html-or-xml.xml

Do you mean there are plenty of these documents on the Web? Or are
these just corner cases that have been created to identify potential
problems?
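Ian's question "What's the DOCTYPE?" can be made concrete with a small sketch. It is not the survey's parser and not a real HTML or XML parser; it only assumes that a comment ends at "-->" in both readings, and that a processing instruction ends at ">" in an HTML-style reading but at "?>" in an XML reading. The names `SKIP` and `first_doctype` are hypothetical, and the line breaks in `DOC` are a guess, since the source of the test document is quoted in this message without them.

```python
import re

# The test document quoted in this message. Line breaks are a guess and
# are not significant for this sketch.
DOC = '''<?test ><!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html><?test ><!-- ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<?test --><?test?>
<head> <title>HTML or XML?</title> </head>
<body>
<p>Is this file HTML or XML?</p>
<p>Why, it's <?test > HTML <!-- ?> XHTML <?test --> <?test ?> of course!</p>
</body>
</html>
'''

# Assumption (simplified): what each reading skips over.
#   html: comments, and processing instructions that end at the first ">"
#   xml:  comments, and processing instructions that end at the first "?>"
SKIP = {
    "html": re.compile(r"<!--.*?-->|<\?[^>]*>", re.DOTALL),
    "xml": re.compile(r"<!--.*?-->|<\?.*?\?>", re.DOTALL),
}

# Captures the public identifier of a DOCTYPE declaration.
DOCTYPE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"')


def first_doctype(text, mode):
    """Return the public identifier of the first DOCTYPE left visible
    after removing comments and processing instructions for `mode`."""
    visible = SKIP[mode].sub("", text)
    match = DOCTYPE.search(visible)
    return match.group(1) if match else None


print(first_doctype(DOC, "html"))  # -//W3C//DTD HTML 4.0//EN
print(first_doctype(DOC, "xml"))   # -//W3C//DTD XHTML 1.0 Strict//EN
```

Under these assumptions the same bytes report two different DOCTYPEs, so a "DOCTYPE" statistic would depend on which reading the survey tool applied to each file.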
Using http://web-sniffer.net/:

GET /playground/html-or-xml.xml HTTP/1.1[CRLF]
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]

GET /playground/html-or-xml.html HTTP/1.1[CRLF]
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]

GET /playground/html-or-xml HTTP/1.1[CRLF]
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]

I have just put the source here.

################
<?test ><!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html><?test ><!-- ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<?test --><?test?>
 <head>
  <title>HTML or XML?</title>
 </head>
 <body>
  <p>Is this file HTML or XML?</p>
  <p>Why, it's <?test > HTML <!-- ?> XHTML <?test --> <?test ?> of course!</p>
 </body>
</html>
################

How many documents with this kind of structure have you found on the
Web?

> What's the DOCTYPE?
> If your answer was different for the two pages, then why was it different?
> The two pages are byte-for-byte identical. If your answer was the same,
> then why were they the same? Browsers treat the two very differently.

Your document is sent as text/xml, then as application/xhtml+xml, and
then as text/html if the first is not understood. Plus there is the
problem of encoding.

> (This is why my survey mostly ignored the DOCTYPE and instead just assumed
> HTML5 parsing rules.)

So Google has created a "WebApps 1.0 parser" for the purpose of the
survey? Is the code accessible somewhere? Was it a crawler? Was it a
parser working on files outside of their HTTP context?

-- 
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager, QA Activity Lead
QA Weblog - http://www.w3.org/QA/
*** Be Strict To Be Cool ***
Received on Tuesday, 3 October 2006 04:36:24 UTC