- From: Shelley Powers <shelleyp@burningbird.net>
- Date: Sun, 02 Aug 2009 08:16:50 -0500
- To: Maciej Stachowiak <mjs@apple.com>
- CC: Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>, Sam Ruby <rubys@intertwingly.net>, John Foliot <jfoliot@stanford.edu>, 'HTML WG' <public-html@w3.org>
Maciej Stachowiak wrote:
>
> On Aug 1, 2009, at 11:47 PM, Julian Reschke wrote:
>
>> Ian Hickson wrote:
>>> ...
>>>> Your sampling is flawed because it doesn't account for a
>>>> significant number of web pages that are not accessible to the public.
>>> Pages that are not part of the Web do not need to use a standard
>>> interoperable across the entire Web, they can use proprietary formats.
>> > ...
>>
>> Sorry? I think this is something we need to discuss. Just because a
>> web-based application only runs on an intranet doesn't mean it's
>> irrelevant. It just means it is harder to collect data about it.
>
> I don't think intranets are irrelevant, but they do raise an
> epistemological problem. People often claim that intranets have
> content with substantially different characteristics than the public
> Web, in particular respects. But in practice it is usually impossible
> to test this kind of hypothesis. That means these kinds of claims are
> not falsifiable and therefore not scientific.
>
> So we have three basic options: (1) ignore all data and make decisions
> purely based on armchair reasoning; or (2) by Occam's Razor, assume
> intranet content is much like public Web content unless specifically
> shown otherwise with concrete evidence; (3) ignore intranet content
> except when we can gather concrete data about its unique
> characteristics or special needs.
>
> I don't think #1 is the most rational of these choices.

But Maciej, you are ignoring data. I, and others, have pointed out, numerous times, that the use of HTML tables in the data collected was also incorrect. If you're making a conclusion about summary, only, you need to have a good collection of data that reflects good HTML table use, but bad summary use, and from what I can see, the data that's not been collected does not warrant such a conclusion.
In addition, I have tried to point out numerous discussions discovered in Google that demonstrate that the data in HTML tables in intranets, behind firewalls, could very well demonstrate good HTML table use AND good summary use.

It is somewhat anecdotal in nature, true, since it is culled from Google search engines. However, all of the data provided for the arguments against summary have been anecdotal in nature. None of it is derived from gathering data in controlled circumstances. It's all based on scraping a portion of web pages, with no real way of knowing how viable this scraping is when it comes to representing all uses of HTML.

You also don't take into account the fact that the web is both archival and current, which means that you can't differentiate between use of HTML table and summary in older, no longer maintained web pages, and pages that are actively being maintained. So the data is tainted because we can't really determine how people are using HTML tables or summary _today_.

The data is tainted in so many ways, making it too easily vulnerable to subjective interpretation, that I'm really surprised that people who espouse a scientific methodology would continue to rely on it.

Then you use a pejorative term such as "armchair", most likely to undermine the expertise of the people involved, which is also counter to typical scientific practice. Reasoned arguments have been provided. I have not seen any of them refuted.

>
> Regards,
> Maciej
>

Shelley
Received on Sunday, 2 August 2009 13:17:39 UTC