Re: Heuristic Tests for Data Tables (Discussion)

From: Robert Burns <rob@robburns.com>
Date: Tue, 4 Sep 2007 13:09:21 -0500
Message-Id: <B756A63B-251E-4532-8277-1C14325F0398@robburns.com>
Cc: Ben 'Cerbera' Millard <cerbera@projectcerbera.com>, Philip Taylor <philip@zaynar.demon.co.uk>, HTMLWG <public-html@w3.org>
To: James Graham <jg307@cam.ac.uk>

Hi James,


On Sep 4, 2007, at 12:43 PM, James Graham wrote:

>
> Robert Burns wrote:
>> Hi James,
>> On Sep 4, 2007, at 7:13 AM, James Graham wrote:
>>>
>>> James Graham wrote:
>>>> James Graham wrote:
>>>>>
>>>>> Ben 'Cerbera' Millard wrote:
>>>>>> I wonder how many tables can be made natively accessible? How  
>>>>>> many will need to be retrofitted by authors with <th> or  
>>>>>> scope="" or headers="" and how likely is that? I guess more  
>>>>>> studying (like Philip and I and others have done) and  
>>>>>> prototyping of implementations (like James Graham might do)
>>>>>
>>>>> There's some very early work on this available at [1] (only the  
>>>>> HTML4 algorithms are currently implemented). Due to a bug in a  
>>>>> html5lib serializer badness occurs when you give it a page  
>>>>> containing more than one table. I also haven't checked that  
>>>>> it's giving the correct results in almost any cases. However if  
>>>>> you want to report bugs feel free.
>>>> This now also has some work on the algorithm from the HTML 5  
>>>> spec, with similar caveats as before i.e. it is hideously under  
>>>> tested.
>>>
>>> I have now moved the table inspector to [1] and added an  
>>> "experimental" option which is currently based on the HTML 4  
>>> algorithm with some improvements:
>>>
>>>  * Optionally treats <td><strong> and <td><b> as headers
>>>  * Ignores headers with @scope set when looking for implicit  
>>> headings (to see why this is important look at the "Day 2" cell  
>>> in [4]).
>> It's looking good. However, I'm not clear what you mean by this
>> "Ignores headers with @scope set..."  Even when looking at the
>> "Day 2" cell it's still not clear to me. Could you say a little
>> more about what you mean?
>
> Imagine a table structure like the following
>
> th           1    |   td   2
> th scope=row 3    |   td   4
> td           5    |   td   6
>
> The question is which headers apply to the td cell 5? It doesn't  
> have any heading information set from explicit scope or headers  
> attributes so we fall back on the implicit algorithm. Per HTML 4  
> the implicit algorithm searches up the column and marks heading  
> cells as headers of cell 5. The "algorithm" given in the HTML 4  
> spec isn't clear about how the presence of @scope affects this  
> association. My initial reading was that @scope is not considered  
> so both cells 1 and 3 are considered headings for cell 5, whereas  
> excluding cells with @scope set, only cell 1 is a heading for cell  
> 5. Reading again, I suppose "Then search upwards to find column  
> header cells" is supposed to be taken to exclude headings with  
> scope="row[group]" set, which I didn't pick up on when I first  
> implemented this, but did change in the experimental version.

That is much clearer now with this example. It makes sense to me that
the HTML4 algorithm (even the basic algorithm) should take into
account the scoping of header cells whenever those are scoped.
However, I'm not so sure "ignoring" the cell would be the right word  
here. If a header cell is scoped to a row and the current algorithm  
is being applied to a data cell in the same column then it follows  
that the header cell doesn't apply to the present data cell. However,  
if the scope is set to column and the two cells share the same  
column, then they should indeed be associated. Likewise if the header  
cell's scope is set to rowgroup and the two cells share the same row  
group, then there too the cell should be associated.
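
To make the rule above concrete, here is a rough Python sketch of the
upward column search applied to James's three-row example. It is only
an illustration of the reading described in this message, not the
HTML4 text and not James's table inspector. The names Cell,
header_applies and column_headers are made up for this sketch, the
treatment of scope="colgroup" is my own assumption, and the HTML4
rule about where the search stops is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    label: str
    is_header: bool = False       # True for <th>, False for <td>
    scope: Optional[str] = None   # "row", "col", "rowgroup", "colgroup" or None
    rowgroup: int = 0             # index of the row group holding the cell


def header_applies(header: Cell, data: Cell) -> bool:
    """Decide whether a header found above `data` in the same column applies."""
    if not header.is_header:
        return False
    if header.scope is None or header.scope == "col":
        return True
    if header.scope == "colgroup":
        # Assumption: sharing a column implies sharing its column group.
        return True
    if header.scope == "rowgroup":
        return header.rowgroup == data.rowgroup
    # scope="row": the header describes its own row, not the column.
    return False


def column_headers(table: List[List[Cell]], row: int, col: int) -> List[str]:
    """Search upward from (row, col) and collect labels of applicable headers."""
    data = table[row][col]
    return [
        table[r][col].label
        for r in range(row - 1, -1, -1)
        if header_applies(table[r][col], data)
    ]


# James's example: cells 1 and 3 are <th>, and cell 3 has scope="row".
table = [
    [Cell("1", is_header=True), Cell("2")],
    [Cell("3", is_header=True, scope="row"), Cell("4")],
    [Cell("5"), Cell("6")],
]

print(column_headers(table, 2, 0))
```

For cell 5 this yields only cell 1, since cell 3 is scoped to its
row. Changing cell 3 to scope="col", or to scope="rowgroup" with all
cells in one row group, makes both cells 3 and 1 headers of cell 5.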

From the example you pointed to in [4], I wasn't clear how this new  
change to your algorithm came into play. In that case, the associated  
data cells all share the same rowgroup with the header cell and that  
header cell has the scope set to rowgroup. In that case it follows  
that the HTML4 algorithm would respect the scoping of the header cells.

> This is another example of why the spec has to be precise in its UA  
> requirements if we want interoperable behavior, even though that  
> precision will make sections of the spec less accessible to authors.

I don't think I've heard anyone complain about the UA-focussed norms.
The complaint is unrelated to this issue. Instead, issues have been
raised about not including the appropriate author norms. Omitting
those requires authors to reverse engineer the UA norms to uncover
what norms they will have to follow in order to create a document
that will work with an HTML5 UA.

Take care,
Rob

[4] http://annevankesteren.nl/2007/09/tmb-overview
Received on Tuesday, 4 September 2007 18:09:38 UTC
