- From: Jim Ley <jim@jibbering.com>
- Date: Tue, 22 Mar 2005 16:57:57 -0000
- To: <public-wai-ert@w3.org>
Hi,

Sorry for the lateness of this. This is my overview of the previous Fuzzy Pointer suggestions, and of other parts of locating the subject.

One of the big problems with inaccessible content is that it's also likely to be invalid content. Because of this, XPointers cannot be used: it's no good stopping testing simply because there's already a failure due to invalid content; we want to review everything. XPointers are also not defined for use with HTML content and, due to the way HTML parsers have been created, different implementations do not create the same DOM representation even for valid content.

Because of this we developed the idea of a Fuzzy Pointer, which was defined against the infoset created as a result of parsing an invalid document. This pointer was interoperable across many parsers, HTML renderers and validators: openSP, IE, Mozilla and Opera all created the same pointer on the same invalid documents. This allowed us to identify elements more reliably than just row/column in the source (that information is often not available, and is unreliable against minor changes in the source such as whitespace).

Fuzzy Pointers are also often persistent beyond changes that fix the HTML validation issues. This is an advantage, as it means we don't have to invalidate all the expensive checks.

For example, with this document:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html lang="en">
    <h1>Chickens!</h1>
    <title>Example Page</title>
    <img src="chicken.jpg" alt="[32324 bytes]">
    </body>
    </html>

There are two obvious errors: the document is invalid (title in the body) and the image doesn't have an appropriate alt.
Fixing the validation error could give us:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html lang="en">
    <title>Example Page</title>
    <h1>Chickens!</h1>
    <img src="chicken.jpg" alt="[32324 bytes]">
    </body>
    </html>

but the ALT error hasn't been fixed. Because the document has changed, we'd normally have no idea whether the error is still there; Fuzzy Pointers can be used to overcome this. Whilst this example is easy to test again, if the test is an expensive one then re-testing may not be a practical option.

There's another element we need though, since the document may have changed by more than is allowable, which would invalidate a result. For this we developed a number of hashes. These again were based on structure: they took a hash of the structure of the document, or of the structure plus the Hn element titles etc., and by seeing whether these change you can tell more about how far the document has changed. Whilst this doesn't guarantee that the other test results are still relevant, it allows the computer to decide which are most likely to still be valid.

Eek, time for the meeting...

Cheers,
Jim.
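To make the hash idea above concrete, here is a minimal sketch in Python. It is not the hashing scheme that was actually developed: the names `StructureCollector` and `structure_hashes`, the use of SHA-1, and the choice to hash the ordered list of start tags are all assumptions made for illustration. It produces two digests, one for the element structure alone and one for the structure plus the text of the Hn headings.

```python
import hashlib
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StructureCollector(HTMLParser):
    """Hypothetical helper: records start tags in order, plus Hn heading text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []        # element names in document order
        self.headings = []    # entries such as "h1:Chickens!"
        self._open_heading = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Attributes are deliberately ignored: only the structure counts.
        self.tags.append(tag)
        if tag in HEADINGS:
            self._open_heading = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == self._open_heading:
            # Collapse whitespace so reformatting the source is not a change.
            text = " ".join("".join(self._buffer).split())
            self.headings.append(f"{tag}:{text}")
            self._open_heading = None

    def handle_data(self, data):
        if self._open_heading is not None:
            self._buffer.append(data)


def _digest(parts):
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()


def structure_hashes(html):
    """Return (structure_hash, structure_and_headings_hash) for an HTML string."""
    collector = StructureCollector()
    collector.feed(html)
    collector.close()
    structure = _digest(collector.tags)
    with_headings = _digest(collector.tags + collector.headings)
    return structure, with_headings
```

Under these assumptions, changing only the alt text or the whitespace leaves both digests unchanged, rewording a heading changes only the second, and adding or reordering elements changes both. A stored result could then be treated as probably still valid, worth a second look, or stale, depending on which digests still match.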
Received on Tuesday, 22 March 2005 16:58:21 UTC