- From: Gabriele Bartolini <me@gabrielebartolini.it>
- Date: Thu, 05 May 2005 12:27:03 +0200
- To: Nick Kew <nick@webthing.com>
- CC: public-wai-ert@w3.org
Hi Nick,

thanks for your great contribution. I find the normalisation process extremely useful in some cases but, IMHO, impractical in others. I hope you can change my mind, in the very likely case that I have not fully understood your arguments.

I don't know whether you have already discussed this. If so, I apologise; however, I could not find any trace of it in the threads recently posted to the list. I also apologise for the length of this e-mail, but I promise it is very quick to read.

Normalisation "somehow" changes the original content of an SGML/XML document. I don't want to state the obvious, but normalisation is a one-way process, and going back from a normalised document to its original is very hard (unless all the changes are stored). This process could therefore affect how we locate the subject of an assertion, especially when we assert something like "an element is missing".

I want to clarify this with an example, and I hope we can discuss it. Let's suppose my aim is to compile statistics on the usage of the "tbody" element in a collection of HTML documents on the net. I want to use EARL to write a report containing assertions about all the documents that have been fetched and checked, together with the results (and maybe repeat it every quarter).

If the original documents do specify "tbody", I guess the normalisation process produces a structure that does not affect the location of the subject of my assertion. On the other hand, consider this document portion:

    [...]
    <table>
      <tr>
        <th>Country</th>
        <th>Population</th>
      </tr>
      <tr>
        <td>Italy</td>
        <td>57 millions (?)</td>
      </tr>
    [...]

My question is: would the normalisation process introduce the following change or not?

    [...]
    <tbody>
      <tr>
        <td>Italy</td>
        <td>57 millions (?)</td>
      </tr>
    [...]
    </tbody>

If it does, I think there could be problems when trying to detect the missing tbody in a document that has been normalised: the tbody now actually exists, because it has been artificially added.

So how would you detect this kind of problem using the normalised document? And are you still able to refer to the problem in the original document using a fuzzy pointer or XPath expression, given that these relate to the normalised document?

Thank you for your attention.

Ciao,
-Gabriele

Nick Kew wrote:
> Jim has given us very briefly his take on the normalisation problem.
>
> FWIW, there's a piece on the subject by Joe English at
> http://groups-beta.google.com/group/comp.text.sgml/msg/70ec0496587b03bb
> taken from an SGML viewpoint. He doesn't make any reference to HTML
> as such, but puts forward general rules. His analysis supports the
> view that <tbody> elements (along with the usual suspects <html>,
> <head>, <body>) should be inserted into the document tree where there
> is ambiguity.
>

--
Gabriele Bartolini: Web Programmer, IWA/HWG Member
ht://Check, ht://Miner and Wuhkag maintainer
Current Location: Prato, Toscana, Italia
me@gabrielebartolini.it | www.gabrielebartolini.it
> "Lasciate ogne speranza, voi ch'intrate", Dante Alighieri, Divina Commedia, Inferno
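The tbody question in the message above can be reproduced in a few lines. What follows is only a minimal sketch, not the behaviour of any particular normaliser: the `normalise` function is a hypothetical stand-in that models just the implied-tbody rule, it assumes the table fragment is well-formed XML, and it uses the limited XPath subset of Python's standard `xml.etree.ElementTree`. It returns the list of inserted elements alongside the tree, as one possible way of "storing the changes".

```python
import copy
import xml.etree.ElementTree as ET

# The table from the message, closed so that it parses as XML (an assumption).
SOURCE = """<table>
  <tr><th>Country</th><th>Population</th></tr>
  <tr><td>Italy</td><td>57 millions (?)</td></tr>
</table>"""


def normalise(table):
    """Hypothetical normaliser: wrap the direct <tr> children in one <tbody>.

    Returns the normalised copy and the list of elements that were added,
    so a checker can still tell implied elements from authored ones.
    """
    result = copy.deepcopy(table)
    inserted = []
    rows = [child for child in result if child.tag == "tr"]
    if rows:
        position = list(result).index(rows[0])
        tbody = ET.Element("tbody")
        for row in rows:
            result.remove(row)
            tbody.append(row)
        result.insert(position, tbody)
        inserted.append(tbody)
    return result, inserted


original = ET.fromstring(SOURCE)
normalised, inserted = normalise(original)

# "tbody is missing" is true of the original but not of the normalised tree.
print(original.find("tbody") is None)
print(normalised.find("tbody") is None)

# A pointer written against the normalised tree finds nothing in the original.
pointer = "tbody/tr[2]/td[1]"
print(normalised.find(pointer).text)
print(original.find(pointer))

# The record of inserted elements is what keeps the original state recoverable.
print([element.tag for element in inserted])
```

Under these assumptions, the path `tbody/tr[2]/td[1]` resolves only in the normalised tree, while the same cell in the original is reached with `tr[2]/td[1]`. Without the `inserted` list, nothing in the normalised tree distinguishes an authored tbody from an implied one.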
Received on Thursday, 5 May 2005 10:27:27 UTC