- From: Charles McCathieNevile <charles@w3.org>
- Date: Wed, 10 Apr 2002 10:50:39 -0400 (EDT)
- To: Steven Pemberton <steven.pemberton@cwi.nl>
- cc: Nick Kew <nick@webthing.com>, <www-annotation@w3.org>, <w3c-wai-er-ig@w3.org>, HTML WG <w3c-html-wg@w3.org>
So one approach the RE group could take is to define a document namespace which is in fact defined as the Tidied version of something, where there is a result defined for when Tidy just gives up.

A variation is to annotate a given document with an annotation type of "valid XML representation so we know what the XPointers refer to" or something, and make XPointers refer to that (and define it, also, as the result of applying Tidy or something, so the actual thing can be autogenerated).

Anyone want to make a server that does this?

chaals

On Wed, 10 Apr 2002, Steven Pemberton wrote:

From: "Nick Kew" <nick@webthing.com>
> > Therefore the answer to the question "what should an XPointer into HTML look
> > like?" is a very loud "it depends".
>
> Indeed. It depends on defining a canonical normalisation of HTML.
> If we can do that, we're fine.

And what I said is: that is a minefield onto which we [the HTML working group] do not want to step.

Real-world HTML documents are jokingly called "tag soup" for a reason. You take a goodly collection of HTML tags, stir them up, put them into a file, and publish it on the web. <style> elements before the <html> tag; <title>s outside the <head>; misspelled closing tags; misspelled opening tags; <ul>s with no enclosed <li>s; <li>s outside <ul>s. Imagine a combination of tags, and you will find a document that contains that combination. Even Tidy throws up its hands sometimes, and instructs you to go back and change the source file!

Finding a canonical normalisation of real-world HTML documents is not something the HTML WG feels inclined to spend its scarce time on.

Best wishes,

Steven Pemberton

--
Charles McCathieNevile  http://www.w3.org/People/Charles  phone: +61 409 134 136
W3C Web Accessibility Initiative  http://www.w3.org/WAI  fax: +33 4 92 38 78 22
Location: 21 Mitchell street FOOTSCRAY Vic 3011, Australia
(or W3C INRIA, Route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France)
Received on Wednesday, 10 April 2002 10:52:00 UTC