- From: olivier Thereaux <ot@w3.org>
- Date: Tue, 25 Oct 2005 11:07:16 +0900
- To: Nick Kew <nick@webthing.com>
- Cc: QA-dev Dev <public-qa-dev@w3.org>
Hi Nick,

Thanks for your notes.

On 24 Oct 2005, at 18:53, Nick Kew wrote:
> On Monday 24 October 2005 06:35, you wrote:
>> in http://lists.w3.org/Archives/Public/www-archive/2005Sep/0001
>
> Hmmm, I don't recollect that.

I think I only mentioned it on IRC one day; I had just done it to get familiar with making SAX filters.

> Hmmm. OpenSP is a SAX parser; libxml2 provides a SAX filter used in
> many of my tools (including AccessValet). Both work fairly well to
> generate document outlines. Or am I missing something?

Most likely I am the one missing something. But I realize I should probably have given more details in my previous mail, sorry about that. Here goes:

The current development state of check uses SGML::Parser::OpenSP instead of onsgmls, and as a result some of the features (including "raw errors display", outline and parse tree [1]) are gone.

[1] http://qa-dev.w3.org/wmvs/HEAD/check?uri=http%3A%2F%2Fwww.w3.org%2F&outline=1&sp=1&verbose=1&errors=1

Unless I am mistaken, it was mentioned in prior discussions that these could be re-enabled by making and using a few SAX filters (see [2]).

[2] http://esw.w3.org/topic/SoftwareProjects

I tried that, and on documents with well-formedness issues, my quick-and-dirty SAX filter-writer choked and gave up. Hence my questions.

> If you tried it with pure-XML SAX then of course it'll fall over on
> most of the web. I find libxml2's HTMLparser the easiest to use for HTML.
> Except in the context of _validating_ SGML/HTML, where of course OpenSP is
> the only show in town.

I guess this is where the answer to my questions lies. Do you mean that the SAX filters we would need to create a view of the outline or parse tree would not take as a source the actual document, but rather an event sequence from SPO, which would, unlike the source document, be well formed?

Thank you,
-- 
olivier
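
To make the question above concrete, here is a rough sketch of an outline builder that consumes parser events instead of re-parsing the source text. It is not the validator's code and does not use SGML::Parser::OpenSP. Python's lenient `html.parser.HTMLParser` stands in for the event source, and the names `OutlineHandler` and `build_outline` are made up for illustration. The only point is that a handler fed start/end/text events never needs the raw markup to be well formed.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineHandler(HTMLParser):
    """Hypothetical handler: collects (level, text) pairs from parse events."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text chunks seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._level is not None:
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(self._text).split())
            self.outline.append((self._level, text))
            self._level = None


def build_outline(markup):
    """Feed markup to the stand-in parser and return the collected outline."""
    handler = OutlineHandler()
    handler.feed(markup)
    handler.close()
    return handler.outline


if __name__ == "__main__":
    # Unclosed <p> and <em>: the handler only ever sees events.
    sample = "<h1>Title</h1><p>unclosed<h2>Sub <em>section</h2>"
    for level, text in build_outline(sample):
        print("  " * (level - 1) + text)
```

With a validating parser as the event source, the same handler shape would presumably apply; only the object feeding it events would change.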
Received on Tuesday, 25 October 2005 02:07:24 UTC