[00:39] --- xover has changed the topic to: W3C QA Tools || Markup Validator v0.6.7 Released || next regular meeting 2004-08-31 TODAY, NOW! :-) [00:49] on the agenda for today, we could have a "short chat" on m12n, possibly focusing on the common parts between check and checklink? what do you think? [00:50] Sounds good. [00:50] ditto. Bjoern had some ideas in his msg to list [00:50] Martin's recent work seems to suggest there is a lot more in common than I thought, or at least should be [00:50] We should also set aside a full five minutes for worshipping Björn. [00:50] yup [00:51] let me find reference [00:51] http://lists.w3.org/Archives/Public/public-qa-dev/2004Aug/thread.html [00:51] and http://lists.w3.org/Archives/Public/public-qa-dev/2004Aug/0016.html especially [00:53] apparently s:p:o is our first candidate for m12n of check... haven't done my homework and tried it yet, but the whole retrieval, charset, size and pre-parsing could be spun off "easily" too [00:55] scop, in http://lists.w3.org/Archives/Public/public-qa-dev/2004Aug/0015.html you wrote that you had already done some work in this direction [00:55] For the record, Björn has solved pretty much all the problems that were in the original first-cut of S:P:O in his (ground-up) rewrite. [00:56] There's some build and packaging stuff left to do, but mostly it's now only a matter of building trust in the code and then we can begin implementing it in "check". [00:57] yep, I've done something, but it's not online available yet. mainly familiarizing myself with Perl's XML::SAX [00:57] that's libxml2? [00:58] yep, there's a libxml2 "driver" for XML::SAX [00:58] Well, it's Perl-ified SAX. [00:59] what's the general feeling about using SAX in the first place? event driven things are generally good, but we need DOM'ish things also anyway... [01:00] I'm not too fond of SAX (or event-driven designs in general), but it does seem the sensible approach and I'm willing to give it fair shake. 
[01:00] We can produce a DOM from the SAX event stream if we have to. [01:01] scop: context? I find SAX very useful, and do produce DOM from the SAX stream [01:01] s/DOM/pseudo-DOM or DOM/ [01:01] of course, but given that we absolutely _need_ DOM, do we _need_ SAX, or should we try with DOM only? [01:01] or even co-dom [01:01] niq: http://lists.w3.org/Archives/Public/public-qa-dev/2004Aug/0016.html [01:02] For what do we _need_ DOM? [01:03] xover, see the 0016.html link above, at about 60% of that message [01:03] co-dom works nicely for reporting, fwiw [01:03] what's co-dom [01:04] --> [01:04] like xslt [01:04] You're thinking in terms of XPath/XPointer? [01:04] yes [01:04] + bjoern had some other requirements [01:05] I can see ways it might be possible to do both with SAX... [01:05] multi-pass? [01:06] what can't you get from sax? [01:07] Unroll the path beforehand, and keep a stack of (potential) seen targets. e.g. [01:08] EXPN(unroll the path)? [01:08] scop: how near would XMLReader be to your needs? [01:08] Programmatically create event handlers from the path, and apply them to the event stream. [01:09] niq: which XMLReader do you mean? [01:10] xover: I'm pretty sure I'm not following [01:10] XMLReader API [01:10] streamed pull API, with XPath/etc [01:11] pointers? that may be news to me. [01:11] http://xmlsoft.org/html/libxml-xmlreader.html [01:12] The point is, 1) we /may/ be able to do all we need with SAX, 2) we can generate a (sufficient) DOM from SAX if absolutely needed, 3) OpenSP produces an event stream so returning a DOM there would need an additional layer (which can be added in later if we think we need it). 
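The "unroll the path, keep a stack" idea from [01:07]-[01:08] can be sketched roughly as follows. This is only an illustration in Python's standard `xml.sax`, not the Perl XML::SAX code under discussion; the `PathMatcher` class and `find` helper are hypothetical names, and only plain absolute paths such as `/html/body/a` are handled (no predicates, axes, or namespaces).

```python
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Collect elements whose chain of open ancestors equals a simple path."""

    def __init__(self, path):
        super().__init__()
        # "Unroll" the path once, up front: "/html/body/a" -> ["html", "body", "a"]
        self.steps = [step for step in path.split("/") if step]
        self.stack = []    # names of the elements that are currently open
        self.matches = []  # attribute dicts of the elements that matched

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.steps:
            self.matches.append(dict(attrs.items()))

    def endElement(self, name):
        self.stack.pop()


def find(path, xml_bytes):
    """Run the matcher over a document in a single streaming pass."""
    handler = PathMatcher(path)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches
```

Because the handler only compares the stack of open elements against the unrolled steps, no tree is ever built; anything that needs sibling or descendant lookups would have to add state or fall back to a DOM.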
[01:13] niq, too much to digest now (will need to run RSN) [01:13] The original VisVal did that [01:13] In fact, mod_validator does a modified version of it [01:14] so we have some relevant code [01:14] IOW, I think starting out aiming for SAX is a good idea. If we run into trouble we can do however much or little DOM we need after the fact. And we can do it at a sufficiently low level that there will be no added overhead from the DOM-later compared to the DOM-now approach. [01:15] ok, fine with me [01:16] * xover feels he should emphasize the "I think" bit of the above, but decides this should be blindingly obvious to all concerned... :-) [01:17] In particular, I've never been familiar with checklink's code so from that perspective I may be talking out of my ass. [01:17] On the subject - libxml2 uses same underlying parser for SAX and DOM, and gets the best benchmarks [01:17] plus it has an HTMLparser - though that's nowhere near validating [01:19] FWIW, I've benchmarked tested my initial Perl SAX filters with different parsers, XML::SAX::ExpatXS is #1, libxml2 2nd [01:19] Isn't expat still extremely limited in what it does compared to almost anything else out there? [01:20] CTBCPP? [01:20] xover: yes and no. It's very compact, but can be built on [01:20] libxml2 does much more within [01:21] xover, expat only does parsing and namespaces, libxml2 does validation, dom, xpath, etc. i.e., expat is a parser, libxml2 a toolkit [01:21] (eg expat+sablotron does dom, xslt etc but is much slower than libxml2) [01:21] yup, I needed only SAX+namespaces for now [01:21] Hmm. It's more capable than it was last I checked it seems. [01:22] BTW, I would strongly encourage everyone to have a good long look at Björn's SGML::Parser::OpenSP! [01:22] and expat+namespaces is a lower-level API than SAX2 [01:23] S::P::O is at http://cvs.sourceforge.net/viewcvs.py/spo/spo [01:23] That's moving forward from yours, yesno? 
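The "DOM later, on top of SAX" approach from [01:14] might look something like the sketch below. It is written against Python's standard `xml.sax` purely for illustration; `Node`, `TreeBuilder`, and `build_tree` are made-up names, and the result is a minimal pseudo-DOM (elements, attributes, text), not a W3C DOM implementation.

```python
import xml.sax


class Node:
    """A very small pseudo-DOM element."""

    def __init__(self, name, attrs, parent=None):
        self.name = name
        self.attrs = attrs
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(xml.sax.ContentHandler):
    """Turn the SAX event stream into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.current = None  # innermost element that is still open

    def startElement(self, name, attrs):
        node = Node(name, dict(attrs.items()), self.current)
        if self.current is None:
            self.root = node
        else:
            self.current.children.append(node)
        self.current = node

    def characters(self, content):
        if self.current is not None:
            self.current.text += content

    def endElement(self, name):
        # Closing an element makes its parent the current node again.
        self.current = self.current.parent


def build_tree(xml_bytes):
    builder = TreeBuilder()
    xml.sax.parseString(xml_bytes, builder)
    return builder.root
```

Since the builder is just another event handler, it can be attached only when a consumer actually needs a tree, which is the sense in which the DOM-later route need not cost more than DOM-now.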
[01:23] As long as we have to deal with HTML (as opposed to only strict XML-based stuff), it will be a determining factor for what we can do and how we can do it. [01:23] It's a ground-up rewrite of my nasty hack. [01:23] aha [01:23] its feature complete except ->halt() [01:23] And it builds on Win32! [01:23] *gasp* [01:24] yes, opensp means sax has to be the basis of anything multi-parser [01:24] * xover notes we can fake DOM before passing stuff back to Perl if we really need to... [01:25] natch [01:26] ok, regrets, but I'm already late... need to go now. I'll try to get the SAX bits I have online somewhere for public beating this week [01:28] * bjoern_ notes there are Perl modules that build a DOM from SAX events if we really need full DOM [01:26] BTW, scop, since you're a bit closer to the Fedora process than I am... Does it look likely that anything will happen with the openjade/opensp split for FC3? [01:27] <-- scop has quit ("bye") [01:28] Tim Waugh is the Red Hat guy who owns the OpenJade component, and hence is responsible for implementing our RFE to split the OpenJade and OpenSP packages in Redhat/Fedora/RHEL. [01:29] (RHL/Fedora/RHEL includes OpenJade with OpenSP embedded; so you can't upgrade OpenSP without conflicting OpenJade). [01:29] * niq notes that there are C and C++ modules that do ditto [01:30] Fedora Core 3 is likely to become RHEL 4.0; or something close to it. [01:31] And with an 18 month cycle, and 5 years maintenance life, it'll be a while until our next opportunity to effect the split comes up. [01:32] what influence does scop have there that you wanted to draw on? [01:32] Meaning OpenSP 1.5.2 (with the memleak-fixes from Björn) and S:P:O won't make it into FC until FC4, and RHEL until 5.0. [01:33] how about deb? [01:33] scop was involved with Fedora.us -- third-party packager for RHL -- and since Fedora.us became the official community organization for RHL when it became Fedora Core, he's sitting pretty close to the action. 
[01:34] I haven't seen hide nor hair of Frederic Schutz for a good long while now. [01:34] But he was still tracking us with his Debs around 0.6.5... [01:34] right, me neither... was wondering about the upcoming release [01:35] ACTION: yod to go on a quest to find Frederic Schutz and check his status...? [01:35] * niq has the ear of one or two deadrats through apache stuff, but I don't know if any of them would have heard of Open[SP|Jade] [01:36] As scop said, it's too late for FC3; so I think I'll make a new push when FC4 development goes live, and probably with OpenSP 1.5.2. [01:37] Hmmm. Probably a good idea to get this on the "stuff to not forget about list"... [01:38] ACTION: xover, scop; go pester RH Bugzilla about getting openjade/opensp packages split for FC4... [01:39] http://lists.w3.org/Archives/Public/www-validator/2004Aug/0241.html [01:39] can anyone confirm my diagnosis? And if so, we need to report it better [01:40] * bjoern_ suggests niq to file a bug report in W3C bugzilla [01:40] fairy nuff [01:40] the diagnosis was speculative .... [01:40] It sounds plausible. [01:41] It's what I remember from porting libwww-perl to PHP... [01:41] We might also want to take the opportunity to investigate how we might implement a HTTP checker into the validator. [01:41] aha. [01:42] it would start with `use HTTP::Checker` or something like that [01:42] Most likely, yes. [01:42] does HTTP::Checker exist? [01:42] probably not [01:42] LWP is not so good for that; it corrects (or tries to correct) too many things [01:43] true [01:43] * xover implements WHAT WG checker: if ($FPI =~ m(WHATWG)) {print "Go bug Hixie; your doc will never validate"} [01:43] xover++ [01:43] that was fast [01:43] I'll write the test suite [01:43] * yod would be interested in working on such a thing as HTTP::Checker [01:44] Well, there's an existing service that might be useable; but perhaps all we need is basic sanity checks, built on top of LWP. [01:44] yeah [01:44] URI of existing service? 
[01:44] Similarly for the mnot cacheability checks and such. [01:45] bjoern_: I can't recall OTTOMH. Never used it but seen it referenced. [01:45] aha, ok [01:45] cg-eye (old) [01:45] there is a http-compliance list on yahoogroups, FWIW [01:46] updating that has been on the todo list for years .... [01:46] * niq considers yahoogroups too painful to use [01:47] one of the active QAIG participants wrote an HTTP test framework [01:47] for clients? [01:47] That may be what I'm thinking of... [01:47] yod: url? [01:48] yes, that's a company. closed software, but he would probably be happy to help any work promoting better HTTP [01:48] (Alex Rousskov?) [01:48] oh, a tool not a spec [01:48] yes - http://www.measurement-factory.com/ [01:48] Eeek! [01:49] they have - um - interesting views on HTTP [01:49] (http://groups.yahoo.com/group/http-compliance/) [01:49] with reference to apache's bugzilla, his reading of HTTP differs from mine in many regards [01:49] they know their stuff pretty well. what arguments they use to sell their product might be another issue [01:50] niq - I see [01:53] The value in their product seems to be in their test case collection, not the testing framework. IOW, they're unlikely to give up the crown jewels. We'd need to implement this from scratch if we were to go that route. [01:53] example: http://issues.apache.org/bugzilla/show_bug.cgi?id=15861 [01:55] There's not really so much to check in the validator context [01:55] * bjoern_ suggests to postpone HTTP compliance testing integration to when there is something to integrate... [01:56] Any heavyweight tool has to be based on formulating requests+responses: a higher-level frontend to something-like cg-eye [01:56] we are probably more interested in checking syntax in response headers [01:56] Yeah, what I had in mind was more stuff like the bogus Size and various other header field garbage. 
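The kind of response-header syntax check mentioned at [01:56] could start out as small as the sketch below. It is an assumption-laden illustration, not the proposed HTTP::Checker (which, per the discussion, probably does not exist): `check_headers` is a hypothetical helper, it works on raw header lines already split by the caller, and it only covers field-name syntax plus two well-known fields.

```python
import re

# Characters allowed in an HTTP "token" (field names, media type parts).
TCHAR = r"[!#$%&'*+.^_`|~0-9A-Za-z-]"
TOKEN = re.compile(rf"{TCHAR}+")
MEDIA_TYPE = re.compile(rf"{TCHAR}+/{TCHAR}+")
DIGITS = re.compile(r"[0-9]+")


def check_headers(lines):
    """Return a list of human-readable problems found in raw header lines."""
    problems = []
    for line in lines:
        name, colon, value = line.partition(":")
        if not colon:
            problems.append(f"no colon in header line: {line!r}")
            continue
        value = value.strip()
        lowered = name.lower()
        if not TOKEN.fullmatch(name):
            problems.append(f"invalid field name: {name!r}")
        elif lowered == "content-length" and not DIGITS.fullmatch(value):
            problems.append(f"Content-Length is not a number: {value!r}")
        elif lowered == "content-type" and not MEDIA_TYPE.match(value):
            problems.append(f"Content-Type lacks type/subtype: {value!r}")
    return problems
```

Working on the raw lines matters here: as noted at [01:42], a client library that silently repairs headers would hide exactly the garbage such a check is meant to report.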
[01:56] yeah, fairy nuff [01:57] Full HTTP testing would probably be a better fit for a completely standalone test suite (with included testing framework). [01:57] but people don't even understand that text/plain isn't html, so what hope? [01:57] well, their problem is that they cannot use our service because of some sub-optimal browser behavior... [01:58] (which they cannot easily fix) [01:58] You thinking of file upload now? [01:58] I wasn't .... [01:58] well, not in particular [01:58] I could make the same point about sub-optimal servers/scripts... [01:59] bjoern_: do you use IIS at all? [01:59] the problem is probably rather that people are not aware of HTTP at all [01:59] niq, little... [02:00] Can it support HTTP HEAD? Is failure to do so a config error or total brokenness? [02:00] It can support HTTP HEAD properly, yes [02:00] thanks [02:00] those IIS installations that do not are probably broken by ISAPI scripts [02:00] yes it does HTTP_HEAD fine [02:00] I got similar behavior from mod_php some years ago [02:01] * niq preparing a report on a site that sends full bodies with every HEAD [02:01] IIS seems to encourage its users to reimplement HTTP facilities in broken ASP... :-( [02:01] like the apaches that break with some tomcat servers on top. [02:01] and it's cfm ... [02:01] cold fusion yeah, that's awful http! [02:01] (http://bugs.php.net/bug.php?id=5885 my bug report) [02:02] JibberJim: several of apache's big addons tend towards brokenness [02:11] also http: someone (could be me or yod) to list checks we should make and report to users [02:12] * xover proposes we AI this to niq, and get him to follow up on the list? [02:12] speaking of feedback, I haven't received any on the test suite/catalogue proposal [02:12] * niq could summarise what valet reports as startingpoint [02:12] niq: sure, if you can do that we can develop from there [02:12] yod: testsuite url? [02:13] http://qa-dev.w3.org/tests/ [02:13] The test suite... 
I'm more interested in building the test suite for spo at the moment; meaning the CPAN/Perl test framework, but probably some of the same tests. [02:14] bjoern_: fairy nuff; I had in mind a post, .... [02:14] The presentation of our existing tests was very nice, but I had trouble getting really interested in it as a testsuite/framework. [02:15] I probably need to look at it in more detail and try harder to grok it, but it's not a top priority right now. [02:15] niq, http://www.w3.org/mid/1551F3C2-E4FD-11D8-B4D9-000A95E54002@w3.org [02:15] problem with our existing tests was that they were impossible to manage [02:16] so I wanted to keep the same structure (repository of files, creating an html file with links to the docs and ,validate) but also have a catalogue of them so that we can e.g. easily find a subset of them, or add test cases outside of these files [02:17] a good start at feedback would be saying whether this is completely off the mark wrt our needs or not [02:18] ok, will thinkabout [02:18] great, thanks [02:18] I think it's probably a good idea; it's just that I'm focussing on completely different issues right now so I have nothing very sensible to say about it. [02:19] xover, that's fine [02:19] "not much to say about it right now" is already feedback, better than full silence at least [02:20] although if you have time to have a look at it, that's obviously better [02:20] my question would be how http://qa-dev.w3.org/tests/ brings us closer to a `make test` test suite for the code in check [02:21] What may pique my interest is relating it to tests from ... /me acks bjoern_ ... spo. Can we conceivably autogenerate its t/* from your code? 
[02:21] we can easily generate a list of cases that should be {in}valid [02:22] special cases such as UI test will still need to be run by hand [02:23] I was planning to generate these and diff the validation results for two check versions [02:23] so all /tests/ currently provides is something to the effect of ` :isa :validDocument`? [02:23] not only [02:23] ACTION: yod to do `perldoc Test::More` and engage wetware neural net to the problem of generating that from his test collection? [02:24] bjoern_, not necessarily [02:25] the system can be expanded to specify what component is tested [02:25] but...? [02:25] /tests/ includes a lot of _implied_ tests; each test is abstract, and tests a lot of UI and processing in addition to valid/invalid. [02:26] Hmm. Do we agree that with m12n each module would be responsible for testing and that test suite for the web interface, i.e. check, would do little or no testing of their functionality? [02:28] Except possibly invoking their respective test suites (if that happens to be a convenient way to do it), yes. [02:28] well, I think it would be fair to expect that their test suites would be run on their install time [02:29] then I think we should have a road map for what code `check` will be responsible for and thus what to test in a `check` test suite [02:29] test suites for other components would be sorted out during the development of the component [02:30] no? [02:30] This does not preclude a central repository of tests, from which individual tests and test groups are derived. [02:30] check tests should run live over http, and should be compatible with other tools like valet/wdg [02:30] why? [02:30] (insofar as they perform the same task) [02:31] we're talking about validating the validator [02:31] Tests of "check" as such would be mostly GUI (i.e. mostly manual), and possibly with some testing of conneg and such. Maybe even testing whether all output is, ehm, Valid. 
:-) [02:32] xover, fairy nuff [02:32] This does not preclude a central repository of tests -> precisely... the catalogue as I presented it uses "valid y/n" whereas it could say "component x"+"fail/pass" [02:32] niq, you were thinking of "valid or not" tests? [02:32] indeedie [02:32] Hmmm. [02:33] There may be a place for a collection of "Valid y/n?" tests that tests an entire service; in addition to, e.g., similar tests for e.g. spo. [02:34] yes, and a validator-validator tool that applies them [02:34] Something that might be run against all v.w.o frontends (CGI, PHP, SOAP, etc.) and against Valet and WDG to check that everyone agrees. [02:34] which as we know is not-always ... [02:34] Right. [02:35] I think checking with other services is out of scope but we should design it so that this is made simple [02:35] we should only validate against our expectations [02:35] agreed. [02:36] like, this is not well-formed but we know the validator does not know that and will thus fail the test which would be "ok" [02:36] so we could actually release something that does not pass all the tests but is consistent with our expectations [02:36] bjoern_: that's a layer on top of the basic right/wrong [02:36] true [02:37] A test that produces a wrong but expected result is a passed test case. We'll probably need that at the spo level too. [02:38] * yod notes the idea of "expected" could be rather thought of as the comparison with a "reference" test result [02:38] different modeling, same result [02:39] I think we should try to avoid writing tests for external code, i.e., the s::p::o test suite should be concerned only with the XS code, not the OpenSP code, e.g., [02:39] yeah, fairy nuff [02:39] Perhaps. But SPO might expected-fail a test, while XML::Parser:... would expected-pass the same test. [02:39] if OpenSP has limitations in its XML support, the OpenSP test suite should test for it, not the XS test suite [02:40] this is getting big ... 
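The "expected result per component" model from [02:35]-[02:39] can be illustrated with a small sketch. Everything here is hypothetical: the catalogue layout, the component names, and the `run_catalogue` helper are invented for the example and do not describe the actual catalogue at qa-dev.w3.org.

```python
def run_catalogue(catalogue, component, checker):
    """Return ids of tests whose outcome differs from the stored reference.

    A test passes when the component produces its *reference* result, even
    if that result is known to be wrong in absolute terms.
    """
    regressions = []
    for test in catalogue:
        expected = test["expected"].get(component)
        if expected is None:
            continue  # this test does not apply to the component
        actual = checker(test["document"])
        if actual != expected:
            regressions.append(test["id"])
    return regressions


# Hypothetical catalogue: one reference result per component and test.
CATALOGUE = [
    {"id": "t1", "document": "<p>ok</p>",
     "expected": {"strict": "valid", "lenient": "valid"}},
    # Not well-formed; the lenient component is *known* not to notice.
    {"id": "t2", "document": "<p>unclosed",
     "expected": {"strict": "invalid", "lenient": "valid"}},
]


def strict_checker(doc):
    """Toy stand-in for a real parser: compares tag counts only."""
    return "valid" if doc.count("<p>") == doc.count("</p>") else "invalid"


def lenient_checker(doc):
    """Toy stand-in for a component with a known limitation."""
    return "valid"
```

With this modeling, t2 passes for the lenient component because it matches its reference, which is the "wrong but expected is a pass" rule; a release is then blocked only by differences from the reference, not by known limitations.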
[02:40] * xover is inclined to special-case this for SPO/OpenSP for various reasons, but agrees in the general case... [02:41] what would be okay is a higher-level general purpose test suite, like, say, XML::Tests that can be utilized by all XML processors [02:41] bjoern_: have you done anything for spo? [02:41] niq, no [02:41] yod: Your test model, with the "reference" result, would need to keep a reference for each component the test applies to. No? [02:43] for each component we'd store a reference test results (EARL or something else) of all applicable tests, yes [02:44] yod: SQL would be good as bottom layer, so we can run queries [02:44] EARL next layer up, or a query-result option [02:45] Hmmm. It seems to me that this topic needs some time to mature before we get too deep into it. Perhaps focus on v.w.o testing for yod's test framework, while we concentrate on individual tests for the bits of m12n (SPO, to start) and then look at how they overlap when we have more specifics to work from? [02:45] * niq wonders if he's at cross-purposes with yod; just reread [02:46] xover++ #need to review what yod's done first [02:46] xover++ especially given what time it is :) [02:46] I suggested to start with a roadmap earlier [02:46] aha [02:46] m12n roadmap, or test roadmap? [02:47] Roadmap for Testing [02:47] I could put ideas expressed here together as a start, into the wiki [02:48] wiki:TestingRoadmap++ [02:48] hmmm [02:48] http://esw.w3.org/topic/ValidatorTestSuite [02:48] aha [02:49] I had completely forgotten about that one :) [02:49] I think we need more active list-discussion on this, suggest ACTION: all to contribute to that [02:49] at least I'm consistent in the wish to have that in a wiki [02:49] yod, it's linked from http://esw.w3.org/topic/MarkupValidator :) [02:50] I can see that :) [02:50] * xover suggests ACTION for yod or bjoern to clean up wiki with an eye to making all pages findable from a common entry point. 
[02:51] But I'm not a big wiki user so it might be me being dense again. [02:51] anyway yes for cleanup [02:51] I'll have to re-read this anyway [02:52] * bjoern_ suggests moving ValidatorTestSuite to MarkupValidator/Testing [02:52] and +1 for more discussion on list as people give the demo a look [02:52] And probably a good idea to make sure the qa-dev.w3.org front page is useful and contains the relevant links. [02:54] * xover decides to bug scop about adding remote-repos support of some kind to FreeBSD-CVSWeb and setting it up on qa-dev... [02:54] okay, any other business? [02:56] I'm thinking of adding a tracker-bug for 0.7, and sticking blockers on (possibly new) bugs to create a TODO/plan for a 0.7 release. [02:56] * bjoern_ always scared when xover brings up such management stuff... [02:57] Primarily because I hope it'd help me be more focused on what's needed for 0.7 to happen. [02:58] better start with completing your AI on sending mail about issues in current merged codebase [02:58] But also because we never seem to get todo.html updated, and we should be more explicit (and discuss more) what goes / does not go for any specific release. [02:58] bjoern_: that's a good start, yes. [02:59] Any comments on that? ("*shrug* Whatever floats your boat" is a valid response) [03:00] in principle agreed, in practice dunno how we should balance such a wish for cleanliness with limited time [03:01] We need to get m12n done to a reasonable extent before we should start thinking about roadmaps for new features and stuff [03:01] As mentioned, I'd hoped that approach might make me, at least, be more focussed on the release; and hopefully make it easier to pick off remaining issues and get it out the door faster. [03:03] Might make it easier for the rest of you to chip in on getting it done too, BTW. Clearer way to note an issue that needs addressing, and a single place to find a suitable task to do. 
[03:04] if that helps you and you're not overdoing it ;), then yeah, go for it [03:04] a bit of organization in bugzilla won't do harm [03:04] I want to get `check` using HTML::Template code out ASAP and support anything that helps that [03:05] Ok then. Unless anyone objects I'll do it as an experiment for myself. The rest of you shouldn't be affected by it unless you want to be. [03:06] next meeting [03:06] 14 Sep [03:08] adjourned [03:09] soon we are on 3 hours... [03:09] bjoern_: You won't be around until next week? [03:09] not much [03:09] 3 hrs. Yeah. Maybe we might entertain the idea of weekly meetings, with every other week being an ad-hoc interim meet. [03:10] well, we hadn't had a meeting in a while [03:10] Us IRC dweebs being the ones that get the vague chatter done on those, saving a more fleshed agenda (and hopefully shorter time) for the "real" meetings?