- From: Dan Brickley <danbri@w3.org>
- Date: Mon, 24 Jun 2002 09:14:48 -0400 (EDT)
- To: <w3c-wai-er-ig@w3.org>
WAI/ER
http://www.w3.org/2002/05/er-swade-f2f
Evaluation and Repair Tools Working Group Meeting
in conjunction with SWAD-Europe Workshop
24-26 June 2002, Bristol, U.K.

Partial Notes from morning session 2002-06-24, danbri@w3.org
(personal notes rather than minutes; thought might be useful)
--------------------------------------------------------------

(about 11am after I stopped fiddling w/ wireless LAN, and started taking notes)

wendy (summarising): test spec -- leave to qa
  ...granularity issues to investigate

dave pawson -- test needs to be clear, objective, black and white answer.

TEST CORPUS
eg Nick and Jim's work...

Jim: important to have some data, to experiment with combining data

Nick: w/ valet tests and high/low/med confidence, a little arbitrary,
  would be better to put numbers, eg. say that when valet fails on a test,
  would like to say 'x% of cases that fail this test are assessed by a
  human as failing that checkpoint'.

wendy:...
  i) ...first is a test suite, test cases we can run eval tools thru
  ii)...we need a ton of earl data, see how well merging etc goes

libby: how much earl do we have?

wendy: can generate a lot, but no big stores, repositories

nick: Jim and I experimented with using Annotea's db, but that wasn't
  entirely ideal

charles: ...try out the trust use case: see the results from valet,
  verify?, utag...
  ...and experiment w/ trust rankings

What tools do we have that we can use to generate these results?
 - valet, verify, utag, ....

at end of today, have folk leave with an action to test 3 different things,
some folk to test the same things, so we have a corpus of things we expect
to be interop. eg.
there's no expectation that the CSS netscape test suite would be useful
to merge/interop with say wcag test suite, (just cos both using earl)

wendy: just one overlap point -- does the page work w/out stylesheets

nadia: eric's earl annotea db not quite what we need, would need eric to
  hack stuff

danbri: what extra features/requirements would we have of an rdfdb

nadia: he added/improved attributions
  ...kind of tricky to write algae by hand

danbri: i'd love to see someone write up what EARL would need from
  generic RDF tools.

davep: danbri, do you know what rdf db support 4suite has?

danbri: I think an sql backend, but not sure

action?: danbri + wendy to talk to eric

davep: wendy, to clarify, when you talk about data, you mean test results?

wendy: yes

davep: ...so tests, plus environment plus

libby: in swad-europe, we have a workpackage where we'll be setting up a
  database, probably annotea

[discussion of a database installation subgroup]
  wendy -- take responsibility within er wg
  charles -- swad-e workpackage lead
  danbri+libby+nadia(+ericp?) too

charles: 508 results... interesting cos big overlap between 508 and WCAG
  ...can work on trust mechanisms

wendy: josh ? + chris r? created 278 test files

nick: I had a brief look... these test some v specific things but have
  important omissions. Narrow tests...

wendy: I looked, they had some interesting tests. Eg. a file with just one
  image that has alt="insert alt text here". Good, to see if tools look
  for such placeholder text.

nick: that's not such a bad example... but if a test picks up such a case,
  likely has hard coded knowledge...
results would need interpreting with care

wendy: because there are authoring tools that generate that test

nick: ok, if there are tools that do that

wendy: that's why that test was created, yup

charles: test suite would be a bunch of html pages w/ specific errors
  i'm vaguely concerned there'll be tools optimised for the test suite
  but if we were to test on that test suite and 2 or 3 live sites with
  each tool, would be a bit more reassuring.
  Running against an unanticipated site alongside a std test suite would
  give us more confidence.

wendy: we're not testing the results of the data, but getting some data.
  WCAG goals (with other hat) need this data.
  goal is to have a complete test suite for WCAG cos we're going to CR.

nick: if you're doing it like that, I could for eg make Valet 100%
  compliant w/ the test suite without actually making it any better!

jim: good start would be manual testing to be followed by mech tests

[missed some comments]

charles: any reason not to use the test suite as one of the things we test [...]

Wendy: one use case, wcag, pretty much covered. potentially a lot of earl
  for that specific use case. what about the other use cases?

nickg: as soon as I work out whether/how earl fits my reqs

wendy: Implementation reports? good start anyway. Got a place to store it,
  tools that generate it.

danbri: asked about schedule/calendar for EARL... stability etc

wendy: go to TR working draft soon, hopefully towards a note within a few
  months.

davep: some things lacking from spec... a schema/dtd/etc for the format so
  I can sit down with it. The spec doesn't make clear which things are
  formally part of the vocab/namespace, or just examples.
  ...an rdf schema doesn't quite tell me what actual earl docs should
  look like

----
EARL Spec

nadia: earl is basically tree structured, so we could represent it like
  this in the spec.
  separate out structure from properties(?)
  for class and property overviews...
there is the structure of the evaluation: you have subject/predicate/object
and the assertion
  ...and each of those has the class that is associated with them
  the way the thing is written now, isn't clear which classes are core

wendy: I added tables this time, in pretty random order. Could re-do so
  that things that are associated are closer together.

nadia: I was planning to try a reorg

[action] nadia to work on this after the meeting

danbri: DTDs/schemas can give you a file format that also parses as RDF

ACTION: danbri to send refs to DC, RSS (also to MaxF) w3c-wai-er-ig

====break for lunch
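
A rough sketch of the number Nick asked for in the TEST CORPUS discussion: for each test, what percentage of the cases a tool fails are also judged by a human to fail the checkpoint. Everything here is an assumption for illustration: the record layout, the test names, the outcomes, and the function name `confirmed_failure_rate` are all made up, and this is not how Valet or any EARL tool actually stores or reports results.

```python
from collections import defaultdict

# Hypothetical paired results: (test_id, tool_result, human_result).
# Test names and outcomes are invented for illustration only.
records = [
    ("img-alt-placeholder", "fail", "fail"),
    ("img-alt-placeholder", "fail", "fail"),
    ("img-alt-placeholder", "fail", "pass"),
    ("img-alt-placeholder", "fail", "fail"),
    ("img-alt-placeholder", "pass", "pass"),  # tool passes are ignored
    ("table-headers", "fail", "pass"),
    ("table-headers", "fail", "fail"),
]


def confirmed_failure_rate(records):
    """Per test: percentage of tool failures that a human also marked as failing."""
    tool_fails = defaultdict(int)
    confirmed = defaultdict(int)
    for test_id, tool_result, human_result in records:
        if tool_result != "fail":
            continue  # only cases the tool failed count towards the ratio
        tool_fails[test_id] += 1
        if human_result == "fail":
            confirmed[test_id] += 1
    return {t: 100.0 * confirmed[t] / n for t, n in tool_fails.items()}


rates = confirmed_failure_rate(records)
for test_id, pct in sorted(rates.items()):
    print(f"{test_id}: {pct:.0f}% of tool failures confirmed by a human")
```

A figure like this could stand in for the high/low/med confidence labels, but it is only as good as the corpus of human-assessed pages behind it, which is the point of collecting the test data discussed above.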
Received on Monday, 24 June 2002 09:14:49 UTC