danbri's notes, 2002-06-24 (morning)

	Evaluation and Repair Tools Working Group Meeting
	in conjunction with SWAD-Europe Workshop
	24-26 June 2002, Bristol, U.K.

Partial Notes from morning session 2002-06-24, danbri@w3.org

(personal notes rather than minutes; thought might be useful)


(about 11am after I stopped fiddling w/ wireless LAN, and started
taking notes)

wendy (summarising):
test spec -- leave to qa
...granularity issues to investigate

dave pawson -- test needs to be clear, objective,
black and white answer.


eg Nick and Jim's work...

Jim: important to have some data, to experiment with combining data

Nick: w/ valet tests and high/low/med confidence, a little arbitrary,
  would be better to put numbers, eg. say that when valet fails on a
  test, would like to say 'x% of cases that fail this test are assessed
  by a human as failing that checkpoint'.
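
(editorial sketch) Nick's suggestion amounts to estimating, per test, the
share of tool failures that a human also judges as failing the checkpoint.
A minimal sketch of that calculation follows, assuming a hypothetical list
of (test_id, tool_result, human_result) records. The function name, record
layout and sample data are illustrative assumptions, not anything from
Valet or EARL.

```python
from collections import defaultdict


def failure_agreement(records):
    """For each test, return the fraction of tool failures that a human
    also assessed as failing the checkpoint.

    `records` is a hypothetical list of (test_id, tool_result, human_result)
    tuples, where each result is the string "pass" or "fail".
    """
    tool_fails = defaultdict(int)
    confirmed = defaultdict(int)
    for test_id, tool_result, human_result in records:
        if tool_result != "fail":
            continue  # only cases the tool flagged as failing are counted
        tool_fails[test_id] += 1
        if human_result == "fail":
            confirmed[test_id] += 1
    return {t: confirmed[t] / n for t, n in tool_fails.items()}


# Made-up sample data, for illustration only.
sample = [
    ("img-alt", "fail", "fail"),
    ("img-alt", "fail", "pass"),
    ("img-alt", "pass", "pass"),
    ("table-summary", "fail", "fail"),
]
rates = failure_agreement(sample)
```

A rate computed this way could stand in for the high/low/med confidence
labels, provided enough human-assessed cases exist for each test.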

  i) ...first is a test suite, test cases we can run eval tools thru
 ii) ...we need a ton of earl data, see how well merging etc goes

libby: how much earl do we have?

wendy: can generate a lot, but no big stores, repositories

nick: Jim and I experimented with using Annotea's db, but that wasn't
   entirely ideal

 ...try out the trust use case: see the results from valet, verify?,
  ...and experiment w/ trust rankings
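
(editorial sketch) One simple way to experiment with trust rankings is to
give each source a weight and let weighted pass/fail votes decide each
(page, checkpoint) pair. This is only a sketch under stated assumptions:
the weights, the record layout and the function name are hypothetical, and
the weighting scheme is one arbitrary choice, not something the group
proposed.

```python
def merge_with_trust(assertions, trust):
    """Combine pass/fail assertions about the same (page, checkpoint) pair.

    `assertions` is a hypothetical list of (source, page, checkpoint, result)
    tuples with result "pass" or "fail"; `trust` maps a source name to a
    weight. Sources missing from `trust` contribute nothing.
    """
    scores = {}
    for source, page, checkpoint, result in assertions:
        weight = trust.get(source, 0.0)
        signed = weight if result == "pass" else -weight
        key = (page, checkpoint)
        scores[key] = scores.get(key, 0.0) + signed
    merged = {}
    for key, score in scores.items():
        if score > 0:
            merged[key] = "pass"
        elif score < 0:
            merged[key] = "fail"
        else:
            merged[key] = "undecided"
    return merged


# Made-up weights and results, for illustration only.
trust = {"valet": 0.5, "verify": 0.25, "human": 1.0}
assertions = [
    ("valet", "page1", "cp1", "fail"),
    ("verify", "page1", "cp1", "pass"),
    ("human", "page2", "cp1", "pass"),
    ("valet", "page2", "cp1", "fail"),
]
merged = merge_with_trust(assertions, trust)
```

With these made-up weights, the higher-weighted source wins each
disagreement; an exact tie is reported as undecided rather than guessed.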

What tools do we have that we can use to generate these results?

 - valet, verify, utag, ....

at end of today, have folk leave with an action to test 3 different
things, some folk to test the same things, so we have a corpus
of things we expect to be interop.

eg. there's no expectation that the CSS netscape test suite would
be useful to merge/interop with say wcag test suite, (just cos both using

wendy: just one overlap point -- does the page work w/out stylesheets

nadia: eric's earl annotea db not quite what we need, would need
 eric to hack stuff

danbri: what extra features/requirements would we have of an rdfdb

nadia: he added/improved attributions

...kind of tricky to write algae by hand

danbri: i'd love to see someone write up what EARL would need from
  generic RDF tools.

davep: danbri, do you know what rdf db support 4suite has?

danbri: I think an sql backend, but not sure

action?: danbri + wendy to talk to eric

davep: wendy, to clarify, when you talk about data, you mean
	test results?

wendy: yes

davep: ...so tests, plus environment plus

libby: in swad-europe, we have a workpackage where we'll be setting
 up a database, probably annotea

[discussion of a database installation subgroup]

wendy -- take responsibility within er wg
charles -- swad-e workpackage lead
danbri+libby+nadia(+ericp?) too

 508 results...
 interesting cos big overlap between 508 and WCAG
 ...can work on trust mechanisms

wendy: josh ? + chris r? created 278 test files

nick: I had a brief look... these test some v specific things but
   have important omissions. Narrow tests...

wendy: I looked, they had some interesting tests. Eg. a file with
   just one image that has alt="insert alt text here". Good, to see
   if tools look for such placeholder text.

nick: that's not such a bad example... but if a test picks up
   such a case, likely has hard coded knowledge... results would need
   interpreting with care

wendy: because there are authoring tools that generate that text

nick: ok, if there are tools that do that

wendy: that's why that test was created, yup

charles: test suite would be a bunch of html pages w/ specific errors
   i'm vaguely concerned there'll be tools optimised for the test suite
   but if we were to test on that test suite and 2 or 3 live sites with
   tool, would be a bit more reassuring. Running against an
   unanticipated site alongside a std test suite would give us more

wendy: we're not testing the results of the data, but getting some
   data. WCAG goals (with other hat) need this data.
   goal is to have a complete test suite for WCAG cos we're going to

nick: if you're doing it like that, I could for eg make Valet 100%
   compliant w/ the test suite without actually making it any better!

jim: good start would be manual testing to be followed by mech tests
   [missed some comments]

  any reason not to use the test suite as one of the things we test

  one use case, wcag, pretty much covered. potentially a lot of earl
for that specific use case. what about the other use cases?

  as soon as I work out whether/how earl fits my reqs

wendy: Implementation reports?

good start anyway. Got a place to store it, tools that generate it.

danbri: asked about schedule/calendar for EARL... stability etc

wendy: go to TR working draft soon, hopefully towards a note within a
  few months.

davep: some things lacking from spec... a schema/dtd/etc for the
   format so I can sit down with it. The spec doesn't make clear which
   things are formally part of the vocab/namespace, or just examples.

   ...an rdf schema doesn't quite tell me what actual earl docs
   should look like



nadia: earl is basically tree structured, so we could represent it like
in the spec. separate out structure from properties(?)

for class and property overviews...
there is the structure of the evaluation: you have
 and the assertion

...and each of those has the class that is associated with them

the way the thing is written now, isn't clear which classes are core

wendy: I added tables this time, in pretty random order. Could re-do
 so that things that are associated are closer together.

nadia: I was planning to try a reorg

[action] nadia to work on this after the meeting

danbri: DTDs/schemas can give you a file format that also parses as RDF

ACTION: danbri to send refs to DC, RSS (also to MaxF)

====break for lunch

Received on Monday, 24 June 2002 09:14:49 UTC