- From: Sean B. Palmer <sean@mysterylights.com>
- Date: Sat, 14 Jul 2001 02:42:06 +0100
- To: <w3c-wai-er-ig@w3.org>
I wonder if EARL 1.0 could come out of EARL 0.95 by Wu Wei? I don't mean by not doing anything to 0.95, but rather making it more natural for people to use. Effort is the thing that keeps any new language down. Languages are nothing without implementations, just as natural languages are nothing without native speakers. The problem that we face is that with limited resources, how do we come up with a language that is incredibly easy for people to get into, learn, use, understand, implement, develop, contribute to, evolve, define, and so on. One of the most important steps was defining it in RDF. Although misunderstood by so many, RDF had the potential to let us create a model however we wanted, and also a syntax in which people could express the model. When N3 came along, it helped tenfold for people to express what was going on. But is RDF really the answer? Tools like CWM have shown that general Semantic Web processing tools can get the EARL jobs done... but CWM is it. It's the only Semantic Web tools available that's worth anything, and it's demonstration code. So what alternatives are there? There's straight XML. We already know that that is just XML with a layer of abstraction that will make it practically impossible for us to agree on anything. Meta *models* are more important than meta syntaxes. Syntax is a pain, but it's not important in the design phase. The model is everything. UML is one suggestion. This would allow the OOP programmers amongst us to come up with class bases and code to enable people to process EARL. It would also still link in with RDF. However, I don't really know the first thing about implementing UML and what that entails (which should not stop ERT from pursuing it). Apart from that, I really don't know what else there is to implement in. Clearly, our report format needs to be in an XML format of some kind. It's the de facto language. The way we came up with EARL 0.95 is that Daniel wrote a list of stuff down based on the outcome of our F2F meeting, and after some refinements, and input from Aaron, we ended up with an RDF Schema. But did we stop to question if the context => { content result criteria } is the most efficient way of going about it? Of course not, we just stuck with it and ended up with 0.95. The design of EARL, based on the requirements (ease of use, machine procesability, extensibility, etc.) are as follows (in note form):- * Able to attach context to particular assertions. Contexts should be mergable and queryable * A standard evaluation. What does an evaluation mean without context? You always have the triple "MyPage passes/fails SomeCriteria", but what is this mean? Let's go through a couple of likely scenarios:- 1) I have a page that I want to promote as being WCAG-A accessible. Perhaps I want user agents to recognize this and let the user know in some way 2) I've produced a new tool and want people to send in bug reports that I can store in a database For 1), the context is fairly important. What we need to know is if the author of the evaluation is lying, and whether or not we can repeat the experiment ourselves. It is useful to know who ran the test (it should be the author of the page, and do we trust the author?). It is useful to know exactly what form of compliance the author is talking about. It is useful to know what time and for what content the evaluation was made (has the page changed since then?). It is useful to link to some online script (through GET) that will perform the same test, if possible. I think that EARL 0.95 can provide for all of this right now. For 2), the context is important in a different way. The tool provider wants to know who is making these bug reports so that they can a) track management of the issue, b) track any fixes the assertor has posted, c) notify the assertor that the bug has been fixed. Important details include: author name, and context information. Not necessary if the author wants to remain anonymous. The context information is here being used not for *trust*, but for *feedback* and *management*. Trust is implicit - people rarely file random bug reports, and even if they do, there is no way to monitor that. This is out of scope for EARL. So, context information is important. However, in 2), we find that context information is not required, and that it is out of scope for EARL to consider why context should be required all of the time. IMO only, of course. The assertion itself (the main part of the evaluation) is something that will rarely be disputed. In p(s, o) notation:- validity(content, criteria) or, rather:- type(?x, validity) (type(?y, content), type(?z, criteria)) But in EARL 0.95 there are some subtle deviations from this model that are kind of fudged. Especially the stuff on the criteria side that doesn't really belong there - repair information and so forth. We call the criteria side a "TestCase", but I'm not so sure what that is anymore. The properties to do with criteria explicitly are:- earl:testId earl:testSuite earl:testCriteria earl:suite earl:id earl:excludes earl:level the pseudo junk is:- earl:testMode earl:repairInfo earl:purpose earl:operatorInstructions earl:reproducableStep What does it mean to say that something passed a test mode? Is that not in the wrong place? But, where should this information go? These are particular properties of an assertion. The assertion had this test mode, the assertion generated this repair information, the assertion serves this purpose, the assertion has these operator instructions, the assertion has this reproducable step. This makes the assertion a first class object in itself, and greatly complicates things w.r.t. reification and so forth. For the purposes of continuing discussion, I am classing these properties (informally!) as assertion result properties. For example, it begs the question - when is an assertion the same assertion as another? Or, more specifically, are two assertions the same when they have the same set of assertion result properties hanging off of them? What if one has none, and another has some? Clearly, the assertion result properties are once again linked to the context information for the overall evaluation itself. However, I think this time, the connections aren't all that clear. Let's take the "testMode" property as an example. When we run an assertion, we say that it had a certain test mode. If the test mode was automatic, this might infer that the test can be reproduced cleanly. Let's say I validate my page on the W3C validator. The assertion that the validator outputs is automatic. The assertion should also contain a link back to the URI that it came from, so that it can be run again, or whatever. Is this information inherent to the assertion itself? I think not; I would still say that it is a property of the assertion, not part of the assertion itself. In this respect, EARL 0.95 has a bug. I also don't think that this was a part of the context information. The context information is one level up from the actual test itself, it says who ran the test. Who tested this page? Can I trust that they really tested that page? The test mode information is something which is asserted by this context. It is not a part of the context itself. So, we have a few parts to the evaluation model now. We have a context. We have the assertion. And then we have properties about that assertion, which should also be linked back to the context. To digress slightly back to how we're going to teach EARL, I think we're going to have to lead by example. By showing how EARL can successfully model real-life situations and facilitate these, we can impart the collective wisdom of EARL much better then pointing them at some crummy class diagrams and saying "ooh, look at this wonderful model". Tutorials of the kind "here is how to add EARL to your page" are wonderful, and that's the type of thing we can be looking into for EARL 1.0. For EARL 0.95, I suggest we keep on the same lines as we're doing at the moment. We can reuse much of this for EARL 1.0 anyway. Back to the model. I cannot help but think of the model that I've just been through in terms of a nodes and arcs thing. I simply can't help but see it that way! So here's a strawman:- evaluation = assertor asserts assertion . assertor context_properties information . assertion assertion_properties information . Here, I'm trying to get rid of the anonymity that EARL gave to things. The reification can still occur, but the statement should have an ID. I can't think of it being any other way, now that I've thought about things for EARL 1.0. For example, in EARL 0.95 we said:- :Sean :asserts { :MyPage :passes :MyTest } . Using this strawman (which should not be referred to as EARL 1.0 - this is an intermediary state), it would be more like:- :Sean :asserts :x . :x :s :MyPage; :p :passes; :o :MyTest . We could even (and this is where it gets rather tricky...) fake, or somehow sub class, reification. O.K., we're already sub classing statements in EARL 0.95, but why not refine the reification properties as well? This would mean that as RDF changes, we can just update the definitions without having to change the URIs. I'm not sure it'll work, and I'm not sure how much confusion it will create in RDF people, but it's something worth investigating. How do you sub class rdf:predicate? That's going to be messy. Back to the concrete. The syntax really isn't important. What we are developing is the *model* and the *vocabulary* people can reuse the model without using the vocabulary (I tried this with HumanML, and it worked quite well with very few changes!). People can reuse the vocabulary without reusing the model. But then people use the vocabulary and model together, that is EARL. EARL M&V? He he he :-) Actually, that brings me back to the status of EARL. I've always wondered just what on earth EARL is? Is it just some experiment by the ERT WG? Then again, it's nice that the process doesn't matter. I feel quite free to say anything, experiment anyhow, and contribute as much or as little as I want without process getting in the way. I can say things which would be considered quite bizarre in other circles, and yet they're often paid serious attention, attention that is thankfully deserved in many cases. But I digress again. The model and vocabulary. I hope somewhat that the model will be *repurposable*; that is to say, we shall be able to transform it to make it grokable by humans as well as machines. XSLT is a great tool, so we should be able to use that so long as EARL is expressible in an XML format. It would also be nice to investigate what other kind of evaluation languages there are out there, what properties they use, and so on. Maybe some of them can be transformed into EARL? That would be a good test of independent invention. EARL 1.0 will hopefully be a thing of beauty, simplicity defined, and powerful, but not to the point of ambiguity. A tool that can be used in all sorts of environments, for all sorts of applications. So anyway, in conclusion, we have a lot to think about for EARL 1.0, if any of us can be bothered. I'm not too sure how enamoured people really are with the concept of EARL, but it seems to be one of those unavoidable things. The two things that I've found most annoying is watching EARL run around in circles, and not having the OOP knowledge to develop some real tools for EARL. The Semantic Web thing is annoying too. The Semantic Web is heading off in odd directions, and I feel that the projects are diverging and either getting lost, or absorbed by other competition. The SW isn't some mega killer-ap because it's so empheral, and yet if a few more really good programmers took the time to develop some well-documented and stable inference engines, we'd be laughing. That's really all we need. EARL could use them to tremendous advantage - we could almost label them as EARL engines. All we'd need to do is write filters and stuff. It would be nice if the W3C could install some RDFQL-type database for us to play around with using EARL. I'm sure we could come up with some really neat structured queries and so forth that would make for some really great results. I think you can even produce Web pages and so forth from that type of stuff. I don't know how to draw more people into thinking about EARL. Apathy about stuff that requires work seems great, as Wendy will tell you after some time in getting people to write/organize WCAG techniques :-) I don't want EARL to turn into one of those things that you start discussing with people and they make their excuses and leave; and yet I fear that has happened on many occasions already. Unfortunately, no matter what we do, it's always going to be over promotion, or under promotion. But perhaps the main worth from EARL will not be in the results of the experiment, but the method by which we got there. The interconnectedness of all things means that EARL might already have had some incredible and wonderful ramifications that we are not aware of. I've certainly learned a lot from it - I knew very little about RDF when I started tinkering with EARL, and although I've still got a lot to learn, at least I have the capability to develop other RDF based languages and so forth. But results are nice. The bookmarklet thing is fun, and I can't wait to use a real EARL tool, and check out how it performs and so on. Oh for just one person to come across the ERT site and be super-interested in EARL... EARL 1.0? Let's do it! -- Kindest Regards, Sean B. Palmer @prefix : <http://webns.net/roughterms/> . :Sean :hasHomepage <http://purl.org/net/sbp/> .
Received on Saturday, 14 July 2001 03:55:42 UTC