- From: Sean B. Palmer <sean@mysterylights.com>
- Date: Thu, 13 Dec 2001 00:26:27 -0000
- To: <w3c-wai-er-ig@w3.org>
Sounds like it should be a simple thing, doesn't it? But we had all that stuff to consider: years of background discussions on identification and resources, the XPointer nightmare, equivalence measures, trying to manage big sacks of terms, and so on. Defining "stuff", and deploying a consistent language with which people can make claims about what they want, isn't easy. But I really want to stabilize EARL; to say, "here, it's done, go away, stop bothering us". And we can, if we follow a plan...

There are some terms in EARL that simply do not change - they have remained consistent since the EARLiest drafts for 0.9. These include terms such as "earl:asserts", "earl:passes", and so on. These are core terms; they are convincingly stable. On the next level are the model terms that should have been consistent, but may need to change because of the "identification" discussions - terms like "earl:testSubject", which I shall come back to. On the last level are the utility terms, stuff like "earl:name", terms that we can sprinkle liberally into the vocabulary space: packing the schema.

We expect languages to be able to evolve, but when they do, there is always some trade-off. Natural languages especially evolve rapidly, taking on colloquial forms and integrating cultural idioms into the mainstream of the language. With programming and computer data-oriented languages, we have to make the changes "jerkier", since tools are not as clever as humans... they don't expect things to change, and when they do, they don't know what to do about it.

Part of the vision of the Semantic Web was to ease the pain of new versions of languages. Part of the reason for choosing RDF in the first place for EARL was so that we could upgrade more easily. However, in practice, it doesn't quite work like that, for two reasons:

* there are few Semantic Web tools;
* people will always want to create non-SW tools that can still read EARL.

So there is a certain tension between the two ways of using EARL. AFAICT, the latter method is bound to be more popular.

As languages evolve, the grammar and the structure change. EARL is no exception to that rule. The aim is to let people add extensions to the language, but make sure that these extensions can still be recognized well enough by current agents. This comes under the umbrella of two phrases: forwards compatibility, and partial understanding. It is fairly easy to ground these in terms of the EARL model.

Let's take the example of the result properties. These are the properties that say whether or not something has "passed", "failed", or whatever. We had facilities in 0.95 for customizing the result properties, perhaps adding a new type of result, or confidence levels. However, we were thinking of dropping them, since they would probably not be supported until some point way off in the future, and by then would break current tools. Still, it should be possible to give tools some hope of recognizing the new properties. Of course, from the Semantic Web/RDF POV it's incredibly easy - not worth giving a second thought in some cases - but we want to approach it from the POV of the general EARL user. What do they have to sniff in order to come to conclusions about new result properties? Currently, the validity of a result property is the most important part. If EARL clients could simply search for the validity of any new property that they did not understand, then it would be possible for them to work out roughly what is going on - partial understanding. Let me give an example.
The usual kind of model is the following:-

   :Sean earl:asserts { :MyPage earl:fails :MyCheckPoint } .

Now, we know (because it's a standard fact of the EARL language, in 0.95) that earl:fails has a validity of "fails". This kind of information should be built into the clients, so that when they come across an extension, they can roughly compare it to what they currently know. Let's say that someone adds a property that lets them give the confidence of a result. Such an example might follow the following format:-

   :Sean earl:asserts { :MyPage :kindaFails :MyCheckPoint } .
   :kindaFails earl:validity earl:Fail;
      blargh:confidence :Low .

[Ignoring the fact that confidence levels were part of the language - imagine we took them out, or didn't have them in the first place. I can't predict what other extensions might be made, otherwise I'd add them now].

A processor which didn't understand ":kindaFails" should be made to look up the validity of this property. It could find "earl:Fail", and roughly conclude that the page fails the checkpoint. It's not particularly accurate, *but* sometimes getting along is preferable to totally breaking. And so there is the tension of just how much EARL clients should be expected to know, just how much they need to be able to infer - what are the core parts of EARL that we want processors to recognize? Of course, the goal from my POV is to simply add inference to any application that processes EARL, but that isn't practical. I'll leave this as a semi-open question for now.
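To make that "partial understanding" step a little more concrete, here is a rough sketch of the sort of fallback rule an N3/CWM-style processor might apply. It ignores the earl:asserts context for brevity, and it is purely illustrative - an assumption about how a client might behave, not something that is part of the language:-

   # If a property declares its validity to be earl:Fail, treat any
   # statement made with it as roughly equivalent to earl:fails.
   { ?p earl:validity earl:Fail . ?page ?p ?checkpoint }
     => { ?page earl:fails ?checkpoint } .

In other words, anything stated with a property whose validity is earl:Fail gets treated, roughly, as if it had been stated with earl:fails.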
The other problem with stabilizing the language is that Web architecture is a little bit screwy. Fragment IDs on URI-refs only apply consistently when the URI-ref is being used in a retrieval action, and the content-type of the representation can be known. I had hoped that, with a little ironing of the specifications (and this is a matter of some contention), genericity could be added to fragID space, such that interoperability could be maintained across content negotiation and so on. XPointer seemed to run contrary to what, to me, are important principles, but then it is only following the current trend of specifications.

The EARL "testSubject" property is interesting, because it attempts to bulldoze over all of these problems, but it's more of a hack than anything. Really, it means "the thing that I take to be represented by", or sometimes, "a representation of" (using the word "representation" in two different senses there). Consider the following:-

   :MyPage earl:testSubject <http://example.org/> .
   :MyChunklet earl:testSubject <http://example.org/#blargh> .
   :MyTool earl:testSubject <urn:x-tools:SomeTool> .
   :WhatIsThis earl:testSubject <http://cam-seven-fish.ext/> .
   :XMLChunklet earl:testSubject <http://example.org/#xpointer([...])> .

Can anyone honestly tell me what is being identified as the test subject in each of those statements? I certainly can't. I can give you popular interpretations, but I can't say definitively, because it's a sea of opinions out there. Then, we have the further questions:-

* How do I identify only the attribute values in some representation?
* How do I point to things in non-XML languages?

Without a very, very clear view of Web architecture and "what is identified", you get a mess. This is a very important part of EARL, on the top end of the assertion. On the bottom end of the assertion (the TestCase) we have further worries, which I won't go into much, but it's basically the question of whether we *specify* or *point* to a TestCase (or both).

I don't really want to start using the word "context", but it seems as if I may have to... in the "context" of an EARL report, there are two considerations that need to be made: what is identified within the report, and what is identified on the Web. Considering the "testSubject" statements again, the subjects of those statements are things that are very RDF/EARL model specific - we don't care what they resolve to on the Web, we simply care what is said about them in EARL reports. For the objects of the triples, the opposite is true: we care most about what they actually mean, what is referenced by the object.

Aaron has argued for a long time that fragURIs are harmful to RDF: get them out of the specification. I did not agree with him; to some extent I still don't agree with him... but it does seem as if for EARL test subjects, we need to be very careful about what we are pointing at. Let me define the following categories of "things":-

WebContent - a representation of a resource. This is simply a series of bytes, perhaps with a content type and language type attached. The thing about WebContent is that you can point inside it and say "this bit", just as you can point inside a sentence and say "the fifth word". You can use an XPointer expression on it if you know that the MIME type is XML.

Tool - some program that evaluates, authors, fixes, displays. This is generally an abstract concept, but it is a special type of concept. It may have an online description, it may be identified by a URI already (or it may not), it may have a version, a code repository, and so forth.

Document - some kind of thing, generally with IPR rights, that can be evaluated. This may be an article. It is most certainly a resource, and may have a number of representations attached to it. An example "Document" is the W3C homepage - it can be rated as a work of art, or whatever. It's a generalization of a set of WebContent.

And then, of course, there are all other resources - bananas, the concept of love, dew on grass in springtime. Whatever. The above are things that we can most easily ground in the Web, but that often get confused for one another.

I recently proposed to get rid of "testSubject", for its vagueness, and add reprOf. For reprOf, the domain would be WebContent. This means that the object would have to be some kind of thing either slightly more abstract than Document (seems to be Al's POV) or exactly Document (seems to be TimBL's POV). It would rule out having XPointers as an object... which is fine: anything with an XPointer shoved onto the end of it clearly does not identify anything abstract - it identifies a chunk of XML. So, what if we want to talk about a chunk of XML? How do we use an XPointer? Well, now that I think about it, it kinda makes sense to use the XPointered URI-ref as the subject itself:-

   <http://example.org/#xpointer([...])> a earl:WebContent .

I did at first want to make sure that all "WebContent" instances had a "reprOf" property dangling from them... it still seems kinda necessary to me: you have to say that the above is a bit of XML content, otherwise it is meaningless. You also have to hack into the URI-ref itself to get the xpointer. I would much prefer to use the following:-

   [ a earl:WebContent;
     earl:reprOf <http://example.org/>;
     earl:mime "text/xml";
     earl:xpointer "xpointer([...])" ] .
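Just to show how such a node would slot into the rest of the model, here is a sketch - purely illustrative, reusing the property names proposed above - of a whole assertion whose test subject is a piece of WebContent rather than a bare URI-ref:-

   # An anonymous WebContent node standing in as the test subject.
   :Sean earl:asserts {
       [ a earl:WebContent;
         earl:reprOf <http://example.org/>;
         earl:mime "text/xml";
         earl:xpointer "xpointer([...])" ]
       earl:fails :MyCheckPoint } .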
It means that we kinda boycott XPointers as FragIDs, but I think of that more as a benefit for both EARL and Web architecture than anything else :-) So it is feasible to set a cardinality restriction on all instances of WebContent such that they must have one and only one earl:reprOf arc hanging from them.

Jim immediately asked what happens about identifying all of the other things that we want to talk about. Well, let's start with tools. It is possible to give enough information about a tool to disambiguate it from other tools. A "homepage" (for want of a better word), "version", and "author" combination is probably enough to swing it. In fact, "homepage" and "version/date" may constitute a UnambiguousPropertySet, or at least, we can specify it as being so. So, an example test subject which is a tool would be the following:-

   [ a earl:Tool;
     earl:homepage <http://mytool.org/>;
     earl:version "1.0" ] .

Now, we come to the more intriguing stuff - from the large (documents and abstract concepts) to the small (all attributes in document x) to a mixture (how to evaluate multiple things in EARL). For the large stuff, we can just come up with a class which is everything that Tool and WebContent aren't. We could further probe some of the stuff that TimBL and Al et al. have been talking about, but I don't think that it has all that much relevance to what we're doing; or rather, not as much as I thought it did. On the scale of the larger SW, it's one of the most interesting topics... but here, I'm just going to brush it off.

For the small stuff, and equivalence measures... this is interesting. I managed to scribble something in the telecon on Monday that created a good class for equivalence relations. Basically, they must be reflexive, transitive, and symmetric. It is then fairly simple to come up with an equivalence class for such a relationship: for each member a and each other member a' of the class, where the equivalence relation is p, p(a, a') always has to be true. There are some other things that you can conclude from that, but I need to brush up on it a little... for now, it is enough to work out how it can be deployed in EARL. Equivalence relations are something which are, to a great extent, implementor specific, but we can come up with a framework for expressing the relationships, and some simple examples. To create an equivalence relationship, we could set up a class in EARL so that people can just say:-

   :sameAttributeContentAs a earl:EquivalenceRelationship .

Then we could define a predicate - "earl:equivalenceClassOf" - with an obvious usage. Of course, on many occasions, it will probably just be easier to use the predicate a few times between test subjects, rather than defining a custom class. The main conclusion from this is that equivalence classes are rather insignificant insofar as EARL is concerned. They're simply too local to concern us too much. We'll put a basic framework into the language, and then let people use it. It's not something that we need to spend a great deal of time discussing.
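As a sketch of how that might look between test subjects - the relation name is just the example from above, and the page names are made up purely for illustration - someone might write:-

   # Declare the relation, then use it directly between test subjects.
   :sameAttributeContentAs a earl:EquivalenceRelationship .
   :PageBeforeRepair :sameAttributeContentAs :PageAfterRepair .
   :PageAfterRepair :sameAttributeContentAs :PageSecondRepair .

Since the relation is symmetric and transitive, a processor could also conclude that :PageBeforeRepair :sameAttributeContentAs :PageSecondRepair - but that level of inference really is the implementor's business.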
The "mixture" question is quite simple to answer: you make the statements one by one, or you provide some implementation-specific method for pointing to a whole range of test subjects, and then define some conversion to standard EARL. If you can't or don't define the conversion, then there is no way that we can say "this is standard EARL". I don't think that it makes good sense to start putting RegExp syntax into a basic evaluation language. It certainly runs contrary to the principle of least power, which itself is a derivative of KISS. Bags are pointless too. We have a method for evaluating things one at a time, and that's enough.

So there you have it; EARL 1.0 architecture in a nutshell. Now that we have that under our belt, I can go through a summary of the answers to some of the recent "issues" that have presented themselves (as summarized excellently in Wendy's upcoming talk, deriving from the recent ER discussions):-

[[[
Issue 1: Identifying State Changes
Issue 2: Combining and Querying Results
Issue 3: Threading
Issue 4: Test Subjects
]]]

Identifying State Changes: this is basically the equivalence measures thing. Since it varies from implementation to implementation, and is information that is only necessary to have within a certain range of data processing (the step after EARL - what happens to the EARL), we simply don't care about it too much. We'll provide a framework for interoperability, and perhaps collect some of the scenarios etc.

Combining and Querying Results: Go find a decent RDF processor, or build one yourself. I had hoped that there would be millions of the things by now, but I was wrong... there are few decent ones. I recommend CWM, of course, but it's not for everyone, it seems. It's an implementation question, at any rate.

Threading: Corollary to "Identifying State Changes". Doesn't concern me.

Test Subjects: We now have a stable model for people to implement, outlined above, which I shall incorporate into EARL 1.0, and which you can all provide feedback on, until such a time as the chair decides that the language is stable enough to release. I don't mind how many iterations it takes, but I'm confident in myself that the above is "it". Evolutionary measures are a different question.

Cheers,

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://purl.org/net/swn#> .
:Sean :homepage <http://purl.org/net/sbp/> .
Received on Wednesday, 12 December 2001 19:27:30 UTC