EARL 0.95 Architecture

[This email addresses some issues for 0.9 and resolves how these may
be fixed for the next version, 0.95. It also addresses some central
EARL issues that are not addressed anywhere else, or have not been
raised w.r.t. EARL as it is now.]

In its current state, EARL is an RDF-based framework for making
evaluations and assertions about many types of resources. Wading
through the ERT IG list archives, and especially those of early Feb
2001, I find that there are some recurring themes that come up time
and time again. EARL 0.9 is supposed to be an attempt at "getting EARL
out there", and although we have done that, I think we need to move on
to 0.95 to incorporate some of these points that aren't fully
addressed by EARL 0.9.

What we have currently (in EARL 0.9) is a two part statement - an
evaluation, and a set of assertions, thus:-

   { [ a :Assertor ] :asserts [ a :Assertion ] } a :Evaluation .
   { [ a :Resource ] [ a :TestCase ] [ a :Result ] } a :Assertion .

   :AnAssertor :asserts { :AResource :testCaseTo :AResult } .

Here the parts in curly brackets "{}" are higher-order
statements of any kind, e.g. reified statements, contexts, etc.,
where the outer context is an evaluation, and the inner context
an assertion. The structure of the Assertion at the moment runs
contrary to Charles' and Dan's RDF Conformance Language [1], so in
EARL 0.95, the basic statement could be:-

   { [ a :Assertor ] :asserts [ a :Assertion ] } a :Evaluation .
   { [ a :Resource ] [ a :ResultProperty ] [ a :TestCase ] }
     a :Assertion .

   :AnAssertor :asserts { :AResource :result :TestCase } .

In other words, instead of having something like "my page evaluated to
checkpoint 1 fails", you have "my page fails checkpoint 1", which
flows slightly better for many people (and machines?!).
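
A minimal sketch of the reordering, in Python with plain tuples
standing in for triples. The names to_095 and RESULT_TO_PROPERTY, and
the :Pass/:Fail result values, are illustrative assumptions and not
terms defined by either EARL version:

```python
# Hypothetical mapping from 0.9 result values to 0.95 result properties.
RESULT_TO_PROPERTY = {":Pass": ":passes", ":Fail": ":fails"}


def to_095(assertion_09):
    """Reorder a 0.9-style (resource, test case, result) assertion
    into a 0.95-style (resource, result property, test case) one."""
    resource, test_case, result = assertion_09
    return (resource, RESULT_TO_PROPERTY[result], test_case)


old = (":MyPage", ":Checkpoint1", ":Fail")
new = to_095(old)
print(new)
```

The resource stays the subject; only the test case and the result
swap roles, so the statement reads "my page fails checkpoint 1".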

Note the use of "A Resource" here. In EARL, we specialize resources
into "groups" (of type "rdfs:Class", viz. in RDF parlance
"classes") - non unique and unique, and Web content, Tool, or other
(undetermined). We call the most abstract group of resource being
evaluated by EARL a "TestSubject". A TestSubject can be unique or non
unique, viz. it has a single date or it has many dates (or a range?).
We call these resources "UniqueTestSubject" and "NonUniqueTestSubject"
respectively.

This is eminently practical - sometimes resources change, and
sometimes they don't. Sometimes resources are abstract, sometimes they
are very explicit pieces of code. EARL cannot discriminate - it
includes them all. However, it must be noted that all TestSubjects
must be date stamped. This is because it's useful to be able to create
a new resource for a given date and make assertions about it -
whatever problems a TestSubject is found to have, it will have
forever, because it is date stamped.
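
One way to picture the date stamping is the following sketch. The
Python class names mirror the EARL class names above, but the field
names, the snapshot helper, and the example URI are assumptions made
for illustration (in particular, a non-unique subject is modelled
loosely with a single date field):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestSubject:
    uri: str
    date: str  # required: every TestSubject is date stamped


class UniqueTestSubject(TestSubject):
    """A snapshot of a resource at a single date."""


class NonUniqueTestSubject(TestSubject):
    """A resource spanning many dates (modelled loosely here)."""


def snapshot(uri, date):
    """Create a new, immutable resource for a given date."""
    return UniqueTestSubject(uri=uri, date=date)


page_jan = snapshot("http://example.org/page", "2001-01-15")
page_may = snapshot("http://example.org/page", "2001-05-13")
print(page_jan != page_may)
```

Because each snapshot is a distinct, frozen value, an assertion made
about the January page is not disturbed when the page later changes.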

A note on generating IDs for EARL output: in EARL, you must often give
labels (IDs) to each part of the evaluation and assertions, so that they
can be reused in the future. These IDs *must* be unique, and not
conflict with anything else. They should not have secondary uses. All
EARL IDs are simply URIs within arbitrary namespaces, as created by
the EARL evaluation processor. In other words, they are GenIDs
(generated IDs). RDF syntax may in future address methods of
specifying bases for GenIDs, but no guarantee is made.
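
A processor could mint such IDs along these lines. This is only a
sketch: the gen_id function and the example.org namespace are
assumptions, and random UUIDs are just one way of making collisions
very unlikely:

```python
import uuid


def gen_id(namespace="http://example.org/earl/genid/"):
    """Return a fresh URI in the given (hypothetical) namespace."""
    return namespace + uuid.uuid4().hex


a, b = gen_id(), gen_id()
print(a)
print(b)
```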

If EARL had a last name, it might be Kasday-McCathieNevile-
Dardailler-Swartz-Brickley-Palmer-Loughborough-Chisholm-
Gilman-Etc., but its middle name should almost certainly be
"Practicality". With EARL, we are trying to create a framework that
is:-

   1) Easy to use
   2) Easy to process
   3) Easy to extend

Point 1) "usability" is something that will only be decided from
implementation to implementation, and by the quality of the
documentation produced. RDF controls point 2), "processability (and
repurposability)". Point 3), "extensibility", is the tricky one. There
is no way that such a generic language can be the root of the ontology
tree - people will always be wanting to introduce new terms for their
software and so forth. This can be controlled by clever thinking on
our part.

At the moment, the EARL vocabulary is mainly (90%) as set out in
Daniel's "EARL Properties" note [2]. This is a good core foundation,
but has been little discussed, and little implemented. However, I have
found already in implementing it that EARL is most interesting in the
way it handles things - there is a vocabulary term for almost
anything, and anything else can be extended from it. The only property
that I have used so far that cannot be extended from EARL is the
"earl:excludes" property. This is to be used when you are pointing to
a particular "earl:Suite", but you want to exclude certain
checkpoints. For example, when you say that "my page conforms to all
WCAG AA points except for one...". Note also that although EARL
currently doesn't have an "earl:level" property, one could be added to
0.95.
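
The intended effect of "earl:excludes" can be sketched as a simple
filter. The function name and the checkpoint labels below are
placeholders, not real WCAG checkpoint identifiers:

```python
def effective_checkpoints(suite, excludes):
    """Checkpoints of a suite minus those explicitly excluded."""
    excluded = set(excludes)
    return [cp for cp in suite if cp not in excluded]


suite = ["cp-a", "cp-b", "cp-c"]  # placeholder checkpoint labels
claimed = effective_checkpoints(suite, excludes=["cp-b"])
print(claimed)
```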

Another novel introduction that could be brought into 0.95 thanks to
the new model for "earl:Assertion" is a set of standard
ResultProperty(s). For example, many people will just want to say
that, on no particular date, "x" passes "y" with high confidence. 0.95
should provide for such standard properties, e.g. by providing
"earl:passes" and "earl:fails".

A note about dating (not that kind of dating...): one should be very
careful about where one puts dates. Note again that the current model
for 0.95 is:-

   { [ a :Assertor ] :asserts [ a :Assertion ] } a :Evaluation .
   { [ a :TestSubject ] [ a :ResultProperty ] [ a :TestCase ] }
     a :Assertion .

   :AnAssertor :asserts { :AResource :result :TestCase } .

Thus far, a date can be hung off of any node except for "earl:asserts"
(er... and the two reified statements/contexts). Let's take a look at
what each one means:-

[ a :Assertor; dc:date "2001" ]
This means that the properties about this certain assertor are true
for a particular date. For example, if you are saying that "Daniel
uses Linux, date 2001", then you are saying that these properties are
true for Daniel on this particular date. In a way, you are creating a
new resource out of the person Daniel, and the equipment he used on a
certain date. A dc:date property is not required here.

[ a :TestSubject; dc:date "2001" ]
This means that the particular test subject here is a representation
of some tool, Web content, or otherwise, on that particular date. This
has been discussed a little before, so the only thing to rub in again
is that a dc:date property is *required* in this case.

[ :- :result; a :ResultProperty; dc:date "2001" ]
This simply says that this result is true for this particular date. It
is like saying "this is when we made the assertion". You may well
think that because the TestSubject is time-invariant (in most cases),
there is no need to state this, but be aware that tools used to make
the assertion may change over time, and hence date information here
could be useful for administration and debugging etc. dc:date is
optional here.

[ a :TestCase; dc:date "2001" ]
All this says is that a test case has a particular date - a guideline
was created on this day for example. dc:date is, once again, just an
option here.
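
The four cases above boil down to one required date and three
optional ones. A processor might check this roughly as follows; the
dictionary layout and the missing_dates function are assumptions made
for this sketch:

```python
# Whether dc:date is required on each kind of node, as described above.
DATE_RULES = {
    "Assertor": "optional",
    "TestSubject": "required",
    "ResultProperty": "optional",
    "TestCase": "optional",
}


def missing_dates(nodes):
    """Return the types of nodes that need a dc:date but lack one."""
    return [
        node["type"]
        for node in nodes
        if DATE_RULES.get(node["type"]) == "required"
        and "dc:date" not in node
    ]


nodes = [
    {"type": "Assertor"},
    {"type": "TestSubject"},
    {"type": "TestCase", "dc:date": "2001"},
]
print(missing_dates(nodes))
```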

A note about the use of dc:date, and the datatype as an object:
dc:date is a very broad term with a wide range of applications.
However, in EARL, we probably want to be more specific about exactly
what the object of the property can be when it is used - in other
words, we don't want people giving dates in strange formats that
machines can't understand. We need to look very carefully at how this
can be applied, particularly w.r.t. RDF, and then we might be able to
enforce this in the schema. We need to look into the question, "should
we set a standard for EARL dates, or should we leave this to be
decided on a processor to processor basis?". If the latter, don't
forget that this will make interoperability very difficult without
some kind of conversion scheme.
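
To make the "standard for EARL dates" option concrete, here is a
sketch that assumes, purely for illustration, that full YYYY-MM-DD
dates were chosen. Nothing above decides this (the examples in this
email use just "2001"), and is_machine_date is a hypothetical name:

```python
import re
from datetime import datetime

# Assumed format for this sketch only: four-digit year, two-digit
# month and day, separated by hyphens.
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_machine_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not _DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(is_machine_date("2001-05-13"))
print(is_machine_date("13 May 2001"))
```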

Apart from that, the only quibble about using other namespaces is that
it requires a processor to recognize more primitive terms than
otherwise, and it boosts the amount of data in EARL files (using
current QName methods). Stuff like rdfs:comment and rdfs:seeAlso could
quite easily be used in EARL, but wouldn't have any point unless the
EARL implementations support them. One thing about declaring
equivalences in
our own namespace, and then using those terms is that if the terms you
are using change, you simply change the schema and not the millions of
implementations! This is something that should seriously be considered
by the group - my personal opinion is that all EARL terms should be
declared in the EARL namespace, and equivalences and other ontologies
declared from there. Those reasons again:-

1) Shorter data. Fewer namespaces need to be declared as QNames, and so
the files are shorter.
2) Maintenance. We don't control terms in other namespaces, but we do
in our own. If other people change their terms, we have no control
over that, and we lose implementation stability.
3) Maintenance again. As new terms come along, we can simply declare
them in the schema too.
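
The maintenance argument can be sketched as a lookup table kept in
the schema. The EARL-side terms "earl:date" and "earl:comment" are
hypothetical, as is the resolve function:

```python
# Hypothetical equivalences declared once, in the EARL schema.
SCHEMA_EQUIVALENCES = {
    "earl:date": "dc:date",
    "earl:comment": "rdfs:comment",
}


def resolve(term, schema=SCHEMA_EQUIVALENCES):
    """Map an EARL term to its declared equivalent, if there is one."""
    return schema.get(term, term)


print(resolve("earl:date"))
print(resolve("earl:asserts"))
```

If an external term changes, only the table is edited; data that uses
the EARL-side term stays as it is.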

The next issue on the list is the representation of non-XML languages
that have a BNF. The only way to do this (that I can think of) is the
traditional EARL way of creating a new date-stamped resource,
declaring that it's an XML representation of the BNF of the original
resource, and then using an XPointer (somehow!) on that. This raises the
question - can you use XPointers on conceptual resources? Does that
qualify as one of the greatest hacks ever conducted on a Web
development list?

A spin-off point from this concerns the RDF issues. RDF "as is" is
pretty unstable - it has a whacking great list of issues that are
being addressed by the W3C RDF Core WG, and it means that at any time,
the RDF model and/or syntax could change. The only place that this
really affects EARL is in the reified statements, or (as we are often
using Notation3), contexts/N3 statements. Because higher-order
assertions are so useful, we can predict that there will always be a
mechanism of some sort for representing them in RDF. As such, we can
simply recommend
that people "make do with what they have", i.e. just follow their
particular software, and if not, whatever the W3C (and/or the SW
community) recommends at the time.

EARL is rigid because it's so basic, and the vocabulary is so neat.
What I'd like to have is some more prose and discussion about the
current vocabulary that we use, so that we can provide explicit
documentation in the schema to stop people from mis-interpreting it,
and so that we can provide more updates to it.

The next step from me (after addressing all objections and revising
appropriately), is to create a new schema for EARL 0.95. I already
have a hacked up version waiting to go (after approval).

[1] http://www.w3.org/1999/11/conforms/
[2] http://lists.w3.org/Archives/Public/w3c-wai-er-ig/2001Mar/0015

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://webns.net/roughterms/> .
:Sean :hasHomepage <http://purl.org/net/sbp/> .

Received on Sunday, 13 May 2001 22:31:30 UTC