Unmentioning The Unmentionables

In tracking back through the ERT list archives, I found a thread
running parallel to some of the EARL developments that has to do with
resources, and identification thereof, and I'd like to readdress some
of the points.

[[[
   > What level should description point to?  Characters? Tokens?
   > Tags?  ("tags" is html and xml specific... what about other
   > parsable languages?)

   The units in which the user perceives the interaction, and the
   units in which the software interface manages the bits and
   references in the DOM or similar API.
]]] - http://lists.w3.org/Archives/Public/w3c-wai-er-ig/2000Nov/0020

The problem we have with identifying resources is the discrimination
between the resource and the identifier, and the identifier for the
identifier, and so on. We can determine that if something is being
used in a procedural sense (e.g. in a bit of programming code), then
it is real - it exists, and is being used - but it may still be
impossible to identify. Also, some mathematical sets are impossible to
represent by any sort of flat identifier, unless you bend people's
definitions of infinity :-)

To provide a bit of context here, I have been discussing the idea of
internal representations of identifiers in a "SEM" [1], and some of
the results have been quite amusing. I found that when you identify
resources internally, it doesn't particularly matter how they are
stored or processed, as long as the associated semantics remain
attached, or are re-attachable. In processing some of these discrete
identifiers, internal representations may be created that can
inadvertently turn nasty. In the RDF model, RDF itself is fairly
two-dimensional compared to SEM-type languages such as N3 and
Semenglish, because RDF in its current state makes some very odd
closed assumptions. So it is possible to go (for example) from
triples => pentuples, but occasionally it will be impossible to go
from pentuples => triples.
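
As a rough illustration of the triples => pentuples asymmetry, here is
a minimal Python sketch. The two extra slots ("context" and "source")
are hypothetical names chosen for the example; the SEM discussion does
not say what the extra dimensions hold. The projection shown is a naive
one and is only meant to show where information would be dropped.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]
# Hypothetical 5-slot statement: subject, predicate, object, context, source.
Pentuple = Tuple[str, str, str, Optional[str], Optional[str]]


def lift(triple: Triple) -> Pentuple:
    """Embed a triple in the larger model; the extra slots stay empty."""
    s, p, o = triple
    return (s, p, o, None, None)


def lower(pentuple: Pentuple) -> Optional[Triple]:
    """Project back to a triple, refusing when that would drop information."""
    s, p, o, context, source = pentuple
    if context is not None or source is not None:
        return None  # the extra slots carry meaning a bare triple cannot hold
    return (s, p, o)


plain = (":Sean", ":hasHomepage", "http://purl.org/net/sbp/")
scoped = (":Sean", ":hasHomepage", "http://purl.org/net/sbp/", ":ctx1", None)

round_trip = lower(lift(plain))   # always recovers the original triple
lossy = lower(scoped)             # None: no faithful triple exists here
```

Lifting always succeeds, while lowering only succeeds for statements
that never used the extra slots.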

This bears some relation to EARL, in that currently we have the
ability to express it in a higher dimensional order than currently
available in RDF. For example, we make use of context brackets, which
are difficult (although not impossible) to serialize in XML RDF.

Anyhow, I'm side-tracking myself a little here. EARL relies upon the
fact that anything can be identified by a URI. In fact, we know that
this isn't technically true (depending on one's definition of
"technically"), because of the internal/sets problems, and the fact
that new "anonymous" resources can be created just by talking about
them:-

   [ :name "Sean" ] .

which translates into English as: something that has the name "Sean".
It is possible that we may want to do evaluations of things that we
only know the attached semantics for - e.g. a Web page that doesn't
exist anymore.

   [ :hadTheURI <http://mypage.com/>; :forDate "2001-03-17" ] .
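
To make the idea of an anonymous resource a little more concrete, here
is a small Python sketch of how a processor might handle the two
descriptions above. The function names and the "_:bN" labels are
assumptions made for illustration, not part of EARL or of any N3 tool.
The point is only that the node label is a local convenience, and the
resource is found again through its attached properties.

```python
import itertools

_labels = itertools.count(1)


def describe(properties):
    """Return triples about a fresh anonymous node with these properties."""
    node = f"_:b{next(_labels)}"  # local label only, not a global identifier
    return [(node, predicate, value) for predicate, value in properties.items()]


def find_subjects(triples, predicate, value):
    """Find nodes by what is said about them rather than by name."""
    return {s for (s, p, o) in triples if p == predicate and o == value}


person = describe({":name": "Sean"})
page = describe({":hadTheURI": "http://mypage.com/", ":forDate": "2001-03-17"})

store = person + page
old_pages = find_subjects(store, ":forDate", "2001-03-17")
```

An evaluation could then be attached to whatever node comes back from
the lookup, even though that node was never given a URI of its own.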

Diversion: note that at the top of the RDFS class tree, we have
"Resource" - EARL does not particularly discriminate at this point
what resources we can evaluate, and that does give us a little extra
power, although I think that in most cases we will want a strictly
unambiguous identifier for a resource, rather than a detailed
discussion about the semantics attached to a certain Web page or tool;
the semantics will be implied.

One of the axioms that I'm getting at here is that, as far as EARL is
concerned, "any resource that we want to identify will be
identifiable". In other words, EARL is a fairly closed world system -
contextualization and implied semantics rule the roost here, and I
don't think that situation will change until there is a Semantic Web,
and perhaps not even when there is an SW. This doesn't mean that there
will be an "as long as my machine can understand it, it will be O.K."
situation with EARL, but simply that if you want to export EARL
semantics, you will have to explicitly define all of the internal
implicitly defined semantics. That's where the "SEM" conversational
thread comes in - it *is* possible to do such a thing, but you must
expect to lose something along the way. A good analogy is audio
compression: you can always lose 90% of the data, and still have
something that is useful. This, I believe, will also be the case with
EARL. [I wonder if this ties in at all with some of the first-party
assertion screen-scraping stuff that DanC has been doing?]. I also
think that this may be the case with the Semantic Web in general - a
lot of data will be left unsaid (just another facet of "[ :subjectOf
<> ] .").

One interesting thing that came up in the development process just
before EARL 0.9 was released is that "evaluations" themselves are
totally discrete and time independent. That's because all evaluations
are being performed upon a snapshot of a certain resource, and so (as
Aaron told me), you don't need to say that an evaluation is true for a
particular date. An EARL evaluation is always carried out upon a new
resource, one which is equivalent to another resource for some
particular snapshot of time. This is what is meant by:-

     [ a earl:TestSubject ] .

Something which has an rdf:type of earl:TestSubject is *never* a
first-hand resource. This also has an additional benefit (that Aaron
pointed out), which is that the evaluation remains true for all time -
it is not time-variant. I did in fact argue at this point that the
semantics associated with each particular evaluation may indeed change
through time (in the same way that natural languages change slowly but
steadily through time), and so the evaluation may need date-stamping
after all.
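
The snapshot argument can be sketched in Python as follows. This is
only an illustration under my own assumptions: the class names, the
field names, and the use of a SHA-256 digest to stand for "the resource
as it was at that moment" are all hypothetical, and EARL 0.9 itself
does not prescribe any of them.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Stand-in for an earl:TestSubject: a frozen copy, never the live resource."""
    source_uri: str
    content_digest: str


@dataclass(frozen=True)
class Evaluation:
    subject: Snapshot
    test: str
    passed: bool


def take_snapshot(uri: str, content: bytes) -> Snapshot:
    """Derive a new resource that matches the original at one instant."""
    return Snapshot(uri, hashlib.sha256(content).hexdigest())


before = take_snapshot("http://mypage.com/", b"<img src='a.png'>")
result = Evaluation(before, "images-have-alt-text", passed=False)

# The live page changes later, which yields a different snapshot...
after = take_snapshot("http://mypage.com/", b"<img src='a.png' alt='A'>")

# ...but the earlier evaluation still describes `before`, so it needs no date.
still_valid = result.subject == before and result.subject != after
```

Note that this captures only Aaron's half of the argument: nothing in
the sketch protects against the meaning of the test name itself
drifting over time.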

Anyway, I think that's enough for now. Coming soon: EARL as a test of
independent invention :-)

[1] http://groups.yahoo.com/group/sem-dev/

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://webns.net/roughterms/> .
:Sean :hasHomepage <http://purl.org/net/sbp/> .

Received on Wednesday, 25 April 2001 21:24:49 UTC