Re: Building a test corpus for ER tools

 "Nick Kew" <nick@webthing.com>

> If I were to hypothesise a corpus-building project, what support might
> it expect:
>
>  * Set up a new EARL server, based on the Annotea server.  Ensure that
>    it properly tracks changes in pages, as this is more important to
>    us than to Annotea.

Tracking changes isn't an Annotea issue certainly, but we need some way of
looking at changes, I've been trying to find time to look at this from my
own link maintenance needs but haven't got anywhere really yet. (link
monitoring in usenet FAQs after I got caught out with pages changing
subjects entirely without changing urls) This is an important issue I
think, and one that whilst we've often touched on in meetings haven't
fully researched, although again real world experience is likely needed
which a corpus would also give (perhaps it would be a good idea to cache
pages we evaluate somewhere.

>  * Adapt existing Annotea Clients to work with it.

My two agents will do this as soon as is practical, and I'll possibly have
graphical querying of RDF in my RDF Editor which will also be given an
Annotea interface (it'd be the same annotea interface as in Snufkin and
FillyJonk so no skin of my nose.)  I'll also look at either a server side,
or client-size in Mozilla solution so as not to be windows IE specific.

>  * Use this infrastructure to collect human evaluations, either "blind"
>    (just evaluate a page) or in the context of existing evaluations.

One thing before this is an agreement on a single namespaces to use, to
make the evaluation tools simpler, I'd propose either rapidly agreeing a
1.0 namespace or formalise the 1.0-test.

Jim.

Received on Wednesday, 6 March 2002 18:55:33 UTC