TR automation status

A report on where TR automation stood as of 2002/11/26
(this mail is a public copy of mid:<1038321272.12406.733.camel@stratustier> -> http://lists.w3.org/Archives/Member/w3c-semweb-ad/2002Nov/0000.html )

* Just before the publication moratorium, Henri (W3C Webmaster) decided
to switch to the TR automation framework to update the TR page.
Basically, he had to publish more than 11 documents in one day, and
automation was the best way to avoid making errors.
To this end, the following items have been completed/added:
- the RDF extractor XSLT [1] now outputs warnings when information
extracted from a TR doc is inconsistent with existing knowledge
(missing trailing slash, for instance), or is missing from a source
(author not yet identified), etc.
- a small shell script [2] calls this XSLT on the URI passed as a
parameter (the doc being published) and adds the extracted data to the
TR publication log [3]
- then a makefile [4] builds various views of the TR state, using CWM
and XSLT. The available views are:
 + RDF [5]
 + HTML classic [6] (which holds the state of what's used right now in /TR/)
 + HTML by title, author and date [7]
(the view by author relies on a list of known TR editors [8] maintained
by the Webmaster thanks to the warnings given by [1]).
 + statistics on TR publications since the start of the TR log [11],
used by the Comm Team to compute the number of TRs for the AC meeting
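
As an illustration of the kind of checks behind the warnings mentioned above, here is a minimal sketch in Python. It is not the actual logic of the XSLT [1]: the function name, the record layout and the sample data are all made up, and only the two cases named in this mail (trailing slash mismatch, author not yet identified) are covered.

```python
# Hypothetical sketch: none of these names come from trdoc2rdf [1].

def check_extracted(extracted, known_uris, known_editors):
    """Return a list of warning strings for one extracted TR record."""
    warnings = []

    uri = extracted.get("uri", "")
    if uri not in known_uris:
        # Try the same URI with the trailing slash toggled.
        alternate = uri[:-1] if uri.endswith("/") else uri + "/"
        if alternate in known_uris:
            warnings.append(
                f"{uri} differs from known {alternate} only by a trailing slash"
            )

    for editor in extracted.get("editors", []):
        if editor not in known_editors:
            warnings.append(f"editor not yet identified: {editor}")

    return warnings


# Made-up sample data, for illustration only.
known_uris = {"http://www.w3.org/TR/example/"}
known_editors = {"Jane Doe"}
record = {
    "uri": "http://www.w3.org/TR/example",
    "editors": ["Jane Doe", "John Smith"],
}

for warning in check_extracted(record, known_uris, known_editors):
    print("WARNING:", warning)
```

In this sketch an unknown editor only triggers a warning; adding the name to the known editors list stays a manual step, as it is for the Webmaster with [8].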

In short, most of the first big phase in TR automation is done.

Next:
- Henri wants to start working on a Webmaster Calendar project, similar
to the one I describe in my TR papertrail machine [9]. I have advised
him to look for help in the SWeb Team
- the deliverables of the TR automation need more visibility; now
that the publication process is set, I will try to discuss with the
Comm Team what can be done with the various views, and probably
get a news item on this
- this publication process needs to be documented in more detail for
Henri's successors
- I want to re-use this new source of data to automate most of the QA
Matrix [10] maintenance... Shouldn't be too hard
- of course, the best would be to have time to work on more of the
papertrail machine [9]... well, someday :)

There are a few issues that still need to be fixed:
- there is no distinction between a 2nd Edition and a revision of a Rec
- some data are detected but not collected (feedback mailing list,
patent disclosure page, which could lead to WG detection at the same
time)

Dom

1. http://www.w3.org/2001/10/trdoc2rdf
2. http://www.w3.org/2002/01/tr-automation/updateNewTr.sh
3. http://www.w3.org/2002/01/tr-automation/new-tr.rdf
4. http://www.w3.org/2002/01/tr-automation/Makefile
5. http://www.w3.org/2002/01/tr-automation/tr.rdf
6. http://www.w3.org/2002/01/tr-automation/tr-pg.html
7. http://www.w3.org/2002/01/tr-automation/tr-title
http://www.w3.org/2002/01/tr-automation/tr-author
http://www.w3.org/2002/01/tr-automation/tr-date
8. http://www.w3.org/2002/01/tr-automation/known-tr-editors.n3
9. http://www.w3.org/2002/01/tr-automation/TR-papertail#calendar
10. http://www.w3.org/QA/Matrix
11. http://www.w3.org/2002/01/tr-automation/tr-stats applied for
instance in
http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fwww.w3.org%2F2002%2F01%2Ftr-automation%2Ftr-stats.xsl&xmlfile=http%3A%2F%2Fwww.w3.org%2F2002%2F01%2Ftr-automation%2Fnew-tr.rdf&startdate=2002-05-01
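
The long link in [11] is just the online XSLT service called with three query parameters. A minimal sketch in Python of how such a link can be put together for another start date; the function name stats_url is made up, while the service URL and the parameter names (xslfile, xmlfile, startdate) are taken from the link above:

```python
from urllib.parse import urlencode

# URLs and parameter names copied from the example link in [11].
XSLT_SERVICE = "http://www.w3.org/2000/06/webdata/xslt"
BASE = "http://www.w3.org/2002/01/tr-automation/"


def stats_url(start_date):
    """Build the XSLT service link for the TR statistics view.

    start_date is a YYYY-MM-DD string (hypothetical helper, not part
    of the TR automation framework).
    """
    params = {
        "xslfile": BASE + "tr-stats.xsl",  # the stylesheet to apply
        "xmlfile": BASE + "new-tr.rdf",    # the TR publication log [3]
        "startdate": start_date,
    }
    # urlencode percent-encodes ":" and "/" in the nested URLs.
    return XSLT_SERVICE + "?" + urlencode(params)


print(stats_url("2002-05-01"))
```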
-- 
Dominique Hazaël-Massieux - http://www.w3.org/People/Dom/
W3C/INRIA
mailto:dom@w3.org

Received on Tuesday, 25 March 2003 07:05:35 UTC