W3C home > Mailing lists > Public > spec-prod@w3.org > October to December 2001

Scraping a TR doc into RDF (was: spec-prod, xmlspec, docbook and Co. (esp. metadata))

From: Dominique Hazael-Massieux <dom@w3.org>
Date: Fri, 21 Dec 2001 06:46:32 -0500
To: Dan Connolly <connolly@w3.org>
Cc: Norman Walsh <Norman.Walsh@Sun.COM>, danbri@w3.org, spec-prod@w3.org
Message-ID: <20011221064632.K11693@w3.org>
On Wed, Oct 17, 2001, Dan Connolly wrote:
> I scraped our TR index into RDF:
> 
>   http://www.w3.org/2000/04/mem-news/tr2.rdf
> technical details on how it works are described, at least
> in a way that the machine understands, in...
>   http://www.w3.org/2000/04/mem-news/Makefile
> I haven't written it up for people ;-)
> 
> The format, in brief, is:
> 
>     <REC rdf:about="http://www.w3.org/TR/1999/REC-xpath-19991116">
>         <dc:date>1999-11-16</dc:date>
>         <dc:title>XML Path Language (XPath) Version 1.0</dc:title>
>         <doc:versionOf rdf:resource="http://www.w3.org/TR/xpath"/>
>         <editor rdf:parseType="Resource">
>             <contact:fullName>James Clark</contact:fullName>
>         </editor>
>         <editor rdf:parseType="Resource">
>             <contact:fullName>Steve DeRose</contact:fullName>
>         </editor>
>     </REC>
>[...]
> 
> Meanwhile...
> 
> Dominique HazaŽl-Massieux, our webmaster, has been working
> on a tool to chec W3C publications w.r.t. our publication
> rules...
> 
>   http://www.w3.org/2001/07/pubrules-form
>   http://www.w3.org/2001/07/pubrules-checker.xml
> 
> We're working on enhancing it to produce metadata
> in RDF ala the above as a byproduct of checking.

As Dan told 2 monthes ago, we have been working on scraping our Technical 
Reports into RDF metadata as part of our publication process: the
XSLT stylesheet http://www.w3.org/2001/10/trdoc2rdf used on a document
compliant with our publication rules outputs rdf. For instance, applied
on "RDF/XML Syntax Specification (Revised)"
(http://www.w3.org/TR/rdf-syntax-grammar/), you get:

<?xml version="1.0" encoding="utf-8"?>
<!--Produced by $Id: trdoc2rdf.xslt,v 1.29 2001/12/20 16:17:44 dom Exp
$-->
<rdf:RDF xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#"
xmlns:dc="h
ttp://purl.org/dc/elements/1.1/"
xmlns:doc="http://www.w3.org/2000/10/swap/pim/d
oc#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.w
3.org/2001/02pd/rec54">
<WD
rdf:about="http://www.w3.org/TR/2001/WD-rdf-syntax-grammar-20011218">
<dc:date>2001-12-18</dc:date>
<dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
<doc:versionOf rdf:resource="http://www.w3.org/TR/rdf-syntax-grammar"/>
<editor rdf:parseType="Resource">
<contact:fullName>Dave Beckett</contact:fullName>
</editor>
</WD>
</rdf:RDF>

See
http://www.w3.org/2000/06/webdata/xslt?xmlfile=http%3A%2F%2Fwww.w3.org%2FTR%2Frdf-syntax-grammar%2F&xslfile=http%3A%2F%2Fwww.w3.org%2F2001%2F10%2Ftrdoc2rdf

Dom
-- 
Dominique HazaŽl-Massieux - http://www.w3.org/People/Dom/
W3C's Webmaster
mailto:dom@w3.org
Received on Friday, 21 December 2001 06:46:35 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Saturday, 10 March 2012 06:19:11 GMT