Scraping a TR doc into RDF (was: spec-prod, xmlspec, docbook and Co. (esp. metadata))

From: Dominique Hazael-Massieux <dom@w3.org>
Date: Fri, 21 Dec 2001 06:46:32 -0500
To: Dan Connolly <connolly@w3.org>
Cc: Norman Walsh <Norman.Walsh@Sun.COM>, danbri@w3.org, spec-prod@w3.org
Message-ID: <20011221064632.K11693@w3.org>
On Wed, Oct 17, 2001, Dan Connolly wrote:
> I scraped our TR index into RDF:
>   http://www.w3.org/2000/04/mem-news/tr2.rdf
> technical details on how it works are described, at least
> in a way that the machine understands, in...
>   http://www.w3.org/2000/04/mem-news/Makefile
> I haven't written it up for people ;-)
> The format, in brief, is:
>     <REC rdf:about="http://www.w3.org/TR/1999/REC-xpath-19991116">
>         <dc:date>1999-11-16</dc:date>
>         <dc:title>XML Path Language (XPath) Version 1.0</dc:title>
>         <doc:versionOf rdf:resource="http://www.w3.org/TR/xpath"/>
>         <editor rdf:parseType="Resource">
>             <contact:fullName>James Clark</contact:fullName>
>         </editor>
>         <editor rdf:parseType="Resource">
>             <contact:fullName>Steve DeRose</contact:fullName>
>         </editor>
>     </REC>
> Meanwhile...
> Dominique Hazaël-Massieux, our webmaster, has been working
> on a tool to check W3C publications w.r.t. our publication
> rules...
>   http://www.w3.org/2001/07/pubrules-form
>   http://www.w3.org/2001/07/pubrules-checker.xml
> We're working on enhancing it to produce metadata
> in RDF à la the above as a byproduct of checking.

As Dan mentioned two months ago, we have been working on scraping our
Technical Reports into RDF metadata as part of our publication process: the
XSLT stylesheet http://www.w3.org/2001/10/trdoc2rdf, applied to a document
compliant with our publication rules, outputs RDF. For instance, applied
to "RDF/XML Syntax Specification (Revised)"
(http://www.w3.org/TR/rdf-syntax-grammar/), you get:

<?xml version="1.0" encoding="utf-8"?>
<!--Produced by $Id: trdoc2rdf.xslt,v 1.29 2001/12/20 16:17:44 dom Exp
<rdf:RDF xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#"
oc#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
<dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
<doc:versionOf rdf:resource="http://www.w3.org/TR/rdf-syntax-grammar"/>
<editor rdf:parseType="Resource">
<contact:fullName>Dave Beckett</contact:fullName>
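
For anyone who wants to consume metadata in this shape, here is a minimal
sketch (not part of the stylesheet or the pubrules checker) that reads
records like the REC example quoted above, using Python's standard
xml.etree.ElementTree. The rdf, dc and contact namespace URIs are the
published ones; the "doc" namespace and the default namespace for REC and
editor are not shown in full in this message, so the example.org URIs below
are placeholders and purely an assumption. The function name read_records
and the SAMPLE document are likewise made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace map. The "doc" and "tr" URIs are placeholders (assumptions):
# the message does not give them in full.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "contact": "http://www.w3.org/2000/10/swap/pim/contact#",
    "doc": "http://example.org/doc#",  # assumption
    "tr": "http://example.org/tr#",    # assumption (default namespace)
}

# Illustrative input modelled on the quoted REC record.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#"
         xmlns:doc="http://example.org/doc#"
         xmlns="http://example.org/tr#">
  <REC rdf:about="http://www.w3.org/TR/1999/REC-xpath-19991116">
    <dc:date>1999-11-16</dc:date>
    <dc:title>XML Path Language (XPath) Version 1.0</dc:title>
    <doc:versionOf rdf:resource="http://www.w3.org/TR/xpath"/>
    <editor rdf:parseType="Resource">
      <contact:fullName>James Clark</contact:fullName>
    </editor>
    <editor rdf:parseType="Resource">
      <contact:fullName>Steve DeRose</contact:fullName>
    </editor>
  </REC>
</rdf:RDF>"""


def read_records(rdf_xml):
    """Return one dict per top-level record (REC, etc.) in the document."""
    root = ET.fromstring(rdf_xml)
    rdf = "{%s}" % NS["rdf"]
    records = []
    for node in root:
        version_of = node.find("doc:versionOf", NS)
        records.append({
            # Local element name, e.g. "REC", taken as the document status.
            "status": node.tag.split("}")[-1],
            "uri": node.get(rdf + "about"),
            "date": node.findtext("dc:date", namespaces=NS),
            "title": node.findtext("dc:title", namespaces=NS),
            "versionOf": (version_of.get(rdf + "resource")
                          if version_of is not None else None),
            "editors": [e.findtext("contact:fullName", namespaces=NS)
                        for e in node.findall("tr:editor", NS)],
        })
    return records


if __name__ == "__main__":
    for record in read_records(SAMPLE):
        print(record["status"], record["title"], record["editors"])
```

This treats the file as plain XML rather than as an RDF graph, which is
enough for the fixed layout shown here; a real RDF parser would be the safer
choice if the serialization can vary.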


Dominique Hazaël-Massieux - http://www.w3.org/People/Dom/
W3C's Webmaster
Received on Friday, 21 December 2001 06:46:35 UTC