Re: Data on successive spec draft publication?

CC-ing the AB as an FYI.

> Someone suggested that we look at data on *specs* (not just working groups) as we look for efforts not 
> destined for success.  I was hoping to get data that -- for each spec -- would let us track publication of working drafts, 
> LC working drafts, CRs, PRs, Recommendations, and edited Recs.  This would let us get an idea of how long it takes -- 
> minimum, average, maximum -- to get all the way through the process, and perhaps identify red flags such as "If a spec 
> doesn't get to CR in x years, it is unlikely to ever do so" and generally look for patterns in the distribution of times it 
> takes specs to move through the process.
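
A rough sketch of the statistics asked for above, assuming per-publication rows of (spec, date, type) are already available. The rows, the spec names, and the helper days_to_stage are all made up for illustration; nothing here comes from the real /TR data.

```python
from datetime import date
from statistics import mean

# Hypothetical rows: (spec, ISO publication date, publication type).
ROWS = [
    ("Spec A", "2005-01-10", "WD"),
    ("Spec A", "2006-03-01", "CR"),
    ("Spec A", "2007-01-10", "REC"),
    ("Spec B", "2008-05-01", "WD"),
    ("Spec B", "2011-05-01", "REC"),
    ("Spec C", "2009-02-01", "WD"),  # never reached REC
]

def days_to_stage(rows, stage="REC"):
    """Days from each spec's first publication to its first `stage` publication."""
    first_pub = {}
    reached = {}
    for spec, iso, kind in rows:
        d = date.fromisoformat(iso)
        if spec not in first_pub or d < first_pub[spec]:
            first_pub[spec] = d
        if kind == stage and (spec not in reached or d < reached[spec]):
            reached[spec] = d
    # Specs that never reached `stage` are left out of the result.
    return {spec: (reached[spec] - first_pub[spec]).days for spec in reached}

durations = days_to_stage(ROWS)
values = list(durations.values())
# Specs with no REC on record: candidates for the "red flag" analysis.
stalled = sorted({spec for spec, _, _ in ROWS} - set(durations))

print("min/avg/max days to REC:", min(values), mean(values), max(values))
print("not yet at REC:", stalled)
```

Passing stage="CR" instead would give the time-to-CR distribution that a rule like "no CR within x years" would be judged against.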

Thanks to Ian's suggestion, the W3C webmasters have pointed us to the following URLs containing data on all spec draft publications since 1995.

http://www.w3.org/2002/01/tr-automation/new-tr.rdf has everything since September 2011
The RDF files below contain data for publications from Nov 17, 1995 to Sept 21, 2011.

http://www.w3.org/2002/01/tr-automation/tr-pub-19951117-20020326.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20020327-20030519.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20030520-20040114.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20040114-20040601.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20040602-20050103.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20050104-20050729.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20050730-20060303.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20060304-20060822.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20060823-20070518.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20070519-20071231.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20080101-20090327.rdf
http://www.w3.org/2002/01/tr-automation/tr-pub-20090328-20110921.rdf

I'm working on extracting the basic spec name, date, and publication type (WD, CR, Rec, etc.) into a spreadsheet format.  There's lots more information in there, but what is recorded seems to have evolved over time and with spec maturity. I hope we can collaborate to investigate what information might be available to distinguish the specs that matured from those that did not.
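
For anyone who would rather script this, here is a minimal Python sketch of that kind of extraction. It is not the actual procedure being used. It assumes each publication appears as one element named for its type (WD, CR, ...) with dc:title and dc:date children, which should be checked against the real files. The sample document, its namespace choice for the type elements, and the example.org URIs are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample in the ASSUMED shape: one element per publication,
# with the element name giving the publication type.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns="http://www.w3.org/2001/02pd/rec54#">
  <CR rdf:about="http://example.org/TR/2010/CR-example-20100615/">
    <dc:title>Example Spec</dc:title>
    <dc:date>2010-06-15</dc:date>
  </CR>
  <WD rdf:about="http://example.org/TR/2009/WD-example-20090401/">
    <dc:title>Example Spec</dc:title>
    <dc:date>2009-04-01</dc:date>
  </WD>
</rdf:RDF>"""

def extract_rows(rdf_xml):
    """Return (title, date, type, uri) tuples sorted by title, then date."""
    root = ET.fromstring(rdf_xml)
    rows = []
    for node in root:
        # Tags look like "{namespace}WD"; keep only the local name.
        pub_type = node.tag.split("}", 1)[-1]
        title = node.findtext("{%s}title" % DC)
        pub_date = node.findtext("{%s}date" % DC)
        if title is None or pub_date is None:
            continue  # skip entries that lack the fields we need
        rows.append((title, pub_date, pub_type, node.get("{%s}about" % RDF)))
    return sorted(rows, key=lambda r: (r[0], r[1]))

def to_csv(rows):
    """Serialize rows as CSV text that a spreadsheet can import."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["title", "date", "type", "uri"])
    writer.writerows(rows)
    return buf.getvalue()

rows = extract_rows(SAMPLE)
print(to_csv(rows))
```

Running the same function over each downloaded file and concatenating the rows would give one table covering the whole period, provided the assumed element layout holds across all of them.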

FWIW, as a complete newbie to the RDF world, I've found the W3C RDF Validator [1] a useful way of visualizing these RDF-XML files as triples, and the Twinkle UI [2] for the SPARQL engine the most useful way to extract tables from the triples.  

[1] http://www.w3.org/RDF/Validator/
[2] http://www.ldodds.com/projects/twinkle/


________________________________________
From: Ian Jacobs <ij@w3.org>
Sent: Tuesday, January 20, 2015 6:52 AM
To: Michael Champion (MS OPEN TECH)
Cc: public-success-fail@w3.org
Subject: Re: Data on successive spec draft publication?

> On Jan 19, 2015, at 11:51 PM, Michael Champion (MS OPEN TECH) <Michael.Champion@microsoft.com> wrote:
>
> Someone suggested that we look at data on *specs* (not just working groups) as we look for efforts not destined for success.  I was hoping to get data that -- for each spec -- would let us track publication of working drafts, LC working drafts, CRs, PRs, Recommendations, and edited Recs.  This would let us get an idea of how long it takes -- minimum, average, maximum -- to get all the way through the process, and perhaps identify red flags such as "If a spec doesn't get to CR in x years, it is unlikely to ever do so" and generally look for patterns in the distribution of times it takes specs to move through the process.
>
> Unless I'm missing something, that's not easily available from the /TR page:  There's just one entry for each spec indicating the maximum level of maturity, not one entry for each time something was published to /TR.  Does anyone know if there is a way to get this more fine-grained data out of /TR, the WG database, etc.?  Is there a single mailing list (or public log file of some sort) that gets an entry every time something is published?

We have all the data for all TR publications in RDF. Please send a request to the Webmaster webreq@w3.org for the URIs to all the RDF files.

Ian


>
> CC-ing Ian since he seems to know the ins and outs of the W3C data collection and publication system.
>
> Thanks for any suggestions, or pointers to someone who could help.
>
> Mike Champion

--
Ian Jacobs <ij@w3.org>      http://www.w3.org/People/Jacobs
Tel:                       +1 718 260 9447



Received on Tuesday, 27 January 2015 15:01:02 UTC