- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Thu, 6 May 2021 18:51:28 +0200
- To: Jeffrey Yasskin <jyasskin@google.com>, Marcos Caceres <marcosc@w3.org>
- Cc: spec-prod <spec-prod@w3.org>
- Message-ID: <c8187eac-edc0-efd4-4486-8c36b609adef@w3.org>
On 05/05/2021 at 07:19, Dominique Hazael-Massieux wrote:
> Reffy is made to run on the list of specs maintained in browser-specs
> [5] - if your crawl needs to run on a different list, some further
> customization might be needed (happy to help with them).

I've ended up hacking my way through this [1] (very much a work in progress), which has made it possible to extract editors and their affiliations from 313 specs, with a few misses (whose affiliation appears as "undetermined" in the attached data, available both as JSON and CSV).

This is still very much an ad hoc process and, more importantly, it extracts data "only" from 313 specs in browser-specs [2]. That means both that there are a few browser-specs specs from which it couldn't extract the information and, more significantly, that it isn't looking at the many known W3C specs that aren't in browser-specs, and even less so at the many other specs (e.g. from CGs) that aren't in browser-specs.

It would be relatively easy to add the known W3C specs that aren't in browser-specs; it would be much harder to get data from other CG specs, since I don't think we have a good mechanism to track their existence at this point (although the data collected by the CG monitor [3] might be a starting point).

I'll wait to see if this data is useful and used before looking into making the whole thing more robust.

Dom

1. https://github.com/w3c/reffy/blob/spec-crawler/src/cli/extract-editors.js with the extracted data post-processed with https://gist.github.com/dontcallmedom/290986d35a8991a163f805e1692ff53a
2. https://github.com/w3c/browser-specs
3. https://w3c.github.io/cg-monitor/
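For anyone wanting a quick look at the attached CSV, a per-affiliation tally could be sketched roughly as below. This is only an illustration: the column name `affiliation` and the helper `count_affiliations` are assumptions, not something confirmed by the attachment or by the Reffy code, so adjust them to the actual header row.

```python
import csv
import io
from collections import Counter


def count_affiliations(csv_text, column="affiliation"):
    """Count rows per affiliation in CSV text that starts with a header row.

    The column name is an assumption; pass the real one if it differs.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # Missing or empty values are grouped under "undetermined",
        # the label the message uses for editors it could not resolve.
        value = (row.get(column) or "").strip() or "undetermined"
        counts[value] += 1
    return counts
```

It might be used as `count_affiliations(open("editors-affiliations.csv", encoding="utf-8").read()).most_common(10)` to list the ten most frequent affiliations, assuming the file is saved locally under that name.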
Attachments
- text/csv attachment: editors-affiliations.csv
- application/json attachment: editors-affiliations.json
Received on Thursday, 6 May 2021 16:52:02 UTC