- From: Jeffrey Yasskin <jyasskin@google.com>
- Date: Tue, 11 May 2021 21:52:27 -0700
- To: Dominique Hazael-Massieux <dom@w3.org>
- Cc: Marcos Caceres <marcosc@w3.org>, spec-prod <spec-prod@w3.org>
- Message-ID: <CANh-dX=sp1kqHiGBwT=iPSg7B2eWbfJbkh6axVYT=LygM-pSdQ@mail.gmail.com>
Thanks! This list looks useful, and it has led to
https://github.com/w3c/media-source/pull/271 so far.

I can scan https://chromestatus.com/ for specs that aren't in
browser-specs. An initial run finds ~300 possible URLs, but a lot of
those are abandoned, not specifications, or stage 4 (i.e. merged)
JavaScript features. Some of the remaining ones are IETF documents that
we probably should figure out how to include.

Jeffrey

On Thu, May 6, 2021 at 9:51 AM Dominique Hazael-Massieux <dom@w3.org> wrote:

> On 05/05/2021 at 07:19, Dominique Hazael-Massieux wrote:
> > Reffy is made to run on the list of specs maintained in browser-specs
> > [5] - if your crawl needs to run on a different list, some further
> > customization might be needed (happy to help with them).
>
> I've ended up hacking my way through this [1] (very much a
> work-in-progress), which has made it possible to extract editors and
> their affiliations from 313 specs, with a few misses (whose affiliations
> appear as "undetermined" in the attached data - available both as JSON
> and CSV).
>
> This is still very much an ad-hoc process, and more importantly, it
> extracts data "only" from 313 specs in browser-specs [2], which means
> both that there are a few browser-specs specs from which it couldn't
> extract the information, and more importantly, that it isn't looking at
> the many known W3C specs that aren't in browser-specs, and even less so
> at the many other specs (e.g. from CGs) that aren't in browser-specs.
>
> It would be relatively easy to add the known W3C specs that aren't in
> browser-specs; much harder to get data from other CG specs since I
> don't think we have a good mechanism to track their existence at this
> point (although the data collected by the CG monitor [3] might be a
> starting point).
>
> I'll wait to see if this data is useful and used before looking into
> making the whole thing more robust.
>
> Dom
>
>
> 1. https://github.com/w3c/reffy/blob/spec-crawler/src/cli/extract-editors.js
>    with the extracted data post-processed with
>    https://gist.github.com/dontcallmedom/290986d35a8991a163f805e1692ff53a
> 2. https://github.com/w3c/browser-specs
> 3. https://w3c.github.io/cg-monitor/
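
The comparison described at the top of this message (finding spec URLs listed on chromestatus.com that are missing from browser-specs) could be sketched roughly as below. This is only an illustration under stated assumptions: both lists are taken to be already available as plain lists of URL strings, and the function names `normalize` and `missing_specs` and the normalization rules are hypothetical, not the script actually used for the initial run.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a spec URL to a comparable key (assumed rule, for illustration).

    Drops the scheme, query, and fragment, lowercases the host, and removes
    any trailing slash from the path.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    return f"{host}{path}"


def missing_specs(candidate_urls, known_urls):
    """Return candidate URLs whose normalized form is not among known_urls.

    Candidates that normalize to the same key are reported once, keeping
    the first spelling encountered and the original order.
    """
    known = {normalize(u) for u in known_urls}
    seen = set()
    missing = []
    for url in candidate_urls:
        key = normalize(url)
        if key not in known and key not in seen:
            seen.add(key)
            missing.append(url)
    return missing
```

The result would still need manual triage, since a URL that is absent from browser-specs may be abandoned, not a specification, or an already-merged JavaScript feature.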
Received on Wednesday, 12 May 2021 04:53:52 UTC