- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Fri, 15 Jul 2016 03:59:21 -0400
- To: www-tag@w3.org
- Cc: François Daoust <fd@w3.org>
Dear TAG,

François (cc'd) and I have recently been working on a set of tools that crawl WebIDL data from the various Web Platform specifications that use it.

More specifically, these tools try to extract WebIDL fragments from the latest identified version of each spec (even when fed with URLs that do not point to the latest version), and at the same time extract information on the normative references that these specs declare. The WebIDL fragments are then parsed into a JSON AST built by webidl2.js.

This lets us build a complete map of WebIDL usage across the specifications of the OWP, which in turn has enabled us to run various analyzers:
* we can run diagnostics on specs to ensure their normative references are consistent with the various WebIDL "names" they import
* we can detect duplicate or missing definitions (for instance, the tool spotted the bug that the TAG had already separately reported on the double usage of "Credential" in WebAuthN & Credential API)
* we can easily detect specs with invalid WebIDL fragments

See for instance one report produced recently:
https://github.com/tidoust/reffy/wiki/Report-per-anomaly-(20160711)
(there are known false positives)

On top of that, we also built a more general explorer of WebIDL usage across specifications:
https://dontcallmedom.github.io/webidlpedia

This explorer lists all the defined WebIDL names (interfaces, dictionaries, typedefs, enums), with information on which specs define them and which specs make use of them.

An interesting way to look at these lists is the one sorted by "popularity" (i.e. highest level of usage by other specs):
https://dontcallmedom.github.io/webidlpedia/?full=popularity
It might be particularly interesting to explore in more depth the patterns that lead to some dictionaries and enums having zero usage.
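
To illustrate the kind of analysis such a map makes possible, here is a minimal sketch in Python of the duplicate/missing-definition checks and the "popularity" ranking described above. It is not the actual reffy or webidlpedia code: the `defines`/`uses` data shape, the spec names, most of the WebIDL names, and the function names are all assumptions made up for this example.

```python
from collections import defaultdict

# Made-up sample data (assumption): for each spec, the WebIDL names it
# defines and the names it references. The real crawl output is a JSON AST
# with a different, richer shape.
SAMPLE = {
    "spec-a": {"defines": ["Credential", "Widget"], "uses": ["EventTarget"]},
    "spec-b": {"defines": ["Credential"], "uses": ["Widget", "EventTarget", "Gadget"]},
    "spec-c": {"defines": ["EventTarget", "UnusedOptions"], "uses": ["Widget"]},
}


def index_definitions(crawl):
    """Map each WebIDL name to the sorted list of specs that define it."""
    definers = defaultdict(set)
    for spec, data in crawl.items():
        for name in data["defines"]:
            definers[name].add(spec)
    return {name: sorted(specs) for name, specs in definers.items()}


def find_duplicates(crawl):
    """Names defined by more than one spec."""
    return {n: s for n, s in index_definitions(crawl).items() if len(s) > 1}


def find_missing(crawl):
    """Names that some spec uses but that no crawled spec defines."""
    defined = index_definitions(crawl)
    missing = defaultdict(set)
    for spec, data in crawl.items():
        for name in data["uses"]:
            if name not in defined:
                missing[name].add(spec)
    return {name: sorted(specs) for name, specs in missing.items()}


def popularity(crawl):
    """Defined names ranked by how many *other* specs use them."""
    defined = index_definitions(crawl)
    users = defaultdict(set)
    for spec, data in crawl.items():
        for name in data["uses"]:
            users[name].add(spec)
    # Do not count a spec that defines a name as one of its users.
    counts = [(n, len(users[n] - set(specs))) for n, specs in defined.items()]
    # Highest usage first; ties broken alphabetically for stable output.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print("duplicates:", find_duplicates(SAMPLE))
    print("missing:", find_missing(SAMPLE))
    print("popularity:", popularity(SAMPLE))
```

Names that end up with a count of zero at the bottom of the ranking are the candidates for the "zero usage" exploration mentioned above. A real check for missing definitions would also need to ignore types built into WebIDL itself (e.g. DOMString), which this sketch does not model.
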

A similar view shows the list of strings that are used as enum values across specifications:
https://dontcallmedom.github.io/webidlpedia/?enums=popularity
That view could hopefully become useful in bringing more consistency to these names across specifications.

There are obviously many other ways the collected data could be exploited, for instance by exploring which specs make use of extended attributes that have particular platform relevance (e.g. [SecureContext]). Likewise, there is probably quite a bit more that can be extracted and analyzed from the list of normative references that the tool collects.

The tools are available at:
https://github.com/tidoust/reffy
https://github.com/dontcallmedom/webidlpedia

François and I will likely keep working on these tools, time permitting; we also welcome pull requests on the repos. Should they be of interest to the TAG in its operations, we would also be happy to discuss how they can be improved in that direction.

Thanks,

Dom & François
Received on Friday, 15 July 2016 08:00:11 UTC