Re: Shepherd

On Aug 19, 2015, at 3:31 AM, Anne van Kesteren <annevk@annevk.nl> wrote:

> It's my understanding Shepherd is a tool used for generating
> cross-specification cross-references. I also learned it's not public.
> Neither the database nor the tool itself. This makes it hard to
> propose changes and evaluate where certain things go wrong when
> writing specifications.
> 
> Would it be possible to open this up?
> 
> https://github.com/whatwg/xref is what I used to use, but it's mostly
> driven by Anolis which is in decline.

Hi Anne,

First, I don't know who told you Shepherd isn't public; it's been open source since day one. The source is available at:
http://hg.csswg.org/dev/shepherd/

Second, Shepherd isn't the cross-spec cross-reference tool. It's a management system for the CSSWG test suite repository. In addition to providing an issue tracker, it validates test sources, has a comprehensive test search system (based on the test metadata), and manages the GitHub <-> Mercurial synchronization of the repo.

Shepherd does have a specification parser and spec DB that it uses to associate test links with specs, but that has long been factored out into a separate module that's used by several systems on the csswg.org server, including the test harness, the Bikeshed online service, and the draft server.

Bikeshed uses the Shepherd API to fetch the specification DB when it generates the cross-spec cross-references.

The source for the specification module is online here:
http://hg.csswg.org/dev/specification/

In particular the specification parser has been factored out to not have any DB dependencies and is here:
http://hg.csswg.org/dev/specification/file/tip/python/specification/specificationparser.py

And all the rest of the tools and modules from csswg.org are here:
http://hg.csswg.org/dev/

The specification DB is fully available via an HTTP/JSON API at any of the following URLs:
https://api.csswg.org/shepherd/spec/
https://api.csswg.org/bikeshed/api/spec/
https://drafts.csswg.org/api/spec/
https://test.csswg.org/harness/api/spec/

(Calling it without any arguments just lists the specs available; by passing args you can get detailed information on all anchors in the specs.)
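
As a rough illustration of the no-argument call, here is a minimal Python sketch using only the standard library. The function names and the explicit Accept header are my own assumptions, not part of the API, and the code deliberately assumes nothing about the shape of the returned JSON or about the argument names (see the API documentation for those).

```python
import json
import urllib.request

# One of the spec API URLs listed above; the others should be interchangeable.
SPEC_API_URL = "https://api.csswg.org/shepherd/spec/"


def build_spec_request(url=SPEC_API_URL):
    """Build a GET request for the spec list (no arguments passed).

    The Accept header is an assumption: it asks for JSON explicitly
    rather than relying on the server's default representation.
    """
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_spec_list(url=SPEC_API_URL, timeout=30):
    """Perform the request and return the decoded JSON, whatever its shape."""
    with urllib.request.urlopen(build_spec_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_spec_list()` performs a network request and returns the parsed response; inspect it before relying on any particular structure.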

Some (auto-generated though sparse) documentation on the API is available by just pointing a browser at:
https://api.csswg.org/shepherd/
It at least explains the available arguments and returned data formats.

(That URL uses content negotiation, so requesting JSON will give you a JSON home page for the API surface; see:
http://tools.ietf.org/html/draft-nottingham-json-home-03 )
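
To make the content negotiation concrete, here is a small Python sketch that requests the JSON representation of the API home page and lists the resources it advertises. It assumes the document follows the layout in the JSON home draft (a top-level "resources" object whose entries carry either "href" or "href-template"); the function names are hypothetical, and the exact media type the server expects in Accept is an assumption on my part.

```python
import json
import urllib.request

API_HOME_URL = "https://api.csswg.org/shepherd/"


def build_home_request(url=API_HOME_URL):
    """Build a request that asks for JSON instead of the HTML documentation."""
    # Assumption: "application/json" is enough to trigger the JSON home page.
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def list_resources(home_document):
    """Map each link relation in a JSON home document to its URL or URL template."""
    resources = home_document.get("resources", {})
    return {
        relation: resource.get("href") or resource.get("href-template")
        for relation, resource in resources.items()
    }


def fetch_resources(url=API_HOME_URL, timeout=30):
    """Fetch the JSON home page (network request) and list its resources."""
    with urllib.request.urlopen(build_home_request(url), timeout=timeout) as response:
        return list_resources(json.loads(response.read().decode("utf-8")))
```

Looking resources up by relation rather than hard-coding URLs is what lets a client survive server-side URL changes.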

Note that the DB is updated every day at midnight (Pacific) and every time there's a push to the CSSWG, FXTF, or Houdini draft repositories. The DB currently contains information on all the CSSWG specs as well as a handful of specs that the CSSWG specs refer to, such as HTML5, SVG, DOM, WebIDL, etc. Adding more specs is trivial; just let me know if there are any others you need.

There's also a Python API client library that simplifies calling the above APIs. It works via the JSON home page, which allows server-side API changes without breaking clients. It's available at:
http://hg.csswg.org/dev/apiclient/
or
https://github.com/plinss/apiclient

Bikeshed uses this API client when fetching the specification DB.


FWIW, there's also an API there which reads W3C's /TR RDF file and produces JSON output at:
https://api.csswg.org/shepherd/tr/
(and the other API endpoints from above)

Received on Wednesday, 19 August 2015 19:13:07 UTC