- From: Shane McCarron <shane@aptest.com>
- Date: Thu, 2 Oct 2014 06:49:50 -0500
- To: Tobie Langel <tobie.langel@gmail.com>
- Cc: Robin Berjon <robin@w3.org>, "spec-prod@w3.org Prod" <spec-prod@w3.org>
- Message-ID: <CAOk_reG6UooUbzSPu50Uz8h9dwKp8jM=VTuXtX-6Z9fDD2Tdyw@mail.gmail.com>
On Thu, Oct 2, 2014 at 6:41 AM, Tobie Langel <tobie.langel@gmail.com> wrote:

> On Thu, Oct 2, 2014 at 12:10 PM, Robin Berjon <robin@w3.org> wrote:
>
>> On 02/10/2014 10:10 , Tobie Langel wrote:
>>
>>> My plan for this solution is to do daily crawling of relevant specs and
>>> extract the dfn and put them in a DB. Further refinements could include
>>> a search API, like I added for Specref and exposed within Respec.
>>
>> Could you somehow reuse or modify what Shepherd does here? If it includes
>> enough information (or additional extraction can be easily added) and new
>> specs can be added to its crawling (which I suspect ought to be relatively
>> easy - I recall Peter's code being able to process quite a lot of different
>> documents) then we can all align, which I reckon is a win (even without
>> counting the saved cycles).
>
> I've bumped into way too many painful issues with non browser-based HTML
> parsers to waste more time with them. I'm also very interested in gathering
> data from editor's draft which requires a JS runtime for those which use
> ReSpec.

Exactly. The real value of this is during development - especially of a family of specs such as the (possibly) upcoming HTML5 modules!

>> Shepherd exposes an API that allows you to just simply dump the data it
>> has. If you look inside update.py in Bikeshed you can see how it works.
>> What Bikeshed does is, instead of querying services live, allow the user to
>> regularly call bikeshed update and get a fresh DB (of a bunch of stuff).
>> The same could be injected into SpecRef.
>
> That sounds like a worthwhile idea to explore but seems somewhat
> orthogonal to this project, no?

It hadn't occurred to me to conflate this feature with SpecRef. I mean - it's a service, so I guess it can do anything. Having the exposed references from many specs available to other, unrelated specs is interesting and ultimately useful. But I agree that it is orthogonal to the goal of making it easier for *related* specs to connect together - particularly during development.

>>> My focus will be on the gathering the data and providing a JSON API. Not
>>> on actual implementation within ReSpec (which I won't have cycles for at
>>> that time, I'm afraid).
>>
>> The hard part is getting the data. Hooking it into ReSpec oughtn't be
>> difficult, unless I'm missing something.
>
> Good. (I haven't thought about this at all, so I'll take your word for
> it).

Yeah, I looked at the code for how we talk to SpecRef and it seems pretty straightforward to do a similar integration into the place where we are creating the list of cross references we need to look up.

As an aside, I note that the SpecRef lookup (in ReSpec biblio.js) uses https GET. I would change that to POST so that if there is a huge query we don't overflow URL length limits. I will create an issue about it.
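
To illustrate the URL length concern, here is a minimal sketch in Python rather than the actual biblio.js code. The endpoint, the `refs` parameter name, the JSON body shape, and the 2000-character threshold are all assumptions made for illustration. It is a variation on the idea: it keeps GET for short queries and switches to POST only when the URL would get too long, which only works if the service accepts POST for lookups.

```python
import json
from urllib.parse import urlencode

# Hypothetical values: neither the endpoint nor the limit comes from this thread.
LOOKUP_URL = "https://example.org/bibrefs"
MAX_URL_LENGTH = 2000  # conservative guess; real limits vary by browser and server


def build_lookup_request(refs):
    """Return (method, url, body) for looking up the given reference IDs."""
    # Build the GET form first so its real, encoded length can be measured.
    url = LOOKUP_URL + "?" + urlencode({"refs": ",".join(refs)})
    if len(url) <= MAX_URL_LENGTH:
        return "GET", url, None
    # Too long for a query string: send the IDs in the request body instead.
    body = json.dumps({"refs": list(refs)}).encode("utf-8")
    return "POST", LOOKUP_URL, body


if __name__ == "__main__":
    # A short query stays a GET.
    print(build_lookup_request(["HTML5", "DOM4"])[:2])
    # A very large query is turned into a POST with a JSON body.
    many = ["REF-%04d" % i for i in range(500)]
    method, url, body = build_lookup_request(many)
    print(method, url, len(body))
```

Always using POST, as suggested above, is simpler; keeping GET for small queries only matters if cacheability of the common case is worth the extra branch.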
Received on Thursday, 2 October 2014 11:50:18 UTC