Re: Thinking about cross references and ReSpec from Shane McCarron on 2014-10-02 (spec-prod@w3.org from October to December 2014)

From: Shane McCarron <shane@aptest.com>
Date: Thu, 2 Oct 2014 06:49:50 -0500
To: Tobie Langel <tobie.langel@gmail.com>
Cc: Robin Berjon <robin@w3.org>, "spec-prod@w3.org Prod" <spec-prod@w3.org>
Message-ID: <CAOk_reG6UooUbzSPu50Uz8h9dwKp8jM=VTuXtX-6Z9fDD2Tdyw@mail.gmail.com>

On Thu, Oct 2, 2014 at 6:41 AM, Tobie Langel <tobie.langel@gmail.com> wrote:

> On Thu, Oct 2, 2014 at 12:10 PM, Robin Berjon <robin@w3.org> wrote:
>
>> On 02/10/2014 10:10 , Tobie Langel wrote:
>>
>>> My plan for this solution is to do daily crawling of relevant specs and
>>> extract the dfn and put them in a DB. Further refinements could include
>>> a search API, like I added for Specref and exposed within Respec.
>>>
>>
>> Could you somehow reuse or modify what Shepherd does here? If it includes
>> enough information (or additional extraction can be easily added) and new
>> specs can be added to its crawling (which I suspect ought to be relatively
>> easy — I recall Peter's code being able to process quite a lot of different
>> documents) then we can all align, which I reckon is a win (even without
>> counting the saved cycles).
>>
>
> I've bumped into way too many painful issues with non-browser-based HTML
> parsers to waste more time with them. I'm also very interested in gathering
> data from editor's drafts, which requires a JS runtime for those that use
> ReSpec.
>

Exactly.  The real value of this is during development - especially of a
family of specs such as the (possibly) upcoming HTML5 modules!

>
>> Shepherd exposes an API that allows you to just simply dump the data it
>> has. If you look inside update.py in Bikeshed you can see how it works.
>> What Bikeshed does is, instead of querying services live, allow the user to
>> regularly call bikeshed update and get a fresh DB (of a bunch of stuff).
>> The same could be injected into SpecRef.
>
>
> That sounds like a worthwhile idea to explore but seems somewhat
> orthogonal to this project, no?
>

It hadn't occurred to me to conflate this feature with SpecRef.  I mean -
it's a service, so I guess it can do anything.  Having the exposed
references from many specs available to other, unrelated specs is
interesting and ultimately useful.  But I agree that it is orthogonal to
the goal of making it easier for *related* specs to connect together -
particularly during development.

>>> My focus will be on gathering the data and providing a JSON API, not
>>> on the actual implementation within ReSpec (which I won't have cycles for
>>> at that time, I'm afraid).
>>>
>>
>> The hard part is getting the data. Hooking it into ReSpec oughtn't be
>> difficult, unless I'm missing something.
>
>
> Good. (I haven't thought about this at all, so I'll take your word for
> it).
>

Yeah, I looked at the code for how we talk to SpecRef and it seems pretty
straightforward to do a similar integration into the place where we are
creating the list of cross references we need to look up.

As an aside, I note that the SpecRef lookup (in ReSpec's biblio.js) uses an
HTTPS GET.  I would change that to POST so that a huge query doesn't
overflow URL length limits.  I will create an issue about it.
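
To make the URL-length concern concrete, here is a rough sketch of how a lookup could choose between GET and POST. It is not the actual biblio.js code: the function name, the `refs` parameter name, the JSON body shape, and the 2000-character cap are all assumptions made for illustration, and the endpoint in the usage note is a placeholder.

```typescript
// Hypothetical sketch; none of these names come from ReSpec or SpecRef.
interface LookupRequest {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Assumed conservative cap. Real limits vary by browser, server, and proxy.
const MAX_URL_LENGTH = 2000;

// Describe the request to send for a list of reference keys.
// Small queries stay on GET; oversized ones move the keys into a POST body.
function buildLookupRequest(
  endpoint: string,
  refs: string[],
  maxUrlLength: number = MAX_URL_LENGTH
): LookupRequest {
  const query = "refs=" + refs.map((r) => encodeURIComponent(r)).join(",");
  const url = endpoint + "?" + query;
  if (url.length <= maxUrlLength) {
    return { method: "GET", url, headers: {} };
  }
  return {
    method: "POST",
    url: endpoint,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ refs }),
  };
}
```

For example, `buildLookupRequest("https://example.org/lookup", ["HTML5", "DOM"])` stays on GET, while a list of several hundred keys would come back as a POST with the keys in the body. Always using POST, as suggested above, would simply drop the length check; the fallback form is shown only because it keeps small queries as plain GETs, and it assumes the service accepts both methods.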

Received on Thursday, 2 October 2014 11:50:18 UTC