- From: Robin Berjon <robin@w3.org>
- Date: Tue, 29 Oct 2013 17:35:43 +0100
- To: Simon Sapin <simon.sapin@exyr.org>
- CC: spec-prod@w3.org
On 29/10/2013 16:39, Simon Sapin wrote:
> On 29/10/2013 15:22, Robin Berjon wrote:
>> (There are also resource issues to consider, a spider going through
>> all the history of a long and complex draft would likely use up
>> non-negligible resources.)
>
> I don't think a spider is needed.

It's not; I meant what happens if someone starts spidering the history.

> It could be server-side software that serves files directly from the
> repository based on a commit hash in the URL, which AFAIK is not very
> resource-intensive.

My knowledge of git internals is a bit rusty, but at the very least I
believe that you need to:

- grab and parse the commit object
- grab (and parse, etc.) the root tree that it points to
- depending on the resource you're serving, possibly walk several
  subtrees in sequence until you reach the file's SHA
- fetch the blob for that SHA and return it
- do the same again for every subresource loaded by the page

If your implementation language has a good, low-level git library, it's
probably not the end of the world. But if you have to shell out at every
step, you're going to have a bad time. (There's a rough sketch of that
per-request walk at the end of this message.)

Another alternative is that whenever a given SHA is requested, you create
a clone with its working area set to point to that commit (also sketched
below). That would be much less processing-intensive once the first
request is handled, but it could use up a lot of disk space.

In any case, I'm not saying that it's impossible, just that I want to be
cautious about this. It's not required for v0; we'll look at it once we
already have something usable.

--
Robin Berjon - http://berjon.com/ - @robinberjon
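For concreteness, here is a minimal sketch of the per-request walk described in the list above, written in Python and shelling out to git plumbing at every step (which is exactly the cost being warned about). `REPO_PATH`, the helper names, and the paths are all hypothetical; this is an illustration, not actual server code:

```python
import subprocess

REPO_PATH = "/var/repos/spec.git"  # hypothetical bare repository


def git(*args):
    # One subprocess per call: this is the "shell out at every step" cost.
    return subprocess.check_output(("git",) + args, cwd=REPO_PATH)


def blob_at(commit_sha, path):
    # Step 1: parse the commit object to find its root tree.
    for line in git("cat-file", "-p", commit_sha).decode().splitlines():
        if line.startswith("tree "):
            sha = line.split()[1]
            break
    # Step 2: walk one tree object per path component.
    for component in path.split("/"):
        for entry in git("cat-file", "-p", sha).decode().splitlines():
            meta, name = entry.split("\t", 1)  # "<mode> <type> <sha>\t<name>"
            if name == component:
                sha = meta.split()[2]
                break
        else:
            raise KeyError(path)
    # Step 3: one more process to fetch the blob itself.
    return git("cat-file", "blob", sha)
```

Note that git can do this walk internally in a single process, e.g. `git cat-file blob "<commit>:<path>"`, which is still one process per resource served but avoids the per-step spawning above.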
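And a minimal sketch of the clone-per-SHA alternative, under the same assumptions (`CHECKOUTS` and `checkout_for` are hypothetical): the first request for a commit materialises a detached checkout, and later requests serve plain files from it. This shows the trade-off described above: cheap after the first hit, but one full working tree on disk per requested SHA.

```python
import os
import subprocess

REPO_PATH = "/var/repos/spec.git"    # hypothetical bare repository
CHECKOUTS = "/var/cache/checkouts"   # hypothetical scratch area


def checkout_for(commit_sha):
    """Return a directory holding the working tree for commit_sha."""
    dest = os.path.join(CHECKOUTS, commit_sha)
    if not os.path.isdir(dest):
        # A local clone shares object storage via hardlinks, so the real
        # disk cost is the checked-out working tree itself.
        subprocess.check_call(["git", "clone", "--quiet", REPO_PATH, dest])
        # Detached checkout of the requested commit.
        subprocess.check_call(["git", "checkout", "--quiet", commit_sha],
                              cwd=dest)
    return dest
```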
Received on Tuesday, 29 October 2013 16:35:50 UTC