- From: Yoav Weiss <yoav@yoav.ws>
- Date: Tue, 5 Nov 2013 17:06:37 +0100
- To: Robin Berjon <robin@w3.org>
- Cc: Marcos Caceres <w3c@marcosc.com>, public-webdevdata@w3.org
- Message-ID: <CACj=BEi6uj7+61zOZ-91zR9p0Le_KtKoVrpW4GqHFn_snnnATA@mail.gmail.com>
On Tue, Nov 5, 2013 at 4:59 PM, Robin Berjon <robin@w3.org> wrote:

> On 05/11/2013 16:25 , Marcos Caceres wrote:
>
>> I wonder if we should start hosting the dataset on the W3C’s HG
>> server. Trying to d/l the latest data set has been really slow for me
>> (~1h today, but it was going to take 9h to d/l yesterday - and it’s
>> only 700mb). Also, having the data sets on HG means we can keep a
>> nice version history.
>>
>
> Not speaking on behalf of the systeam or anything but...
>
> While W3C does have a nice infrastructure, I'm not sure that it's
> necessarily up to the task here. Also, please note that the HG server is
> often down.
>
> Also, I don't know if it's such a good idea to hold the snapshot zip in
> HG. I don't know how HG does its internal storage, but if it's anything
> like Git then *every* single zip snapshot will be kept. At 700MB a piece,
> that could increase pretty fast. (Plus all the unzipped content too.)
>

My personal experience is that HG is actually worse than Git with binaries.
So +1 to not storing 1GB binaries in source control.

> This strikes me as the sort of thing that could get some form of corporate
> sponsorship. You know, hosting on Google, Akamai, Amazon, or whatever.
>

I'm at Velocity next week. If you guys are cool with that, I can try to
talk to Steve to see if we could join forces with the HTTPArchive, at least
regarding the hosting aspects.
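[Editor's note: for a sense of scale on the storage-growth concern above, a rough back-of-the-envelope sketch. The ~700 MB figure comes from the thread; the snapshot counts are purely hypothetical and nothing here reflects how the dataset is actually published.]

```python
# Rough illustration only: if every crawl adds a full ~700 MB zip snapshot
# and the VCS keeps every revision of the binary (Git delta-compresses zips
# poorly, so each snapshot is stored essentially whole), the repository
# grows roughly linearly with the number of snapshots.

SNAPSHOT_MB = 700  # approximate size of one zipped data set (from the thread)

def repo_size_gb(num_snapshots: int, snapshot_mb: int = SNAPSHOT_MB) -> float:
    """Estimated repository size in GB when every snapshot is retained."""
    return num_snapshots * snapshot_mb / 1024

for n in (1, 6, 12, 24):  # hypothetical snapshot counts
    print(f"{n:>2} snapshots -> ~{repo_size_gb(n):.1f} GB")
```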
Received on Tuesday, 5 November 2013 16:07:06 UTC