> On Oct 5, 2023, at 9:34 AM, Rory Hewitt <rory.hewitt@gmail.com> wrote:
> 
> 2. I certainly didn't mean to overstep by suggesting myself as a designated expert - I'll ensure this section is appropriately vague. But I want to clarify that the HTTP Archive folks already have all the info we need in terms of a representative web sample that we can query. The last thing I would want is for this proposal to be mired in "we need people to do this complex, time-consuming bit of work, so let's not do it at all". The data is there and is freely available and is regularly updated.

Umm, no, the HTTP Archive is not a representative sample, nor does it reflect all fields. It has a lot of extremely useful features, but being the source for HTTP sampling is not one of them. Not even close, unless you only want to sample one general-purpose browser making generic requests on the front door of only the most popular public websites.

....Roy