- From: Erik van der Poel <erikv@google.com>
- Date: Thu, 8 Apr 2010 07:41:12 -0700
- To: Dan Brickley <danbri@danbri.org>
- Cc: uri@w3.org
The typical length of a URL as found in HTML on the Web is around 64
bytes. I don't know what the average and median are because I bucketed
the stats in powers of 2 (i.e. ..., up to 32, up to 64, 128, etc.). The
peak for these buckets was 64.

There is a sharp drop at 2048. This makes sense because MSIE's limit
in HTTP requests is 2048. Firefox and Chrome do not appear to have
limits. (I gave up trying when I reached 32k.)

MSIE's limit in URLs in HTML is 4096 characters. This is not the same
as 4096 bytes. MSIE uses UTF-16 internally. I used IDNA to find this
limit.

Good luck with your barcode/audio efforts,

Erik

On Thu, Apr 8, 2010 at 3:06 AM, Dan Brickley <danbri@danbri.org> wrote:
> Hi folks
>
> Some topics seem peculiarly ill-suited for Web searches - hence this
> mail. I am looking for data on typical lengths of URIs, in particular
> as they're used in the public Web. Breakdown by scheme would be nice,
> but anything would be a start.
>
> Context for this enquiry is an investigation into the use of
> mechanisms like QR Codes and also audio encodings (eg.
> http://github.com/diva/digital-voices/ ) as a way of passing URIs
> around, eg. to a smartphone from a media centre. I'd like to know
> what's out there, what's feasible to encode using these techniques,
> as well as what the official limits are. In
> http://tools.ietf.org/html/rfc3986 I don't see much about URI length
> except in the reg-name portion.
>
> So - what are the official limits? what are the practical limits (eg.
> imposed by common implementations)? Can we say that 99.9% of URIs in
> the public Web are shorter than ...X chars?
>
> Ideally barcode and audio encodings wouldn't impose arbitrary limits;
> however it would be good to document what folk can expect to
> encounter, if only for sensible testing of error correction, reader
> accuracy etc.
>
> Thanks for any pointers,
>
> Dan
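
For anyone wanting to reproduce this kind of measurement, the power-of-two bucketing and the character-versus-byte distinction mentioned in the reply might be sketched roughly as below. This is only an illustration, not the code used for the stats above: the function names are hypothetical, and measuring length in UTF-8 bytes is an assumption, since the message does not say which encoding the byte counts refer to.

```python
from collections import Counter


def bucket_upper_bound(length: int) -> int:
    """Return the smallest power of two that is >= length (minimum 1)."""
    if length <= 1:
        return 1
    return 1 << (length - 1).bit_length()


def length_histogram(urls):
    """Count URLs per power-of-two bucket ("up to 32", "up to 64", ...).

    Lengths are measured in UTF-8 bytes, which is an assumption here.
    """
    return Counter(bucket_upper_bound(len(u.encode("utf-8"))) for u in urls)


def utf16_units(text: str) -> int:
    """Number of UTF-16 code units, the unit a UTF-16 based limit would count."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # Hypothetical sample input; real data would come from a crawl.
    sample = ["http://example.com/", "http://example.com/" + "a" * 50]
    for bound, count in sorted(length_histogram(sample).items()):
        print(f"up to {bound}: {count}")

    # A non-ASCII character can be one UTF-16 unit but several bytes in UTF-8.
    print(utf16_units("\u00e9"), len("\u00e9".encode("utf-8")))
```

A histogram like this gives the peak bucket directly, but, as the reply notes, it cannot recover the exact mean or median; that would require keeping the raw lengths.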
Received on Thursday, 8 April 2010 14:41:43 UTC