content type sniffing - unknown type from Ryan King on 2008-07-12 (public-html@w3.org from July 2008)

From: Ryan King <ryan@theryanking.com>
Date: Fri, 11 Jul 2008 18:06:30 -0700
To: public-html@w3.org
Cc: implementors@whatwg.org
Message-Id: <00256E09-A249-44A4-A629-A69F54453D31@theryanking.com>

I'm working on a content type sniffing implementation based on the
current spec, which will eventually make it into html5lib (it's part of
a separate project for now).

Anyway, in "2.7.4 Content-Type sniffing: unknown type", I think
a few things are flipped around. Where it says "Examine the
index<sub>stream</sub>th byte of the byte stream as follows:", I think
it should actually refer to the index<sub>pattern</sub>th byte of
the pattern.

The way I understand the algorithm is like this:

  walk through the pattern
    if we're at a WS byte
      consume all the whitespace in the byte stream
    else
      'and' the current stream byte with the mask and test the
      result against pattern[index<sub>pattern</sub>]
  if we made it through without a mismatch, return the given type.

Implementing it this way has yielded the expected results (i.e., the
examples given in the comments work).
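
A minimal Python sketch of that reading, for illustration only (this is
not the html5lib code). The names WS, WHITESPACE and matches are
hypothetical, the whitespace byte set is an assumption, and the mask is
assumed to have one entry per pattern entry, with the entry at a WS
position ignored:

```python
# Hypothetical marker for a "WS" entry in a pattern.
WS = None

# Assumed set of whitespace bytes: TAB, LF, VT, FF, CR, SPACE.
WHITESPACE = frozenset(b"\t\n\x0b\x0c\r ")


def matches(stream, mask, pattern):
    """Return True if the start of `stream` matches `pattern` under `mask`."""
    index_stream = 0
    # Walk the pattern, not the stream.
    for index_pattern, expected in enumerate(pattern):
        if expected is WS:
            # Consume zero or more whitespace bytes from the stream.
            while (index_stream < len(stream)
                   and stream[index_stream] in WHITESPACE):
                index_stream += 1
        else:
            if index_stream >= len(stream):
                return False  # stream ended before the pattern did
            # Mask the stream byte, then compare with the pattern byte.
            if stream[index_stream] & mask[index_pattern] != expected:
                return False
            index_stream += 1
    return True


# Illustrative row: optional whitespace, then "<HTML" in either case
# (0xDF clears the ASCII case bit).
HTML_MASK = [0xFF, 0xFF, 0xDF, 0xDF, 0xDF, 0xDF]
HTML_PATTERN = [WS, 0x3C, 0x48, 0x54, 0x4D, 0x4C]

print(matches(b"  \n<html>", HTML_MASK, HTML_PATTERN))
print(matches(b"<body>", HTML_MASK, HTML_PATTERN))
```

A caller would try each row of the table in order and return the type
of the first row for which matches() is true.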

-ryan

Received on Saturday, 12 July 2008 01:07:10 UTC