content type sniffing - unknown type

I'm working on a content type sniffing implementation based on the
current spec that will eventually make it into html5lib (it's part of
a separate project for now).

Anyway, in "2.7.4 Content-Type sniffing: unknown type", I think
a few things are flipped around. Where it says "Examine the
index<sub>stream</sub>th byte of the byte stream as follows:", I think
it should actually be referring to the index<sub>pattern</sub>th
byte of the pattern.

The way I understand it, the algorithm is like this:

  walk through the pattern
   if we're at a WS byte
     consume all the whitespace
   else
     do the 'and' operation on the current stream byte with the mask
     and test it against pattern[index<sub>pattern</sub>]
  if we made it through without a mismatch, return the given type.
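
For what it's worth, here is a rough Python sketch of the steps above. It is only my reading of the algorithm, not the spec's wording. The names `WS`, `WHITESPACE`, `matches`, `sniff` and `TABLE` are made up for illustration, the set of whitespace bytes is an assumption, and so is the idea that the mask and pattern are the same length (with the mask entry at a WS position simply ignored). The single table row is an illustrative "<HTML" entry, not copied from the spec.

```python
WS = object()  # hypothetical sentinel standing in for a "WS" pattern entry
# Assumed whitespace bytes: TAB, LF, FF, CR, SPACE.
WHITESPACE = frozenset(b"\t\n\x0c\r ")


def matches(stream, mask, pattern):
    """Return True if the start of `stream` matches `pattern` under `mask`."""
    index_stream = 0
    # Walk through the pattern; index_pattern drives the loop.
    for index_pattern, expected in enumerate(pattern):
        if expected is WS:
            # Consume zero or more whitespace bytes from the stream.
            while index_stream < len(stream) and stream[index_stream] in WHITESPACE:
                index_stream += 1
            continue
        if index_stream >= len(stream):
            return False  # ran out of stream before the pattern ended
        # 'and' the stream byte with the mask, then compare to the pattern byte.
        if (stream[index_stream] & mask[index_pattern]) != expected:
            return False
        index_stream += 1
    return True


def sniff(stream, table):
    """Return the type of the first matching row, or None if nothing matches."""
    for mask, pattern, sniffed_type in table:
        if matches(stream, mask, pattern):
            return sniffed_type
    return None


# Illustrative row: optional whitespace, then "<HTML" compared case-insensitively
# (the 0xDF mask clears the ASCII lowercase bit).
TABLE = [
    ([0xFF, 0xFF, 0xDF, 0xDF, 0xDF, 0xDF],
     [WS, 0x3C, 0x48, 0x54, 0x4D, 0x4C],
     "text/html"),
]
```

With that table, `sniff(b"  \n<html>", TABLE)` gives "text/html", while `sniff(b"<body>", TABLE)` gives None.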

Implementing it this way has yielded the expected results (i.e., the
examples given in the comments work).

-ryan

Received on Saturday, 12 July 2008 01:07:10 UTC