- From: ryan <ryan@theryanking.com>
- Date: Tue, 21 Aug 2007 11:35:10 -0700
- To: Dan Connolly <connolly@w3.org>
- Cc: public-html@w3.org
- Message-Id: <0B72581E-399E-4EB8-912D-B0B1471B288F@theryanking.com>
On Aug 21, 2007, at 7:44 AM, Dan Connolly wrote:

> On Mon, 2007-08-20 at 17:28 -0700, ryan wrote:
>> Section 4.7.4, which deals with sniffing for different content types,
>> has no mention of BOMs.[1]
>>
>> In implementing this, I encountered a case where this failed:
>>
>> http://www.armencomp.com/tradelog/trader_tax_topics.rss
>
> Thanks for finding a specific case.
>
> It's awkward for this WG to use that document as a test case,
> as it's not clear that we have license to republish it,
> create derivative works, etc.

Agreed. I tend to use cases like this in my work, but that's mostly a
private test suite.

> Would you please create a file that exhibits the same issue,
> and attach it to a message to this WG?

Done, test attached (hope it doesn't get mangled).

> There are perhaps other places you could put it and still
> make it clear that you're contributing it to this WG, but
> that's the simplest one that occurs to me just now.

I contribute tests to the html5lib project[1]. I, and I assume most
people working on that code, would be happy to contribute those tests
[2] to the WG.

>> Though this resource should be a problem with the sniffing algorithm
>> (since its served as text/plain, which shouldn't trigger the feed vs
>> html sniffing), it still illustrates the problem.
>>
>> Also, Firefox treats this as a feed, while Safari treats it as plain
>> text.
>
> Interesting.
>
> Have you given any thought to a format for expected results
> for a test case such as this?

For sniffing I've created one in the html5lib project already[3]. It's
JSON that includes a stream of bytes plus the type we expect the
sniffing algorithm to return.

As a sidenote, these are all examples taken from the wild web. I'm not
sure what copyright law says about quoting a small part of a document
for the purpose of building a test suite.

> I'm interested to start capturing claims about which implementation
> passes which test in machine-readable form.
> I had fun doing this with the GRDDL tests; see
> http://www.w3.org/2001/sw/grddl-wg/td/test_results
> http://www.w3.org/2001/sw/grddl-wg/td/earlsum.py
> http://www.w3.org/TR/grddl-tests/#earl-reporting
>
>> -ryan
>>
>> 1. http://www.whatwg.org/specs/web-apps/current-work/#content-type3
> aka http://www.w3.org/html/wg/html5/#content-type3

Of course. :) I've been using the what-wg version for several years, so
the habit's pretty ingrained.

-ryan

1. http://code.google.com/p/html5lib/
2. http://html5lib.googlecode.com/svn/trunk/testdata/
3. http://html5lib.googlecode.com/svn/trunk/testdata/sniffer/
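
To illustrate the problem discussed in this message, here is a minimal sketch, assuming a UTF-8 BOM and a deliberately simplified feed-vs-HTML check. It is not the algorithm from section 4.7.4 and not html5lib's sniffer. The function names, the returned type strings, and the "input" and "type" keys of the sample test case are all hypothetical; they only loosely follow the description above of a JSON test holding a stream of bytes plus the expected type.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark, if present (other BOMs are ignored here)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def sniff_feed_or_html(data: bytes) -> str:
    """Hypothetical, greatly simplified sniffer: look for an RSS root tag near the start."""
    # Skip the BOM first; a sniffer that compares raw leading bytes would
    # otherwise see 0xEF 0xBB 0xBF instead of the markup it expects.
    head = strip_bom(data).lstrip(b" \t\r\n")[:512].lower()
    if b"<rss" in head:
        return "application/rss+xml"
    return "text/html"


# Hypothetical test case in the spirit of "bytes plus expected type".
test_case = {
    "input": '\ufeff<?xml version="1.0"?><rss version="2.0">',
    "type": "application/rss+xml",
}


def run_case(case: dict) -> bool:
    """Return True when the sniffed type matches the expected one."""
    return sniff_feed_or_html(case["input"].encode("utf-8")) == case["type"]
```

Calling run_case(test_case) exercises the BOM-prefixed input; a sniffer that matched only on the first raw bytes would not recognise the same input as a feed.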
Attachments
- application/octet-stream attachment: bom.rss
Received on Tuesday, 21 August 2007 18:35:34 UTC