- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Wed, 25 Jul 2007 17:23:42 +0100
- To: "public-mobileok-checker" <public-mobileok-checker@w3.org>
- Message-ID: <C8FFD98530207F40BD8D2CAD608B50B4484198@mtldsvr01.DotMobi.local>
Hi Laura

Sorry about taking a long time to get back, you ask good questions! Some thoughts.

We should include all resources pointed to by img tags. I noticed in the preprocess method that the checker seems to want to assess content type by looking for the file extension. But that is really not right at all. It's not necessarily the case that images will have a file extension, and even when they do, it's an error to infer the content type from them - see e.g. [1] and [2] which make it clear that the resource must be retrieved and its Content-Type header examined in order to determine its type.

[1] http://www.w3.org/2001/tag/doc/metaDataInURI-31.html#erroneous
[2] http://www.w3.org/2001/tag/doc/mime-respect#missing

Even though the object element allows the specification of content type, browsers typically taste the content of nested objects even in the presence of this information to determine the actual content type. Given that they stop when they find something they like, it's a good question to ask whether the checker should continue and whether and where it should put those references in moki. It's an even better question to wonder how the xslt would differentiate between those objects that should be counted and those that should not.

So some thoughts about the code:

1. Given that the image type is not known in advance of retrieving it, and given that the image may not be of a known type, there seems to be the need for a factory somewhere which constructs a JPEG resource, a GIF resource or a generic image resource depending on the result of the retrieval. It looks like the image element in moki needs to be extended to include an image type which should be set to the media type of the response under the imageInfo element.

2. When processing images (and links and so on) in the primary document, I think that duplicates should not be suppressed and the duplicate detection should be handled in the preprocess method.
Aside from anything else, the detection of duplicates should be done on a canonical URI, not just a text match (and on the absolute version of the URI, for that matter). Though, as we saw from a little test that Dom put together, real browsers do appear to do a textual match, so that aspect of the behaviour needs to be centralized so we can change it easily or control it by a switch.

3. In the CSSResource class, an image list needs to be constructed and then processed as above.

4. The same observation applies to link elements as to images. Since CSS files can include other CSS files, they need to have a list of included CSS, and that needs to be preprocessed according to the same URI matching strategy.

5. Ideally, each of the lists of URIs should provide a reference to where they were found in the source of whatever document they were found in, both for error reporting purposes (did I hear a collective groan about line and column number references :-( ) and so that the moki document can provide the info that an image/css was in error and is referenced in 7 places rather than just one.

6. I think there is a need for an objects element in moki. It should contain objects and the objects should say a) what their content type is and b) whether they should be counted as an external reference. That should be easy enough to do. What's not so obvious is what to do about text/html when it is found in an object, and I think the answer is that it should be counted and skipped.

7. Oh, and finally, before I forget. There is the case (401 Authentication) where both the page presented with the response and the primary document are tested and the external resources from the authentication page are added to the total. On reflection, I think we should think again about this behaviour before we go to the next last call of the mobileOK doc. And not worry about it in the code for now. (Famous last words)

I've just checked in some updates with a couple of TODOs in the relevant places, I hope.
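
For point 1 above, the factory might look roughly like the following. This is only a sketch under assumptions: the class names ImageResource, JpegResource, GifResource and ImageResourceFactory are hypothetical and not the checker's actual classes, and the only input is the Content-Type header value from the HTTP response, never the file extension.

```java
import java.util.Locale;

// Hypothetical resource types, named for illustration only.
class ImageResource {
    final String mediaType;
    ImageResource(String mediaType) { this.mediaType = mediaType; }
}

class JpegResource extends ImageResource {
    JpegResource() { super("image/jpeg"); }
}

class GifResource extends ImageResource {
    GifResource() { super("image/gif"); }
}

final class ImageResourceFactory {
    private ImageResourceFactory() {}

    /** Chooses a resource class from the Content-Type header of the response. */
    static ImageResource fromContentType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            // No header at all: the type is unknown, so fall back to the generic class.
            return new ImageResource("");
        }
        // Drop any parameters after ';' and normalise case before comparing.
        String mediaType = contentTypeHeader.split(";", 2)[0]
                .trim()
                .toLowerCase(Locale.ROOT);
        switch (mediaType) {
            case "image/jpeg":
                return new JpegResource();
            case "image/gif":
                return new GifResource();
            default:
                // Unrecognised type: keep the media type so it can be recorded in moki.
                return new ImageResource(mediaType);
        }
    }
}
```

The mediaType field is what would be written out as the image type under imageInfo, so unknown types are still recorded rather than dropped.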
I'll also update the moki example doc with the suggestions I made. And while I am about it I will generate a schema for moki. It's about time.

Hope this helps. Oh and these are just my suggestions, you or anyone else may have better ones.

Jo

________________________________

From: public-mobileok-checker-request@w3.org [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Laura Holmes
Sent: 24 July 2007 23:54
To: public-mobileok-checker
Subject: Proposed changes to Moki - External Resources Test

Hi all,

I just wanted to run some changes by you all and get some feedback. Currently, I'm working on the ExternalResourcesTest and am running into conditions that haven't been accounted for in the existing code. These conditions include:

1) counting references contained in objects that are not jpeg or gif: there are many other image types and other types of objects (such as applications or audio) that may be included on a page. Based on a comment made regarding nested objects, I'm assuming that we want to include these references even if they can't be rendered on a mobile phone:

"For nested object elements, count only the number of objects that need to be assessed before content matching the request header defined in 2.3.2 HTTP Request <http://www.w3.org/TR/mobileOK-basic10-tests/#http_request> is found."

So, we want to assess content types other than jpeg and gif when counting external resources.

2) keeping track of unique references to resources that are other than jpeg or gif: If two references are made in the primary document to the same image, it is only counted once, but if we reference the same image in css, we currently don't have a way of tracking this.

3) references contained in nested objects are counted regardless of whether or not the reference is actually reached: We only identify object nodes by name, not in serial order.
Here are my proposed changes, which would entail changing the shape of the moki doc a bit:

We create an ArrayList of URIs that is maintained throughout the entirety of the parsing process. When a reference to a resource is encountered, we check to see if the list already contains that URI. This list will contain all the resources referenced in both the primary doc and css files. At the end of the parsing process, we can add an additional node to any location in the moki that states the length of the list (i.e. how many unique resources were encountered). I propose adding this as its own node under moki, as it spans information in the primary doc, images, and css. Because we only want to record the number of unique references, I can't see any other way to pull it from the moki document using xsl. I'm open to any other suggestions.

As to the nested object problem, I'm at a loss for solutions given our current implementation of the DOM. Suggestions?

Thanks for your input in advance,

Laura
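
The shared list of unique references, combined with the switchable matching strategy suggested in the reply, could be sketched as follows. This is an illustration under assumptions, not the checker's code: ResourceTracker and its Matching switch are hypothetical names, a Set stands in for the proposed ArrayList so the contains check is implicit, and "canonical" here only means resolving against the base URI and removing dot segments with java.net.URI.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: one instance is shared across the primary doc and all CSS files.
final class ResourceTracker {
    /** Switch between browser-like textual matching and canonical absolute-URI matching. */
    enum Matching { TEXTUAL, CANONICAL }

    private final Matching matching;
    private final Set<String> seen = new LinkedHashSet<>();

    ResourceTracker(Matching matching) {
        this.matching = matching;
    }

    /**
     * Records a reference found in the document located at baseUri.
     * Returns true if the reference has not been seen before.
     */
    boolean add(URI baseUri, String reference) {
        return seen.add(key(baseUri, reference));
    }

    /** Number of unique resources encountered so far. */
    int uniqueCount() {
        return seen.size();
    }

    private String key(URI baseUri, String reference) {
        if (matching == Matching.TEXTUAL) {
            return reference; // compare the text exactly as written in the source
        }
        // Make the reference absolute, then remove "." and ".." path segments.
        // URI.resolve throws IllegalArgumentException if the reference is not a valid URI.
        return baseUri.resolve(reference.trim()).normalize().toString();
    }
}
```

Because the matching rule lives in one place, changing the behaviour means changing only the constructor argument; the value of uniqueCount() is what would go into the proposed count node under moki.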
Received on Wednesday, 25 July 2007 16:24:17 UTC