- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Tue, 31 Jul 2007 00:35:16 +0100
- To: "Sean Owen" <srowen@google.com>
- Cc: "public-mobileok-checker" <public-mobileok-checker@w3.org>

I'm not sure we are on exactly the same wavelength here, as I don't
think that we are all that worried about unidentified content types.
And I don't think that we are going very far down the "too general"
route really.

The issue with respect to images at least is that when presented with
an IMG or OBJECT element one does not know what the content type is in
advance of retrieving it. So one must retrieve it and construct an
appropriate object depending on the image type. If the image type is
not one we are interested in then we should construct an "unknown"
image object to hold at least the content type for later inspection.

Reporting where the references to images are found does seem like it
is important for a 1.0 implementation, especially when those images are
being retrieved as a result of a stylesheet which itself might be
imported from another stylesheet. Without that information I think
developers are likely to be very lost when they see results referring
to images they have never heard of and won't find in the primary
resource.

Jo

> -----Original Message-----
> From: Sean Owen [mailto:srowen@google.com]
> Sent: 30 July 2007 23:34
> To: Jo Rabin
> Cc: public-mobileok-checker
> Subject: Re: Proposed changes to Moki - External Resources Test
>
> My general reaction is that this is getting too complicated -- for
> version 1.0 at the very least. Retrieve images, examine their
> Content-Type. If missing, assume it's not a supported image. If
> present and it's GIF or JPEG, great, parse it. If it's something else,
> assume it's not supported.
>
> I strongly believe we need to favor simple solutions that solve the
> problem of "implement mobileOK Basic 1.0" first. It's good to develop
> this into a more general platform for evaluating a web resource but we
> haven't quite signed on for that just yet. At the moment scope and
> complexity appear to be outpacing progress towards an implementation.
> This particular issue -- unidentified content types -- feels
> corner-case-ish to me. I am happy for version 1.0 to go out with a
> crude reaction to this situation as long as it's handling the 99% of
> other cases usefully. And then this can be tackled.
>
> On 7/25/07, Jo Rabin <jrabin@mtld.mobi> wrote:
> >
> > Hi Laura
> >
> > Sorry about taking a long time to get back, you ask good questions!
> > Some thoughts.
> >
> > We should include all resources pointed to by img tags. I noticed in
> > the preprocess method that the checker seems to want to assess
> > content type by looking for the file extension. But that is really
> > not right at all. It's not necessarily the case that images will
> > have a file extension, and even when they do, it's an error to infer
> > the content type from them - see e.g. [1] and [2] which make it
> > clear that the resource must be retrieved and its Content-Type
> > header examined in order to determine its type.
> >
> > [1] http://www.w3.org/2001/tag/doc/metaDataInURI-31.html#erroneous
> >
> > [2] http://www.w3.org/2001/tag/doc/mime-respect#missing
> >
> > Even though the object element allows the specification of content
> > type, browsers typically taste the content of nested objects even in
> > the presence of this information to determine the actual content
> > type. Given that they stop when they find something they like, it's
> > a good question to ask whether the checker should continue and
> > whether and where it should put those references in moki. It's an
> > even better question to wonder how the xslt would differentiate
> > between those objects that should be counted and those that should
> > not.
> >
> > So some thoughts about the code:
> >
> > 1. Given that the image type is not known in advance of retrieving
> > it, and given that the image may not be of a known type, there seems
> > to be the need for a factory somewhere which constructs a JPEG
> > resource, a GIF resource or a generic image resource depending on
> > the result of the retrieval. It looks like the image element in moki
> > needs to be extended to include an image type which should be set to
> > the media type of the response under the imageInfo element.
> >
> > 2. When processing images (and links and so on) in the primary
> > document, I think that duplicates should not be suppressed and the
> > duplicate detection should be handled in the preprocess method.
> > Aside from anything else, the detection of duplicates should be done
> > on a canonical URI not just a text match (and on the absolute
> > version of the URI, for that matter). Though as we saw from a little
> > test that Dom put together real browsers do appear to do a textual
> > match, so that aspect of the behaviour needs to be centralized so we
> > can change it easily or control it by a switch.
> >
> > 3. In the CSSResource class, an image list needs to be constructed
> > and then processed as above.
> >
> > 4. The same observation applies to link elements as to images. Since
> > CSS files can include other CSS files they need to have a list of
> > included CSS and that needs to be preprocessed according to the same
> > URI matching strategy.
> >
> > 5. Ideally, each of the lists of URIs should provide a reference to
> > where they were found in the source of whatever document they were
> > found in, both for error reporting purposes (did I hear a collective
> > groan about line and column number references? :( ) and so that the
> > moki document can provide the info that an image/css was in error
> > and is referenced in 7 places rather than just one.
> >
> > 6. I think there is a need for an objects element in moki. It should
> > contain objects and the objects should say a) what their content
> > type is and b) whether they should be counted as an external
> > reference. That should be easy enough to do. What's not so obvious
> > is what to do about text/html when it is found in an object and I
> > think the answer is that it should be counted and skipped.
> >
> > 7. Oh, and finally, before I forget. There is the case (401
> > Authentication) where both the page presented with the response and
> > the primary document are tested and the external resources from the
> > authentication page are added to the total. On reflection, I think
> > we should think again about this behaviour before we go to the next
> > last call of the mobileOK doc. And not worry about it in the code
> > for now. (Famous last words)
> >
> > I've just checked in some updates with a couple of TODOs in the
> > relevant places, I hope.
> >
> > I'll also update the moki example doc with the suggestions I made.
> > And while I am about it I will generate a schema for moki. It's
> > about time.
> >
> > Hope this helps. Oh and these are just my suggestions, you or anyone
> > else may have better ones.
> >
> > Jo
> >
> > ________________________________
> >
> > From: public-mobileok-checker-request@w3.org
> > [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Laura
> > Holmes
> > Sent: 24 July 2007 23:54
> > To: public-mobileok-checker
> > Subject: Proposed changes to Moki - External Resources Test
> >
> > Hi all,
> > I just wanted to run some changes by you all and get some feedback.
> > Currently, I'm working on the ExternalResourcesTest and am running
> > into conditions that haven't been accounted for in the existing code.
> > These conditions include:
> >
> > 1) counting references contained in objects that are not jpeg or
> > gif: there are many other image types and other types of objects
> > (such as applications or audio) that may be included on a page. I'm
> > assuming that we want to include these references even if they can't
> > be rendered on a mobile phone due to a comment made regarding nested
> > objects: "For nested object elements, count only the number of
> > objects that need to be assessed before content matching the request
> > header defined in 2.3.2 HTTP Request is found." So, we want to
> > assess other content types other than jpeg and gif when counting
> > external resources.
> >
> > 2) keeping track of unique references to resources that are other
> > than jpeg or gif:
> > If two references are made in the primary document to the same
> > image, it is only counted once, but if we reference the same image
> > in css, we currently don't have a way of tracking this.
> >
> > 3) references contained in nested objects are counted regardless of
> > whether or not the reference is actually reached:
> > We only identify object nodes by name, not in serial order.
> >
> > Here are my proposed changes I want to make, which would entail
> > changing the shape of the moki doc a bit:
> >
> > We create an ArrayList of URIs that is maintained throughout the
> > entirety of the parsing process. When a reference to a resource is
> > encountered, we check to see if the list already contains that URI.
> > This list will contain a list of all the resources contained in both
> > the primary doc and css files. At the end of the parsing process, we
> > can add an additional node to any location in the moki that states
> > the length of the list (i.e. how many unique resources were
> > encountered). I propose adding this as its own node under moki, as
> > it spans information in the primary doc, images, and css.
> > Because we only want to record the number of unique references, I
> > can't see any other way to pull it from the moki document using xsl.
> > I'm open to any other suggestions.
> >
> > As to the nested object problem, I'm at a loss for solutions given
> > our current implementation of the DOM. Suggestions?
> >
> > Thanks for your input in advance,
> > Laura
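
For illustration only, here is a rough sketch of how points 1, 2 and 5 of Jo's list of 25 July might fit together: a factory that picks the resource type from the Content-Type of the response (never from the file extension), and a single registry that detects duplicates on the canonical absolute URI while remembering every place a reference was found. It is written in Python for brevity and is not taken from the checker's code. Every name in it (make_image_resource, ResourceRegistry, textual_match and so on) is hypothetical, and the canonicalisation shown is a deliberately small assumption (lower-cased scheme and host, fragment dropped), not a complete normalisation.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin, urlsplit, urlunsplit


@dataclass
class ImageResource:
    """A retrieved image plus the media type the server reported for it."""
    uri: str
    media_type: str


class JpegResource(ImageResource):
    pass


class GifResource(ImageResource):
    pass


class UnknownImageResource(ImageResource):
    """Anything that is not JPEG or GIF; keeps the media type for later."""


# Hypothetical mapping from media type to resource class.
_BY_MEDIA_TYPE = {"image/jpeg": JpegResource, "image/gif": GifResource}


def make_image_resource(uri, content_type_header):
    """Choose the resource class from the Content-Type header value."""
    # Drop parameters such as "; charset=..." and normalise case.
    media_type = (content_type_header or "").split(";", 1)[0].strip().lower()
    cls = _BY_MEDIA_TYPE.get(media_type, UnknownImageResource)
    return cls(uri=uri, media_type=media_type)


def canonical_uri(base, reference):
    """Resolve against the base, lower-case scheme and host, drop fragment.

    Simplification: the whole authority is lower-cased, which assumes the
    URI carries no case-sensitive userinfo.
    """
    parts = urlsplit(urljoin(base, reference))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


@dataclass
class ResourceRegistry:
    """One central place for the matching strategy (hypothetical switch)."""
    textual_match: bool = False
    _seen: dict = field(default_factory=dict, init=False)

    def record(self, base, reference, found_in, line):
        """Note a reference and where it was found; duplicates are kept."""
        key = reference if self.textual_match else canonical_uri(base, reference)
        self._seen.setdefault(key, []).append((found_in, line))

    def unique_count(self):
        return len(self._seen)

    def references(self, key):
        return list(self._seen.get(key, []))


if __name__ == "__main__":
    registry = ResourceRegistry()
    registry.record("http://example.com/index.html", "img/logo.gif", "index.html", 12)
    registry.record("http://example.com/css/site.css", "../img/logo.gif", "css/site.css", 3)
    print(registry.unique_count())  # both references resolve to the same URI
```

Keeping the key computation in one method is what would let the textual match that real browsers appear to use be switched on without touching the callers, and storing (document, line) pairs per key gives both the unique count Laura asks for and the "referenced in 7 places" report.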
Received on Monday, 30 July 2007 23:35:41 UTC