- From: Laura Holmes <holmes@google.com>
- Date: Fri, 27 Jul 2007 10:05:59 -0400
- To: "Jo Rabin" <jrabin@mtld.mobi>, public-mobileok-checker <public-mobileok-checker@w3.org>
- Message-ID: <135a9f560707270705u3b0635d6s5a7c66951112dd56@mail.gmail.com>
On second thought, here's an addendum to that class I mentioned...

In the master list of external resources, the class/struct should really
be made up of a key (the actual URI of the resource) and a list of
references. That way, when the resource is processed via its URI, the
references can be added as nodes within the resource object.

On 7/26/07, Laura Holmes <holmes@google.com> wrote:
>
> > 3. In the CSSResource class, an image list needs to be constructed and
> > then processed as above.
>
> > It sounds as if you're suggesting building up a list of references within
> > the style sheet independent of the images contained in the primary document.
> > I proposed the idea of a universal list of references because it's very
> > possible that a stylesheet could reference the same images as referenced in
> > the primary document. If we construct an independent resource list from the
> > css and the primary document, we might get some duplications.
>
> > Well, what I am suggesting is that rather than passing a single list
> > around the place, each resource is responsible for managing and maintaining
> > its own list. I agree that there needs to be a central "cache" list so that
> > when about to check a resource you look at the list and see if you already
> > did it. I think that it should be noted specifically in the moki document
> > that you didn't retrieve it because you think it's the same as some
> > specified other resource.
>
> What I think you mean is that each resource will manage its own list of
> resources, but we'll pull all those resources together to create a master
> list before we process them. If each resource manages its own list, as a
> list of HTTPResource objects, duplicate resources will exist under different
> resources' lists. In order to make sure we get the whole master list of
> resources before processing them, we'd probably have to completely dive into
> the css, create the list of image and object resources to be processed, and
> then process that list.
>
> Just so I'm completely certain of the structure of the moki, here's an
> example of how I see things being structured right now. Correct me if I'm
> thinking about this the wrong way or missing any nuances.
>
> <moki>
>   <primaryDoc/>
>   <stylesheets>
>     <stylesheet type="external"/>
>       (references the stylesheet listed below, but you'd only
>       know if you looked at the source of this css document)
>     <stylesheet type="external"/>
>   </stylesheets>
>   <images>
>     <image>
>       <reference/>
>       <reference/>
>     </image>
>     <image>
>       <reference/>
>       <reference/>
>     </image>
>   </images>
>   <objects>
>     <object>
>       <reference/>
>       <reference/>
>     </object>
>   </objects>
> </moki>
>
> The method of keeping track of the resource list that most closely resembles
> the final output, as represented by this model, would be an array list of
> external resources, where each entry keeps track of where it is referenced.
> This might involve a simple external resource class, such as:
>
> public class ExternalResource {
>     private ArrayList<reference> references;
>     private HTTP____Resource;
> }
>
> I'm still figuring out how this would affect the flow of preprocessing,
> but I think fully processing the stylesheets as we encounter them would
> work, while each css style sheet maintains its own list of resource URIs. If
> one of those resources is a stylesheet, then we process the stylesheet while
> maintaining a simple cache list of stylesheets already processed. Then, as
> soon as all stylesheets are processed, we coalesce all the external
> resources that are either images or objects into a master list, and then we
> create the master list of fully fledged HTTPResources.
>
> Thoughts?
>
> Cheers,
> Laura
>
> P.S. Jo - I'd recommend Coupa Cafe as it's one of my favs and has free
> wireless internet.
> :)
>
> On 7/26/07, Jo Rabin <jrabin@mtld.mobi> wrote:
> >
> > *From:* Laura Holmes [mailto:holmes@google.com]
> > *Sent:* 26 July 2007 16:13
> > *To:* Jo Rabin
> > *Subject:* Re: Proposed changes to Moki - External Resources Test
> >
> > > It looks like the image element in moki needs to be extended to
> > > include an image type which should be set to the media type of the response
> > > under the imageInfo element.
> >
> > Given that there may be a difference between declared image type and the
> > actual image type, if we include an element that states the retrieved file
> > type and there's an element that states the declared image type, should we
> > issue a warning if these two pieces of information don't match?
> >
> > Sounds like a good idea, though this will only happen for Objects,
> > right? Img doesn't allow you to state the content type.
> >
> > > 2. When processing images (and links and so on) in the primary
> > > document, I think that duplicates should not be suppressed and the duplicate
> > > detection should be handled in the preprocess method.
> >
> > I'm unclear as to what we do after we've detected a duplicate, or how it
> > should be handled. I think we had a conversation a while ago about how
> > exactly multiple references should be represented within the moki document,
> > but I'm not entirely sure what we decided on. What I felt was the most
> > likely conclusion was that we record all references to that image, request
> > the image once, and then in the corresponding image moki information have
> > the image info mentioned once and all the listed references with line
> > numbers included.
> >
> > Yes, that sounds good to me
> >
> > Is some of this answered with the moki example doc? Sean's out of town
> > and I don't have the link. If someone could send that to me, that'd be
> > great.
> >
> > No, it's not, as I haven't updated it to do so :( . I will do if it is
> > not too late over the next couple of days.
> > Fwiw the current example doc is
> > referenced from the TF home page [1].
> >
> > [1] http://www.w3.org/2005/MWI/BPWG/Group/TaskForces/Checker/Overview.html
> >
> > > 3. In the CSSResource class, an image list needs to be constructed and
> > > then processed as above.
> >
> > It sounds as if you're suggesting building up a list of references
> > within the style sheet independent of the images contained in the primary
> > document. I proposed the idea of a universal list of references because it's
> > very possible that a stylesheet could reference the same images as
> > referenced in the primary document. If we construct an independent resource
> > list from the css and the primary document, we might get some duplications.
> >
> > Well, what I am suggesting is that rather than passing a single list
> > around the place, each resource is responsible for managing and maintaining
> > its own list. I agree that there needs to be a central "cache" list so that
> > when about to check a resource you look at the list and see if you already
> > did it. I think that it should be noted specifically in the moki document
> > that you didn't retrieve it because you think it's the same as some
> > specified other resource.
> >
> > > 4. The same observation applies to link elements as to images. Since
> > > CSS files can include other CSS files they need to have a list of included
> > > CSS and that needs to be preprocessed according to the same URI matching
> > > strategy.
> >
> > How deep are we diving as far as included resources? If a stylesheet
> > @imports another css file, do we evaluate that stylesheet as well? Or do we
> > just include the URI as an external resource?
> >
> > We go as far as it takes (modulo media type restrictions) as that is
> > what a browser would (should) do.
> >
> > > (Did I hear a collective groan about line and column number references
> > > :( )
> >
> > Yes, but not because it's impossible, it's just imperfect.
> > As for final
> > recording of errors, we can reference line numbers (not column numbers) from
> > the primaryDoc/docContent of the moki document. This would give a roughly
> > accurate location, except that if the original source document had reduced
> > white space (as we suggest), the line numbers we report would not be the
> > actual line numbers of the source document. However, if we chose to report
> > the snippet of code that corresponds to that line number, that snippet would
> > provide much more specific and useful information.
> >
> > However, right now we only contain the docContent for the primary doc,
> > not the included CSSResources. For our current line-reporting solution, we
> > would also have to include the source for the css pages. For the sake of
> > error reporting but at the expense of keeping the moki smaller, would
> > everyone like to include that information?
> >
> > Yes, I think that is quite important. If we report an error in a
> > resource that is not actually referenced from the primary document it could
> > leave a developer scratching their head for quite a long time, unless there
> > is some way of finding what resource the error is in and where it is in that
> > resource.
> >
> > > 6. b) whether they should be counted as an external reference
> >
> > I think counting objects as an external resource is a good idea, but I'm
> > not sure what the criteria for being counted as an external resource would
> > include. As soon as I know, I can start working on it.
> >
> > That it would actually be retrieved in realistic situations. So each
> > object and its fall-back gets retrieved until one is found that matches the
> > request criteria.
> >
> > Thanks for your detailed response, Jo.
> >
> > Sorry I didn't update the moki example doc yet, I will do that soon –
> > flying to Palo Alto tomorrow will do once there.
> > From a coffee shop in
> > University Avenue, maybe :)
> >
> > Jo
> >
> > On 7/25/07, *Jo Rabin* <jrabin@mtld.mobi> wrote:
> >
> > Hi Laura
> >
> > Sorry about taking a long time to get back, you ask good questions! Some
> > thoughts.
> >
> > We should include all resources pointed to by img tags. I noticed in the
> > preprocess method that the checker seems to want to assess content type by
> > looking for the file extension. But that is really not right at all. It's
> > not necessarily the case that images will have a file extension, and even
> > when they do, it's an error to infer the content type from them – see
> > e.g. [1] and [2] which make it clear that the resource must be retrieved
> > and its Content-Type header examined in order to determine its type.
> >
> > [1] http://www.w3.org/2001/tag/doc/metaDataInURI-31.html#erroneous
> >
> > [2] http://www.w3.org/2001/tag/doc/mime-respect#missing
> >
> > Even though the object element allows the specification of content type,
> > browsers typically taste the content of nested objects even in the presence
> > of this information to determine the actual content type. Given that they
> > stop when they find something they like, it's a good question to ask whether
> > the checker should continue and whether and where it should put those
> > references in moki. It's an even better question to wonder how the xslt
> > would differentiate between those objects that should be counted and those
> > that should not.
> >
> > So some thoughts about the code:
> >
> > 1. Given that the image type is not known in advance of retrieving it,
> > and given that the image may not be of a known type, there seems to be the
> > need for a factory somewhere which constructs a JPEG resource, a GIF
> > resource or a generic image resource depending on the result of the
> > retrieval.
> > It looks like the image element in moki needs to be extended to
> > include an image type which should be set to the media type of the response
> > under the imageInfo element.
> >
> > 2. When processing images (and links and so on) in the primary document,
> > I think that duplicates should not be suppressed and the duplicate detection
> > should be handled in the preprocess method. Aside from anything else, the
> > detection of duplicates should be done on a canonical URI not just a text
> > match (and on the absolute version of the URI, for that matter). Though, as
> > we saw from a little test that Dom put together, real browsers do appear to
> > do a textual match, so that aspect of the behaviour needs to be centralized
> > so we can change it easily or control it by a switch.
> >
> > 3. In the CSSResource class, an image list needs to be constructed and
> > then processed as above.
> >
> > 4. The same observation applies to link elements as to images. Since CSS
> > files can include other CSS files they need to have a list of included CSS
> > and that needs to be preprocessed according to the same URI matching
> > strategy.
> >
> > 5. Ideally, each of the lists of URIs should provide a reference to
> > where they were found in the source of whatever document they were found in
> > for error reporting purposes. (Did I hear a collective groan about line and
> > column number references :( ) and so that the moki document can provide
> > the info that an image/css was in error and is referenced in 7 rather than
> > just one place.
> >
> > 6. I think there is a need for an objects element in moki. It should
> > contain objects and the objects should say a) what their content type is and
> > b) whether they should be counted as an external reference. That should be
> > easy enough to do.
> > What's not so obvious is what to do about text/html when
> > it is found in an object and I think the answer is that it should be counted
> > and skipped.
> >
> > 7. Oh, and finally, before I forget. There is the case (401
> > Authentication) where both the page presented with the response and the
> > primary document are tested and the external resources from the
> > authentication page are added to the total. On reflection, I think we should
> > think again about this behaviour before we go to the next last call of the
> > mobileOK doc. And not worry about it in the code for now. (Famous last
> > words)
> >
> > I've just checked in some updates with a couple of TODOs in the relevant
> > places, I hope.
> >
> > I'll also update the moki example doc with the suggestions I made. And
> > while I am about it I will generate a schema for moki. It's about time.
> >
> > Hope this helps. Oh and these are just my suggestions, you or anyone
> > else may have better ones.
> >
> > Jo
> >
> > ------------------------------
> >
> > *From:* public-mobileok-checker-request@w3.org
> > [mailto:public-mobileok-checker-request@w3.org] *On Behalf Of* Laura Holmes
> > *Sent:* 24 July 2007 23:54
> > *To:* public-mobileok-checker
> > *Subject:* Proposed changes to Moki - External Resources Test
> >
> > Hi all,
> > I just wanted to run some changes by you all and get some feedback.
> > Currently, I'm working on the ExternalResourcesTest and am running into
> > conditions that haven't been accounted for in the existing code. These
> > conditions include:
> >
> > 1) counting references contained in objects that are not jpeg or gif:
> > there are many other image types and other types of objects (such as
> > applications or audio) that may be included on a page.
> > I'm assuming that we
> > want to include these references even if they can't be rendered on a mobile
> > phone due to a comment made regarding nested objects: "For nested object
> > elements, count only the number of objects that need to be assessed before
> > content matching the request header defined in *2.3.2 HTTP Request*
> > <http://www.w3.org/TR/mobileOK-basic10-tests/#http_request> is found."
> > So, we want to assess content types other than jpeg and gif
> > when counting external resources.
> >
> > 2) keeping track of unique references to resources that are other than
> > jpeg or gif:
> > If two references are made in the primary document to the same image, it
> > is only counted once, but if we reference the same image in css, we
> > currently don't have a way of tracking this.
> >
> > 3) references contained in nested objects are counted regardless of
> > whether or not the reference is actually reached:
> > We only identify object nodes by name, not in serial order.
> >
> > Here are my proposed changes I want to make, which would entail changing
> > the shape of the moki doc a bit:
> >
> > We create an ArrayList of URIs that is maintained throughout the
> > entirety of the parsing process. When a reference to a resource is
> > encountered, we check to see if the list already contains that URI. This
> > list will contain a list of all the resources contained in both the primary
> > doc and css files. At the end of the parsing process, we can add an
> > additional node to any location in the moki that states the length of the
> > list (i.e. how many unique resources were encountered). I propose
> > adding this as its own node under moki, as it spans information in the
> > primary doc, images, and css. Because we only want to record the number of
> > unique references, I can't see any other way to pull it from the moki
> > document using xsl. I'm open to any other suggestions.
> >
> > As to the nested object problem, I'm at a loss for solutions given our
> > current implementation of the DOM. Suggestions?
> >
> > Thanks for your input in advance,
> > Laura
> >
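
For illustration only, the addendum at the top of this message (a master list whose entries pair a resource URI with its list of references) might be sketched roughly as below. This is not the checker's code: the names ResourceRegistry, Reference, canonical and add are hypothetical, and comparing on the resolved, normalized absolute URI is just one reading of the "canonical URI" suggestion in the thread. The comparison lives in a single method so it could be swapped for a plain textual match.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical master list: one entry per unique resource URI. */
final class ResourceRegistry {

    /** Hypothetical record of where a resource was referenced. */
    static final class Reference {
        final String referrer; // document (primary doc or stylesheet) containing the reference
        final int line;        // line number within that document

        Reference(String referrer, int line) {
            this.referrer = referrer;
            this.line = line;
        }
    }

    // Key: canonical URI of the resource. Value: every place it was referenced.
    private final Map<URI, List<Reference>> entries = new LinkedHashMap<>();

    /**
     * Assumed matching strategy: resolve the reference against the referring
     * document's URI, then normalize "." and ".." path segments.
     */
    static URI canonical(URI base, String href) {
        return base.resolve(href).normalize();
    }

    /**
     * Records one reference. Returns true if the URI has not been seen
     * before, i.e. the resource still needs to be retrieved.
     */
    boolean add(URI base, String href, int line) {
        URI key = canonical(base, href);
        boolean isNew = !entries.containsKey(key);
        entries.computeIfAbsent(key, k -> new ArrayList<>())
               .add(new Reference(base.toString(), line));
        return isNew;
    }

    /** Number of unique resources encountered so far. */
    int uniqueCount() {
        return entries.size();
    }

    /** All recorded references to the given canonical URI (empty if none). */
    List<Reference> referencesTo(URI key) {
        return entries.getOrDefault(key, Collections.emptyList());
    }
}
```

Under these assumptions, an image referenced as "img/logo.gif" from a page and as "../img/logo.gif" from a stylesheet one directory down would land in the same entry, be retrieved once, and carry two references.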
Received on Friday, 27 July 2007 14:06:21 UTC