Re: Subtests that apply to HTTP header fields [was: Re: mobileOK validation logic - jar file?] from Francois Daoust on 2009-02-12 (public-mobileok-checker@w3.org from February 2009)

From: Francois Daoust <fd@w3.org>
Date: Thu, 12 Feb 2009 18:43:09 +0100
To: Yeliz Yesilada <yesilady@cs.man.ac.uk>
CC: public-mobileok-checker@w3.org, Kentarou Fukuda <KENTAROU@jp.ibm.com>, Yeliz Yesilada <yeliz.yesilada@manchester.ac.uk>
Message-ID: <49945FAD.9010107@w3.org>

Yeliz Yesilada wrote:
> Hi Francois,
> 
> Thanks for these and also for your previous suggestions. I have been 
> looking at the documentation and also the code, and here are my initial 
> thoughts:
> 
> 1. A mOKI (intermediary) document will be created from the local file 
> with some missing information (e.g., HTTPRequest and HTTPResponse, 
> etc...)  [Note: Assuming that they will not be provided externally, may 
> be done in the future].

That looks fine. See 1/ and 2/ below.


> 2. I suggest that each test returns either "PASS", "FAIL" or 
> "NOT_APPLICABLE".
>     The overall result will be "IN_COMPLETE", if any of the tests return 
> "NOT_APPLICABLE".
> Here I am assuming that the tests will stay as they are specified in 
> <http://www.w3.org/TR/mobileOK-basic10-tests/>, they only return 
> PASS/FAIL/WARNING (if some information is missing, we will assume 
> that this is because of the difference between the File/URI validation 
> approaches, not because it doesn't exist in the original document).
> 
> For example, in CACHING 
> <http://www.w3.org/TR/mobileOK-basic10-tests/#CACHING>, all the 
> sub-tests assume that we have all the information from the HTTP Header; 
> with this approach, however, we will also check whether we know that 
> information, and if we don't then the test will return "NOT_APPLICABLE".
> 
> With this approach, the major changes will be in the way 
> "PreProcessorResult" is created and also the XSLT files to check if we 
> know the requested HTTP Header information for each test, if we don't 
> then the tests will return "NOT_APPLICABLE".
> 
> What do you think about this? Does this make sense?

It looks good! :)

I have a few comments:

1/ The URI of the file resources in the moki would start with "file://", 
that's probably enough to identify the file vs. http/https case but we 
may need a more dedicated attribute to ease checks in XSLT stylesheets. 
I suggest we try without for the time being.
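
To illustrate what I mean by "try without", here is a minimal sketch of 
the check based on the URI scheme alone. It is written in Python only 
for brevity; the function name is hypothetical and nothing here is 
taken from the Checker's code, where the same test would rather be an 
XPath condition on the URI in the stylesheets.

```python
from urllib.parse import urlsplit


def is_file_resource(uri: str) -> bool:
    """Return True when the resource was retrieved through the file scheme.

    Hypothetical helper: it relies only on the URI scheme, with no
    dedicated attribute in the moki document.
    """
    # urlsplit() normalises the scheme to lower case.
    return urlsplit(uri).scheme == "file"
```

If that check turns out to be too clumsy to repeat in every stylesheet, 
that would be the sign that the dedicated attribute is needed.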


2/ To avoid creating a separate set of XSLT stylesheets for files, 
we might need to produce a fake HTTPResponse in the moki representation 
anyway, one that would actually be the result of the file retrieval.


3/ I think there is a useful distinction to be made between a subtest 
that can't be run because some data is missing, and a subtest that can't 
be run because it doesn't need to, i.e. if there are no objects in the 
page, the OBJECTS_OR_SCRIPT subtests de facto pass. The first 
possibility is what we're talking about. The second possibility may be 
of some use in the future (I'm not suggesting we implement it right 
now). In short, I would rather reserve NOT_APPLICABLE for the second 
case, and use DATA_MISSING (I can't think of a better proposal, but the 
idea is to point out that the moki representation is incomplete) for 
checks on files.


4/ You seem to imply that we'll just ignore entire tests as soon as one 
of their subtests cannot be run. It works with caching, but some other 
subtests could be quite useful (e.g. CHARACTER_ENCODING_SUPPORT-5 that 
checks whether the document is valid UTF-8 would be useful even though 
CHARACTER_ENCODING_SUPPORT-1 cannot be checked). Some other subtests 
could be useful as well even though they can only apply partially (e.g. 
EXTERNAL_RESOURCES-2 is still worth raising when the document targets 
too many external resources, even though the total is not 100% correct 
without HTTP redirections).

In short, the possible outcomes for a subtest (the <result> element in 
the XML report) would be:
  - PASS, WARN, FAIL for subtests that can be run normally.
  - PARTIAL_PASS, PARTIAL_WARN, PARTIAL_FAIL for subtests that can only 
be applied partially.
  - DATA_MISSING for subtests that simply can't be run.
I don't think that creates a real complexity or requires a lot more 
changes, because we'll need to update the corresponding XSLT stylesheets 
and handle these cases anyway.

(Note I had suggested using a different attribute to flag the results 
as "partial", but I think that your approach requires fewer changes and 
actually helps in "seeing" that the outcome is not a definitive response).

The possible outcomes for a test would be:
  - PASS, FAIL for tests that can be completely checked
  - PARTIAL_PASS, PARTIAL_FAIL when there is a PARTIAL_* and/or 
DATA_MISSING in one of the subtests
  - DATA_MISSING when none of the subtests could be run (e.g. for CACHING)

The possible overall outcomes would be:
  - PASS, FAIL when all tests can be completely checked (http/https case)
  - PARTIAL_PASS, PARTIAL_FAIL when there is a PARTIAL_* and/or 
DATA_MISSING in one of the tests
  [- DATA_MISSING is not going to ever occur, since it would mean none 
of the tests could actually be run at all]
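
The roll-up described above could be sketched as follows. This is only 
an illustration in Python, not the Checker's code: the Outcome enum and 
the aggregate() function are hypothetical names. I am assuming that a 
WARN never turns a test into a FAIL, and that a definite FAIL next to a 
DATA_MISSING is reported as PARTIAL_FAIL, which is how I read the lists 
above.

```python
from enum import Enum
from typing import Iterable


class Outcome(Enum):
    PASS = "PASS"
    WARN = "WARN"
    FAIL = "FAIL"
    PARTIAL_PASS = "PARTIAL_PASS"
    PARTIAL_WARN = "PARTIAL_WARN"
    PARTIAL_FAIL = "PARTIAL_FAIL"
    DATA_MISSING = "DATA_MISSING"


# Outcomes that make the parent fail (assumption: warnings do not).
FAILING = {Outcome.FAIL, Outcome.PARTIAL_FAIL}

# Outcomes that mean the parent could not be completely checked.
INCOMPLETE = {
    Outcome.PARTIAL_PASS,
    Outcome.PARTIAL_WARN,
    Outcome.PARTIAL_FAIL,
    Outcome.DATA_MISSING,
}


def aggregate(children: Iterable[Outcome]) -> Outcome:
    """Combine subtest outcomes into a test outcome.

    The same rule is reused to combine test outcomes into the overall
    outcome.
    """
    results = list(children)
    if not results:
        raise ValueError("nothing to aggregate")
    # Nothing could be run at all (e.g. CACHING on a file).
    if all(r is Outcome.DATA_MISSING for r in results):
        return Outcome.DATA_MISSING
    failed = any(r in FAILING for r in results)
    if any(r in INCOMPLETE for r in results):
        return Outcome.PARTIAL_FAIL if failed else Outcome.PARTIAL_PASS
    return Outcome.FAIL if failed else Outcome.PASS
```

With only PASS, WARN and FAIL as input (the http/https case), the 
function can only return PASS or FAIL, so the report stays what it is 
today.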


5/ There is one specific case we need to handle carefully: when the 
primary document is using an HTTP/HTTPS scheme, then we need to return 
an error when one of the linked resources is using the file scheme, 
whereas when the primary document is using the file scheme, we should 
process other file resources.
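
As a rough sketch of that rule (Python for brevity, hypothetical 
function name, and assuming the linked URI has already been resolved to 
an absolute URI):

```python
from urllib.parse import urlsplit


def may_process_linked_resource(primary_uri: str, linked_uri: str) -> bool:
    """Return False when the linked resource must be reported as an error.

    Only the case discussed here is rejected: an http/https primary
    document that links to a file resource. Every other combination is
    left alone in this sketch, which is an assumption on my part.
    """
    primary_scheme = urlsplit(primary_uri).scheme
    linked_scheme = urlsplit(linked_uri).scheme
    if primary_scheme in ("http", "https") and linked_scheme == "file":
        return False
    return True
```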


If we do that correctly, the final mobileOK report returned when the 
Checker is given a real http/https URI will be no different to the one 
that is returned today and that is good! (i.e. no PARTIAL_blah or 
DATA_MISSING in the http/https case).


How does that sound?

Francois.


> 
> Yeliz.
> On 6 Feb 2009, at 13:23, Francois Daoust wrote:
> 
>> I'm starting a new thread on that specific point. I had a quick pass 
>> through the list of subtests implemented in the mobileOK Checker, and 
>> came up with the following list.
>>
>> I note that the notion of Included Resources [1] relies on the value 
>> of the Content-Type HTTP header field for resources extracted from 
>> object elements, so the Included Resource definition cannot be 
>> properly applied to files. Tests that rely on this notion are listed 
>> in the "partial" category below. I think the impact is limited in 
>> practice, because object elements with fallbacks are not that common, 
>> but that still needs to be taken into account.
>>
>> [Side note: this actually triggers a potential future feature where a 
>> test could return a NOT_APPLICABLE outcome. For the moment, when a 
>> page does not contain any script or object element, the 
>> OBJECTS_OR_SCRIPT test returns a PASS, but it looks weird to say 
>> "You've done something correctly" when the sentence should rather read 
>> "nothing to test here"].
>>
>> I may have missed a subtest or two...
>>
>>
>> Subtests that do not apply to files
>> -----
>> AUTO_REFRESH-3
>> AUTO_REFRESH-4
>>
>> CACHING-1
>> CACHING-2
>> CACHING-3
>> CACHING-4
>> CACHING-5
>> CACHING-6
>> CACHING-7
>> CACHING-8
>>
>> CHARACTER_ENCODING_SUPPORT-1
>>
>> CONTENT_FORMAT_SUPPORT-1
>> CONTENT_FORMAT_SUPPORT-2
>> CONTENT_FORMAT_SUPPORT-7
>> CONTENT_FORMAT_SUPPORT-8
>> CONTENT_FORMAT_SUPPORT-10
>>
>> HTTP_RESPONSE-1
>> HTTP_RESPONSE-4
>> HTTP_RESPONSE-5
>> HTTP_RESPONSE-6
>> HTTP_RESPONSE-7 (well, perhaps we could make a parallel with an 
>> "access denied" for files, but I don't think that's needed)
>> HTTP_RESPONSE-11
>> HTTP_RESPONSE-12
>> (note HTTP_RESPONSE-2 and HTTP_RESPONSE-3 simply do not exist anymore, 
>> which explains why they are not in the list...)
>>
>> HTTPS-1
>> HTTPS-2
>> HTTPS-3
>>
>> META_HTTP_EQUIV-1
>> META_HTTP_EQUIV-2
>>
>> LINK_TARGET_FORMAT-1
>> LINK_TARGET_FORMAT-2
>>
>> OBJECTS_OR_SCRIPT-9
>> OBJECTS_OR_SCRIPT-10
>>
>>
>> Subtests that apply partially to files
>> -----
>> HTTP_RESPONSE-8
>> HTTP_RESPONSE-9
>> HTTP_RESPONSE-10
>> -> 404 is equivalent to a missing file
>>
>> CHARACTER_ENCODING_SUPPORT-4
>> -> No way to check encoding declared in HTTP header fields
>>
>> CONTENT_FORMAT_SUPPORT-9
>> -> We can still check whether the resource is a valid JPEG/GIF image
>>
>> CONTENT_FORMAT_SUPPORT-11
>> -> CSS validity can still be checked.
>>
>> EXTERNAL_RESOURCES-2
>> EXTERNAL_RESOURCES-3
>> -> Cannot count intermediary HTTP redirections.
>> -> No way to check the actual content-type of a file resource
>>
>> OBJECTS_OR_SCRIPT-*
>> ->
>> PAGE_SIZE_LIMIT-2
>> -> HTTP Redirects cannot be taken into account.
>> -> Selection of Included Resources is not 100% correct.
>>
>>
>> Francois.
>>
>> [1] http://www.w3.org/TR/mobileOK-basic10-tests/#included_resources
> 
>
Received on Thursday, 12 February 2009 17:43:57 UTC