On the way to v1.0: some new tests/bugs/fixes/questions

Hi guys,

I spent some time testing/debugging/fixing/playing with the checker.
There were a few bugs that definitely needed to be fixed before I can 
update the online checker.

I haven't listed the bugs in Bugzilla, as I only have an intermittent 
modem-like Internet connection for the time being (this allowed me to 
find a rather amusing bug btw, see the last bug mentioned below). I 
fixed the most important bugs. The remaining ones are minor enough, I 
think, to be addressed later on, although some of them would probably 
be better fixed as soon as possible.

Below is the report of my investigations.

Topics:
- A few remaining questions to start with
- Main changes
- Tests added/completed
- Bugs fixed
- Bugs not fixed yet


Questions that could trigger some more bugs
-------------------------------------------
1/ Should the HTTP response returned for the resource under test be 
counted in EXTERNAL_RESOURCES? The doc seems to say it should. I undid 
the change Dom made 3 weeks ago not to count it. Is that correct? On the 
one hand, that means that we only have 9 'slots' for external resources. 
On the other hand, it seems a bit strange to count the redirections to 
the primary resource, but not the last HTTP Response as an external 
resource.

2/ Suppose imgA is an image served with caching directives that say that 
the image should not be cached. If the page defines:
   <img src="imgA" alt="imgA" />
   <img src="imgA" alt="imgA" />
... should imgA be counted twice in EXTERNAL_RESOURCES and 
PAGE_SIZE_LIMIT? If so, then I think it is not handled by the checker 
right now (minor bug I would say, but not trivial to fix).

3/ Hypothetical case:
   <img src="imgA" alt="imgA" />
   <img src="imgB" alt="imgB" />
... Let's suppose a request on imgA yields a redirect to imgB (I know it 
probably never makes any sense, but that is possible). The redirect 
should be counted in EXTERNAL_RESOURCES and PAGE_SIZE_LIMIT. But the 
HTTP response on imgB should be counted only once in EXTERNAL_RESOURCES 
and PAGE_SIZE_LIMIT. Am I correct? If so, then there is another minor 
bug here, not trivial to fix.

4/ I haven't had time to check that yet: are MIME types case-sensitive?
   <object data="imgA" type="image/gif" />
   <object data="imgA" type="IMAGE/GIF" />
(same question with Content-Type headers)
Looking at HTTPObjectResource.java, I note that the Content-Type header 
is checked in a case-insensitive way, whereas the type attribute is not.

5/ Checks on type attributes and Content-Type headers are made using 
"startsWith". I suppose one may complete the definition of the 
Content-Type with charset details and the like, but that also means 
that: "image/gifted" will match "image/gif". Was the loose check done on 
purpose?

6/ If I understand things correctly, there's a bug in the 
STYLE_SHEETS_USE-4 subtest. See below. But that may be because I didn't 
understand the definition of STYLE_SHEETS_USE-4 correctly.
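
On questions 4 and 5: as far as I know, HTTP/1.1 (RFC 2616, section 3.7) 
treats the type, subtype and parameter names of a media type as 
case-insensitive. A comparison stricter than "startsWith" could look like 
the Java sketch below. The class and method names are hypothetical, and 
this is not the code currently in HTTPObjectResource.java.

```java
import java.util.Locale;

// Hypothetical helper, not part of the checker.
class MediaTypeSketch {

    /** Returns the lower-cased "type/subtype" part, with any parameters stripped. */
    static String essence(String value) {
        if (value == null) {
            return null;
        }
        int semicolon = value.indexOf(';');
        String bare = semicolon >= 0 ? value.substring(0, semicolon) : value;
        return bare.trim().toLowerCase(Locale.ROOT);
    }

    /** Exact, case-insensitive match on type/subtype; parameters such as charset are ignored. */
    static boolean matches(String value, String expected) {
        String actual = essence(value);
        return actual != null && actual.equals(essence(expected));
    }
}
```

With this, "IMAGE/GIF" and "image/gif; charset=utf-8" both match 
"image/gif", whereas "image/gifted" does not. The same helper could be 
applied to the type attribute and to the Content-Type header.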


Main changes
------------
- I removed most of the code Dom added a few weeks ago in 
includedResources.xsl because, unless I forgot something, changes made 
to the moki by Abel and Miguel make it unnecessary to go through the 
HTML code once again to count the external resources and compute the 
total page size (except that caching directives were and are still not 
properly taken into account, I think; see one of the questions above).
- The recursive parsing of object elements in HTTPXHTMLResource.java 
was incorrect (some objects appeared twice and the code could crash in 
some cases).
- A small 'or' in a check where an 'and' was required made all the 
images trigger a WARN on their Content-Type, whether or not their 
Content-Type was valid.
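
The 'or'/'and' mix-up in the last item is easy to reproduce. The real check 
is written in XSLT and its exact expression is not quoted here, so the 
following is only a hypothetical Java equivalent, assuming the accepted 
types are image/gif and image/jpeg:

```java
// Hypothetical Java equivalent of the XSLT condition, for illustration only.
class ContentTypeWarnSketch {

    /** Buggy form: every value differs from at least one of the two types, so this is always true. */
    static boolean warnBuggy(String contentType) {
        return !"image/gif".equals(contentType) || !"image/jpeg".equals(contentType);
    }

    /** Fixed form: warn only when the value is neither of the accepted types. */
    static boolean warnFixed(String contentType) {
        return !"image/gif".equals(contentType) && !"image/jpeg".equals(contentType);
    }
}
```

With the buggy form, a perfectly valid image/gif still produces a WARN. The 
fixed form only warns for other types such as image/png.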


Tests
-----
I updated the moki.xml and testresults.xml where appropriate.
I think I haven't committed one or two moki updates, but it's hard to 
play with CVS when your internet connection goes up and down all the time :(
I updated/completed the test suite to match new rules:
- MeasuresTest/6: test on CSS case-insensitivity and spaces
- MeasuresTest/7: test on CSS Level 1 properties
- StyleSheetsSupportTest/6: test on CSS case-insensitivity and spaces
- StyleSheetsUseTest/6: test on deprecated elements
- StyleSheetsUseTest/7: test on style-like elements
- StyleSheetsUseTest/8: test on unknown properties/values in CSS
- ObjectsOrScriptTest/2: updated to use GIF instead of PNG so as not to 
trigger side errors.
- ObjectsOrScriptTest/3: updated to use an object as fallback as well.
- ExternalResourcesTest/4: updated to use the same object 11 times, to 
ensure it is counted only once.
- ExternalResourcesTest/5: use 20 different objects, where only the 
first one is rendered.
- ExternalResourcesTest/6: 10 different images + the resource under test 
should trigger a WARN.
- ExternalResourcesTest/7: 11 different images defined in the CSS Style 
should trigger a WARN.
- ExternalResourcesTest/8: check CSS stylesheets are correctly counted
- ExternalResourcesTest/9: check CSS stylesheets are correctly counted
- ExternalResourcesTest/10: 9 images among which one is defined twice + 
the resource under test should PASS
- ExternalResourcesTest/11: count only objects with no type and those 
whose content-type is "image/jpeg" or "image/gif"
- ExternalResourcesTest/12: count all objects with no type and all of 
those whose content-type is "image/jpeg" or "image/gif"
- StyleSheetsUseTest/9: test on STYLE_SHEETS_USE-4 that should apply 
globally
- StyleSheetsUseTest/10: test on STYLE_SHEETS_USE-4 that should apply 
globally
- ContentFormatSupportTest/17: test on a PNG image embedded in the CSS. 
Test on CSS spaces and case-insensitivity
- ContentFormatSupportTest/18: test on an external style sheet that is 
an XHTML page
- ContentFormatSupportTest/19: test on an external style sheet not 
served as text/css
- ContentFormatSupportTest/20: test on invalid GIF images defined as an 
img element and as an object
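
The counting that ExternalResourcesTest/4, /6 and /10 expect can be 
summarised with a small Java sketch. Everything in it is hypothetical: the 
class and method names are made up, the WARN limit of 10 is inferred from 
the test descriptions above, and the FAIL limit of 20 is an assumption. 
Caching directives are deliberately not modelled.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the EXTERNAL_RESOURCES count, not the checker's XSLT.
class ExternalResourcesSketch {

    enum Outcome { PASS, WARN, FAIL }

    static final int WARN_LIMIT = 10; // inferred from ExternalResourcesTest/6 and /10
    static final int FAIL_LIMIT = 20; // assumption

    /**
     * Counts the redirects leading to the resource under test, the final
     * response for that resource, and each included URI once.
     */
    static int count(int redirectsToMainResource, List<String> includedUris) {
        Set<String> unique = new HashSet<>(includedUris);
        return redirectsToMainResource + 1 + unique.size();
    }

    static Outcome outcome(int total) {
        if (total > FAIL_LIMIT) {
            return Outcome.FAIL;
        }
        if (total > WARN_LIMIT) {
            return Outcome.WARN;
        }
        return Outcome.PASS;
    }
}
```

Under these assumptions, 10 different images plus the resource under test 
give a total of 11 (WARN), while 9 different images with one of them 
referenced twice give a total of 10 (PASS).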

Tests that still fail, because the underlying bugs/questions are not fixed:
  CONTENT_FORMAT_SUPPORT 18
  CONTENT_FORMAT_SUPPORT 20
  OBJECTS_OR_SCRIPT 4
  OBJECTS_OR_SCRIPT 5
  STYLE_SHEETS_USE 9
  STYLE_SHEETS_USE 10
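
STYLE_SHEETS_USE 9 fails because of how STYLE_SHEETS_USE-4 is scoped (see 
the corresponding item in the last section). The global reading that the 
test expects could be sketched in Java as follows. This is a simplified 
model with hypothetical names: each style element is reduced to a list 
holding, for each of its rules, the single @media type that encloses it, 
or an empty string when the rule is not inside any @media block. The media 
attribute case of STYLE_SHEETS_USE 10 is not modelled.

```java
import java.util.List;

// Hypothetical, simplified model of a "global" STYLE_SHEETS_USE-4 check.
class StyleSheetsUse4Sketch {

    /** A rule reaches handheld devices if it has no @media, or @media handheld/all. */
    static boolean appliesToHandheld(String media) {
        return media.isEmpty()
                || media.equalsIgnoreCase("handheld")
                || media.equalsIgnoreCase("all");
    }

    /**
     * Warns only if there is at least one rule and every rule, across all
     * style elements taken together, is restricted to some other media type.
     */
    static boolean shouldWarn(List<List<String>> styleElements) {
        boolean sawRule = false;
        for (List<String> element : styleElements) {
            for (String media : element) {
                sawRule = true;
                if (appliesToHandheld(media)) {
                    return false;
                }
            }
        }
        return sawRule;
    }
}
```

A page with one unrestricted rule in a first style element and one 
"@media tv" rule in a second one does not warn under this reading, whereas 
a page whose only rule is inside "@media tv" does.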


Bugs fixed
----------
- ContentFormatSupport: images embedded in CSS stylesheets were not 
checked anymore
   see test CONTENT_FORMAT_SUPPORT 17
   -> fixed in ContentFormatSupportTest.xsl by parsing the new 
structures directly

- Measures: the following valid CSS style should trigger a FAIL on '12pt'
   "font: normal small-caps 1ex/12pt sans-serif"
   see test MEASURES 7
   -> fixed in MeasuresTest.java with a new regex

- ObjectsOrScript: check for OBJECTS_OR_SCRIPT-10 is invalid. All images 
return a WARN, no matter their Content-Type.
   -> fixed in ObjectsOrScriptTest.xsl: 'or' was used instead of 'and' 
in an if

- ObjectsOrScript: the checker crashes on ObjectsOrScriptTest/3.
   Incorrect retrieval of children objects
   -> fixed in HTTPXHTMLResource.java: getTopLevelObjects updated
   -> fixed in ContentFormatSupportTest.xsl: check only first object 
type that matches a given URI
   -> fixed in includedResources.xsl: check only first object type that 
matches a given URI

- ExternalResources: the main response to the resource under test is not 
included in the count (see ExternalResourcesTest/6)
   -> undid Dom's change in ExternalResourcesTest.xsl

- ExternalResources: CSS images are not included in the count
   see test EXTERNAL_RESOURCES 7
   -> fixed in ExternalResourcesTest.xsl and includedResources.xsl

- ExternalResources: an image that appears twice may be counted twice
   see test EXTERNAL_RESOURCES 10
   -> fixed in ExternalResourcesTest.xsl and includedResources.xsl

- ExternalResources: image/png objects defined without a type attribute 
are flagged as SKIPPED instead of TASTED
   -> fixed in HTTPObjectResource.java: markupContentType may be null. 
Checks rewritten in a more logical way, inconsistency flagged so that 
someone reading the code knows it's done on purpose.

- ExternalResources: an object whose content type is image/jpeg or 
image/gif should be flagged as RENDERED, no matter if its type attribute 
says otherwise
   -> fixed in HTTPObjectResource.java: markupContentType may be null. 
Checks rewritten in a more logical way, inconsistency flagged so that 
someone reading the code knows it's done on purpose.
   -> fixed in HTTPXHTMLResource.java: null is different from empty when 
parsing the type attribute

- PageSizeLimit: (same bugs as above)
   -> similar fixes applied to PageSizeLimitTest.xsl

- PageSizeLimit: stylesheets were not taken into account
   -> PageSizeLimitTest.xsl: missing '/' appended

- OBJECTS_OR_SCRIPT-10: message ends with a comma
   -> messages.properties.xml: comma removed

+ I fixed the remaining CSS case-insensitivity and spaces case in 
CSSUtils.java
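
For the Measures item above, here is a rough Java sketch of a pattern that 
catches an absolute length even when it follows a slash, as in the font 
shorthand. It is not the regex committed to MeasuresTest.java. The class 
name is hypothetical and the list of units to flag (pt, px, pc, in, cm, mm) 
is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the pattern used in MeasuresTest.java.
class AbsoluteUnitsSketch {

    // A number followed by an absolute unit. The lookbehind avoids starting
    // in the middle of a word, a number or a hex colour.
    private static final Pattern ABSOLUTE_LENGTH = Pattern.compile(
            "(?<![\\w.#])\\d*\\.?\\d+(?:pt|px|pc|in|cm|mm)\\b",
            Pattern.CASE_INSENSITIVE);

    /** Returns every absolute length found in a CSS declaration. */
    static List<String> find(String declaration) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = ABSOLUTE_LENGTH.matcher(declaration);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }
}
```

Applied to "font: normal small-caps 1ex/12pt sans-serif", find returns only 
"12pt": the "1ex" part is relative and is left alone.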


New bugs (not fixed yet)
------------------------
- ObjectsOrScript: the checker applies some of the tests to more than 
just "Included Resources": OBJECTS_OR_SCRIPT-5 and OBJECTS_OR_SCRIPT-9 
are applied to all objects, whereas OBJECTS_OR_SCRIPT-6, 
OBJECTS_OR_SCRIPT-8, and OBJECTS_OR_SCRIPT-10 seem not to be
   see tests OBJECTS_OR_SCRIPT 4 and OBJECTS_OR_SCRIPT 5

- ImagesSpecifySize: the checker checks all images, even those that are 
not "Included Resources"
   see tests OBJECTS_OR_SCRIPT 4 and OBJECTS_OR_SCRIPT 5

- ExternalResources: caching directives are not taken into account (an 
image is counted only once, even if it's served with a no-cache directive)

- StyleSheetsUse: STYLE_SHEETS_USE-4 is applied to each style element.
   see test STYLE_SHEETS_USE 9
   I interpret "If all styles are restricted to presentation media types 
other than "handheld" or "all" by means of @media at-rules, warn" as 
being global, i.e. when all styles are taken together, if they are all 
restricted to presentation media other than "handheld" or "all" by means 
of @media at-rules, warn.
   In short, the following (defined in the same page) should not trigger 
any WARN IMO, because the first style element contains some style rule 
that applies to all presentations:
    <style type="text/css">body { color:green; }</style>
    <style type="text/css">@media tv { body { color: red; } }</style>

- StyleSheetsUse: STYLE_SHEETS_USE-4 is applied on style elements 
restricted to a media type different from "handheld" or "all" via the 
media attribute.
   see test STYLE_SHEETS_USE 10
   In short, the following (stupid) code should not trigger any WARN 
IMO, because the style is already restricted via the media attribute:
    <style type="text/css" media="tv">@media tv { body { color: red; } }</style>

- ContentFormatSupport: the CSS parser crashes when the external 
stylesheet referenced by the page is the XHTML page
   see test CONTENT_FORMAT_SUPPORT 18

- ContentFormatSupport: there is no info on the validity of objects 
within the moki. There is thus no way to check that an image defined as 
an object is a valid GIF/JPEG image.
   see test CONTENT_FORMAT_SUPPORT 20

- Using the checker without a running Internet connection has some weird 
consequences.
   To reproduce: run OBJECTS_OR_SCRIPT 1 without an Internet connection.
   The code defines a dummy URI at example.org when javascript links are 
detected. I'm not sure why. What is sure is that without an 
up-and-running Internet Connection, example.org cannot be resolved, and 
this yields the following error message:
   <test name="LINK_TARGET_FORMAT" outcome="FAIL">
    <result name="LINK_TARGET_FORMAT-1" outcome="WARN">
     <info>WARN: The linked resource 
http://example.org/#javascript%3Aalert%28%27javascript%3A+scripting%27%29%3B 
is in a format ("") that may not be appropriate for a mobile device</info>
    </result>
    <result name="HTTP_RESPONSE-1" outcome="FAIL">
     <info>FAIL: The request to the resource 
http://example.org/#javascript%3Aalert%28%27javascript%3A+scripting%27%29%3B 
does not result in a valid HTTP response (because of network-level 
error, DNS resolution error, or non-HTTP response) </info>
    </result>
   </test>
   -> Can we get rid of this hack somehow?
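
One possible direction, as a Java sketch only: filter links by scheme 
before any request is made, so that javascript: links never need a dummy 
URI. I don't know why the dummy URI was introduced, so this assumes such 
links can simply be skipped by LINK_TARGET_FORMAT and HTTP_RESPONSE, which 
may not be what we want. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of the checker.
class LinkSchemeSketch {

    // A URI scheme: a letter followed by letters, digits, "+", "." or "-", then ":".
    private static final Pattern SCHEME = Pattern.compile("^([A-Za-z][A-Za-z0-9+.-]*):");

    /** True for relative references and http/https links, false for javascript:, mailto:, etc. */
    static boolean isFetchable(String href) {
        if (href == null) {
            return false;
        }
        Matcher matcher = SCHEME.matcher(href.trim());
        if (!matcher.find()) {
            return true; // no scheme: relative reference, resolved against the page URI
        }
        String scheme = matcher.group(1).toLowerCase(Locale.ROOT);
        return scheme.equals("http") || scheme.equals("https");
    }
}
```

With such a filter, a link like javascript:alert('javascript: scripting'); 
would be reported as not fetchable and no request to example.org would be 
attempted.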


Francois.

Received on Sunday, 13 July 2008 17:16:07 UTC