ISSUE-66 Change Proposal: be more explicit about potential repair techniques


The spec is very vague about what image analysis techniques could be 
applied to images. This change proposal suggests including more detail 
about possible techniques.


Currently the <img> element section mentions that UAs "may also apply 
heuristics to help the user make use of the image when the user is unable 
to see it", but the only suggested heuristic is OCR.

In practice, there are a host of other heuristics that could help a user 
make sense of an image, and they might be useful even to users who _can_ 
see the image. We do all users a disservice by not being more explicit 
here. Being explicit could encourage significant competition amongst user 
agents, leading to a much better user experience for everyone.

Since these heuristics are in many cases already implemented and shipping, 
sometimes in multiple products from multiple vendors, and since recent 
advances in image recognition techniques have been fast and furious, it 
seems reasonable to mention these techniques as real possibilities.


Strike "when the user is unable to see it". Instead, start a new sentence 
before the "e.g.", which says "This would be especially useful to users who 
cannot see the image", and add the following after the "e.g." clauses, in 
a separate clause: "but it could also be useful to users who _can_ see the 
image, but might not fully understand or recognise it".

Move "optical character recognition (OCR) of text found within the image" 
to be the first bullet of a bulleted list, and add the following 
additional points:

   * Facial recognition in photographs, especially facial recognition of 
     notable individuals or of individuals in the user's social network.

   * Product or brand recognition in photographs or logos.

   * Barcode recognition of any embedded barcodes.

   * Bitmap to vector analysis for diagrams, allowing images to be 
     further analysed in specialised tools.

   * Data extraction for graphs, allowing data to be reconstructed from 
     bar charts, pie charts, and the like, or allowing regression lines 
     to be fitted to x,y plots.

   * Landmark recognition for photographs.

   * 3D reconstruction of scenes based on multiple images, allowing a set 
     of images to be taken together and explored in context.
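
As an illustration only (not proposed spec text), the regression-line case in 
the graph bullet above can be sketched as follows. It assumes the user agent 
has already recovered (x, y) data points from the plot by some other means; 
the function name fit_line and the sample points are hypothetical, and 
ordinary least squares is just one possible fitting method.

```python
def fit_line(points):
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares.

    `points` stands in for data a user agent might have extracted from an
    x,y plot; the extraction step itself is assumed and not shown here.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")

    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Sum of squared deviations in x, and of cross-deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points read off a scatter plot.
extracted = [(0, 1), (1, 3), (2, 5), (3, 7)]
slope, intercept = fit_line(extracted)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

A user agent could then overlay the fitted line on the image or report the 
slope and intercept in text.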



Adding such text could lead to a renewed level of competition in browsers 
as they find the best ways to expose such tools to users.

Such competition would inevitably lead to improved accessibility across 
the board, as many of these analysis techniques could provide users with 
anything from a basic hint of the image's contents to fully-interactive 
reconstructions of the image in more accessible forms (especially in the 
case of text-in-image or graphs).


Makes the spec longer.




It is suggested that mentioning that user agents might be able to repair 
non-conforming pages could make authors less likely to write conforming 
pages, though it is not clear why this would apply here and not in the 
many other parts of the spec that mention repair techniques, especially 
the sections that explicitly mandate specific user agent repair 

Ian Hickson
Things that are impossible just take longer.

Received on Tuesday, 9 March 2010 10:45:14 UTC