- From: Smylers <Smylers@stripey.com>
- Date: Sat, 3 May 2008 14:44:02 +0100
- To: "public-html@w3.org" <public-html@w3.org>
Olivier GENDRIN writes:

> On Sat, May 3, 2008 at 11:11 AM, Smylers <Smylers@stripey.com> wrote:
>
> > Daniel Glazman writes:
> >
> > > 1. making alt optional in HTML 5 is ridiculous
> >
> > I don't think that's really an argument.  But if it is then I'm
> > going to rebut it with:
> >
> >   Making alt compulsory in all circumstances is ridiculous.
> >
> > In particular, it doesn't make sense to mandate that the HTML author
> > provides alt text for an image she doesn't know what it is.
>
> As it doesn't make sens to mandate the HTML author will respect the
> rest of the spec...

True.  But in all _other_ cases where a webpage is invalid we can state how the author should have marked up the content available to her to produce something that is valid.

Whereas in the case where the author doesn't know the image she's including there's nothing she can do.  Even an author who wants to meet the spec wouldn't be able to do so if it mandated unknown alt text.

> We could also make conformance optional, and in fact, it is optional,
> as far UA have non-conformance handling process.

There are some spec violations which the spec defines how browsers should handle.  There are other spec violations which produce syntactically valid HTML (such as using <h1> just to make text bigger, or using alt="" on an important image) which a browser won't correct.

> > > 3. when I read something like "When the alt attribute is missing,
> > >    the image represents a key part of the content.  Non-visual
> > >    user agents should apply image analysis heuristics to help the
> > >    user make sense of the image.", I can't believe my eyes...
> >
> > Why?  That sounds entirely plausible to me.
>
> Because if the author is not aware of @alt, he won't use it for
> content images nor for illustration images.

That's true.  An author can't use something he isn't aware of.
But I fail to see how an author could read the part of the spec which explains the narrow circumstances in which it is accepted that no plausible alt attribute can be provided, without thereby being aware of the alt attribute.  Clearly authors in ignorance of the spec are likely not to meet its requirements.

> So the image is more likely not to be a 'key part of the content'.

OK.  Consider how a text-only browser should treat each of the following:

* an image for which it has no alt text but which it knows to be significant content

* an image for which it has no alt text and which might be significant content or might not

What would you suggest it do differently with the second?  Given that it might be an instance of the first, it can't ignore it entirely.

> And if image analysis heuristics was performant, use of CAPTCHA will
> be abandonned.

That does appear to be happening:

  http://www.theregister.co.uk/2008/04/14/msn_captcha_breaking/

It might be that the only heuristic is to say '[image]', or give its filename or dimensions.

In this unfortunate situation, where we have to synthesize alternative content without a human seeing the image, there are three places where it could be done:

1 It could be left up to HTML authors to come up with _something_; the spec says it doesn't matter what, so long as there is an alternative.

2 The spec could mandate a formula for specific alt text, such as "[unknown image 000_0372.JPG 816 x 616 px]".

3 It could be left up to developers of user-agents that don't display images to work out the most appropriate behaviour for their users (possibly providing configuration options for users to pick for themselves).

Option 2 is a poor choice because it doesn't keep pace with technology, and it prevents browser developers from innovating better ways of synthesizing unknown alt text.
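
As a rough illustration of the '[image]' / filename / dimensions fallback mentioned above, here is a minimal Python sketch of what a user-agent that doesn't display images might do when alt is missing.  The function name, its parameters and the exact output format are assumptions made for illustration; nothing in the HTML 5 draft defines them.

```python
from typing import Optional


def synthesize_alt(src: Optional[str] = None,
                   width: Optional[int] = None,
                   height: Optional[int] = None) -> str:
    """Build placeholder text for an image that has no alt attribute.

    Hypothetical helper: the name, parameters and output format are
    illustrative assumptions, not anything the spec defines.
    """
    parts = ["image"]
    if src:
        # Keep only the last path segment, e.g. "000_0372.JPG".
        name = src.rstrip("/").rsplit("/", 1)[-1]
        if name:
            parts.append(name)
    if width is not None and height is not None:
        parts.append(f"{width} x {height} px")
    # With nothing known about the image this is just "[image]".
    return "[" + " ".join(parts) + "]"
```

Calling synthesize_alt() with no arguments gives the bare '[image]' marker; passing a source path and dimensions adds whatever detail is known.  A real user-agent could swap in anything better (including image analysis) without pages having to change.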
Option 3 is superior to option 1 because there is more incentive for developers (and users) to get this right than there is for authors to do so; the decision is being made by those who compete on this stuff, whose vested interest is in creating the best possible user experience for those without images.

> And the content would be tainted by the result of the image analysis.
> A single image can have thousands of meanings, which one will choose
> the image analysis? Will it have also to analyse the context to guess
> a probable meaning ?

Sure, that's unfortunate.  But we're in a situation where that data simply doesn't exist; it's far from ideal, but it happens.  The HTML 5 spec can't magic that data out of nowhere, so heuristics are the best that can be done.

The only questions are where's the best place to perform those heuristics, and should pages which require such heuristics be deemed valid webpages?

> > > 4. basing the spec'd definition of alt on common practice on the
> > >    web is crazy, absolutely crazy.
> >
> > I agree that would be a poor choice, since alt is so often used
> > badly (or omitted when it should be provided).  But I don't think
> > HTML 5 _is_ doing that.  Many existing web pages won't be valid HTML
> > 5 specifically because they _don't_ provide alt text.
>
> Are we writing the spec to make 75% of the existing tagsoup webpage
> conformant ?

I don't understand how your question relates to the quoted text above it; I tried to say that I think HTML 5 (as currently written, with alt being omitted in a few defined cases) will deem many existing webpages to be invalid, specifically because of missing alt text.

Widening the scope of HTML to include the case of serving unknown images has no impact on the squillions of pages which currently omit alt text for other reasons.
However, looking at other differences between HTML 5 and HTML 4 (not related to alt text), there does seem to be a move to make more tag-soup pages valid HTML 5 than are valid HTML 4.

This seems to be for syntax which is unambiguous and which browsers have to interpret interoperably anyway (and in many cases, already do) -- situations in which changing the syntax to that demanded by HTML 4 would make no practical difference to how the page is interpreted; the only reason for doing so would be to obey the standard.  But that's circular: if the standard drops the requirement then there's no reason at all to do it.

I reckon it's a definite improvement to focus efforts on asking authors to change things which _do_ make a difference, rather than just on hoop-jumping.

That so many existing pages are tag-soup suggests that conforming with HTML 4 was too hard (authors tried and failed), or too unnatural (its demands were counter to authors' instincts), or unnecessary for interoperability (authors got the required output without pandering to them).

Smylers
Received on Saturday, 3 May 2008 13:44:58 UTC