[Bug 21439] The MAY w.r.t. treating invalid longdesc URLs as text, is harmful. Remove the harm - or remove the MAY.

https://www.w3.org/Bugs/Public/show_bug.cgi?id=21439

--- Comment #7 from Laura Carlson <laura.lee.carlson@gmail.com> ---
(In reply to comment #6)

> The MAY option complicates things for authors. 
> 
>    EXAMPLE:
> 
>       <img src="jpeg.jpg" alt="Text." longdesc="foo bar.html"/>
> 
>    EXPLANATION: 
> 
>    If the longdesc URL happens to contain a space character, then the
> URL would be invalid. And thus, per the current spec text's MAY 
> option, a conforming user agent could choose to present the longdesc 
> attribute's content as the long description itself.
>
>  As a result, some users would be presented with the content of 
> the very longdesc attribute, while users of user agents that do not 
> implement the MAY option, would get the content of the file "foo 
> bar.html".

That is a good point, Leif. Let's not inadvertently introduce anything into the
spec that complicates longdesc for authors. For this to work, a user agent would
need to differentiate between a space in a text string and a non-escaped space
in a longdesc URL (as well as performing other checks to ascertain whether a
link is dead).

Charles, here is an idea: perhaps have the spec incorporate Leif's algorithm
from Comment 4. Say something like: if a user agent is following the MAY
normative repair statement it MUST do Leif's Point 5 First, Second, and Third.

In addition, when a user agent detects a non-escaped space it "MUST" check for
file extensions (e.g., ".html", ".htm", ".pdf", ".txt", ".php", etc.). If a file
extension is found, then consider the value a URL. If no file extension exists
AND no document fragment exists either, have the data URI repair kick in.
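
As a rough sketch of the check described above, and not a proposal for spec
text: the extension list, the function names classify_longdesc and to_data_url,
and the text/plain media type are all assumptions made for illustration. None
of them come from the spec or from Leif's algorithm in Comment 4.

```python
from urllib.parse import quote

# Assumed, incomplete list; the comment only names a few extensions.
KNOWN_EXTENSIONS = (".html", ".htm", ".pdf", ".txt", ".php")


def classify_longdesc(value: str) -> str:
    """Return "url" or "text" for a longdesc attribute value."""
    stripped = value.strip()
    # No non-escaped space: this sketch leaves the value alone as a URL.
    if " " not in stripped:
        return "url"
    # Separate the fragment and query so the extension test sees only the path.
    before_hash, _, fragment = stripped.partition("#")
    path = before_hash.split("?", 1)[0].lower()
    has_extension = path.endswith(KNOWN_EXTENSIONS)
    has_fragment = fragment != ""
    # A file extension or a document fragment means: treat it as a URL.
    if has_extension or has_fragment:
        return "url"
    # Neither is present, so the data URI repair would kick in.
    return "text"


def to_data_url(text: str) -> str:
    """Wrap plain description text in a data URL (assumed media type)."""
    return "data:text/plain;charset=utf-8," + quote(text)
```

With this sketch, "foo bar.html" from Leif's example is classified as a URL,
while "A bar chart of sales by quarter" is classified as text and would be
converted to a data URL. A description that happens to end in a known extension
or contain "#" would still be misclassified, which is the kind of case a real
algorithm would have to settle.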

User agents programmatically detecting all of this *correctly* could certainly
help longdesc. If they get it wrong it could/would be a disaster.

To do it right will require a solid UA repair algorithm. Right now we don't
have that and a lot can and would be overlooked if the spec is left as vague as
it is currently. 

Chaals, if you would rather not provide a repair algorithm in this spec,
seriously consider removing the following normative [1] and informative [2]
statements entirely from the longdesc spec and getting UAAG to add an
algorithm. Then you could add something like "If a longdesc attribute has
invalid content, user agents MAY make that content available to the user. If
they do, then they MUST follow the algorithm as detailed in UAAG."

I would love to see a good algorithm in this spec or in UAAG. But until that
happens the longdesc spec is better off without [1] and [2].

[1] "If a longdesc attribute has invalid content, user agents MAY make that
content available to the user. This is because a common authoring error is to
include the text of a description, instead of the URL of a description, as the
value of the attribute."

[2] "One of the most common mistakes authors make that is easily repaired by
user agents is to use a description, instead of a URL that links to a
description. This means there is often plain text description in the content of
an invalid longdesc attribute. Converting such attributes to data URLs is a
simple repair strategy that can help recover from cases where authors have made
this mistake."

Received on Wednesday, 24 April 2013 11:41:25 UTC