[Bug 21439] Update the MAY-section with algorithm for how to turn invalid URL into text

https://www.w3.org/Bugs/Public/show_bug.cgi?id=21439

--- Comment #4 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> ---
(In reply to comment #3)

Executive summary:  As even invalid URLs are URLs, the proposed script is
flawed. The longer explanation is that the script demonstrates the technical
and conceptual flaws of the permission to interpret @longdesc as text:

1) It wrongly assumes that invalid URLs do not work.
   EXAMPLE: <a href="the invalid URL">This link works</a>!

2) It would cause invalid URLs to be presented to users as text.
   EXAMPLE: longdesc="http://example.org/the invalid URL"
   would become <body>http://example.org/the invalid URL</body>

3) It would cause working (a.k.a. non-dead), invalid URLs to
   stop working. EXAMPLE:
   a) iCab and Opera today repair longdesc URLs that are invalid
      due to space characters.[1] Thus, that class of invalid
      longdesc URLs nevertheless works well in said browsers.
            [1] Code: longdesc="the invalid URL"
   b) But if the MAY option/the script is applied, then, instead
      of repairing the URL and serving the file [2] to the user,
            [2] Filename: 'the invalid URL.html'
   c) the user would be served a piece of content, whose body
      would say <body>the invalid URL</body>.
      Which would be a loss/degradation of content.

4) It would be a layer violation: An invalid URL should, if anything,
   be repaired in such a way that it becomes a valid URL. Invalid
   URLs can be - and commonly are - repaired via simple repair
   techniques that almost every Web-oriented URL consumer performs.

5) It draws attention away from the real issue: dead URLs.
   Your script says: "//assumes some URL validating function."
   However, as shown above, URL *validation* would not be enough.
   To be sure that the "repair" did not in fact *destroy* things
   for the user, the UAs would have to run the following steps:

     First: Repair the URL (if necessary). (UAs already do this.)
    Second: Test whether the repaired URL is dead or working.
            That is: Do some sniffing. (Which in turn implies
            that we would have to step into the *formats issue*
            again, no?)
     Third: If dead, then check whether the content is likely to
            make sense as text. (E.g., if the @longdesc content
            begins with the string "http://", then the content
            is unlikely to be useful as text.)
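
To make the three steps concrete, here is a rough sketch in Python. It is
only an illustration under stated assumptions, not what any UA actually
does: the names repair_url, looks_like_text and resolve_longdesc are
hypothetical, the repair is limited to percent-encoding characters such
as spaces, and the "is it dead?" test is passed in as a callable because
how to sniff that is exactly the open question.

```python
from urllib.parse import quote, urljoin


def repair_url(raw: str, base: str) -> str:
    """First step: repair the reference, then resolve it against base.

    Assumption: "repair" here only means percent-encoding characters
    that are not allowed in a URL (e.g. spaces). URL delimiters and
    existing percent-escapes are left untouched.
    """
    repaired = quote(raw.strip(), safe=":/?#[]@!$&'()*+,;=%~")
    return urljoin(base, repaired)


def looks_like_text(raw: str) -> bool:
    """Third step heuristic: content that begins with "http://" is
    unlikely to be useful as text."""
    return not raw.strip().lower().startswith("http://")


def resolve_longdesc(raw: str, base: str, is_dead):
    """Run the three steps in order.

    is_dead is a hypothetical callable (URL -> bool) standing in for
    the second step, i.e. whatever sniffing the UA would perform.
    """
    url = repair_url(raw, base)
    if not is_dead(url):
        return ("url", url)    # working URL: follow it as usual
    if looks_like_text(raw):
        return ("text", raw)   # dead, but plausible as a description
    return ("dead", url)       # dead, and not useful as text either


if __name__ == "__main__":
    base = "http://example.org/"
    print(resolve_longdesc("the invalid URL", base, lambda u: False))
    print(resolve_longdesc("the invalid URL", base, lambda u: True))
```

Note that the text fallback is only reached after the repaired URL has
been found dead, so a working but invalid URL like [1] is never turned
into text.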

I am glad that you created this script, as it allowed me to understand that the
idea (at least as currently formulated) is flawed.

Received on Tuesday, 23 April 2013 00:41:15 UTC