Suggestion for Change in HTML Validator

From: David M Abrahamson (david.abrahamson@cs.tcd.ie)
Date: Mon, Jan 08 2001

    Message-Id: <v03130300b67f28c58485@[134.226.35.13]>
    Date: Mon, 8 Jan 2001 08:22:38 +0000
    To: www-validator@w3.org
    From: David M Abrahamson <david.abrahamson@cs.tcd.ie>
    Subject: Suggestion for Change in HTML Validator
    
    Hello there,
    
    I recently came across a commercial web site that uses HREFs of the
    form <A HREF="url:http://www.xyz.com">.  Internet Explorer follows such
    links but Netscape does not.  When I suggested to the site's owner that
    his page was broken, he told me that it validates okay.
    
    Quoting from RFC 2396, "Uniform Resource Identifiers (URI): Generic
    Syntax" (Berners-Lee et al., August 1998):
    
     3. URI Syntactic Components
    
        The URI syntax is dependent upon the scheme.  In general, absolute
        URI are written as follows:
    
           <scheme>:<scheme-specific-part>
    
        An absolute URI contains the name of the scheme being used (<scheme>)
        followed by a colon (":") and then a string (the <scheme-specific-
        part>) whose interpretation depends on the scheme.
    
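    Read this way, the scheme in "url:http://www.xyz.com" is "url" (everything
    before the first colon, since a scheme name cannot itself contain a
    colon), and "url" is not a defined URI scheme.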
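
    As a rough illustration of the split described in the excerpt above, here
    is a minimal Python sketch.  The function name split_scheme is
    hypothetical, and the pattern assumes the scheme grammar from RFC 2396
    section 3.1 (a letter followed by letters, digits, "+", "-" or ".").  It
    only separates the two parts; it does not check whether the scheme is one
    that anybody has defined.

```python
import re

# Assumed from RFC 2396 section 3.1:
#   scheme = alpha *( alpha | digit | "+" | "-" | "." )
# A colon is not allowed inside the scheme, so the match stops at the first one.
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(.*)$", re.DOTALL)


def split_scheme(uri):
    """Return (scheme, scheme_specific_part), or (None, uri) if there is no scheme."""
    match = SCHEME_RE.match(uri)
    if match is None:
        return None, uri
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # The HREF value from the page in question.
    print(split_scheme("url:http://www.xyz.com"))
    # An ordinary absolute URI for comparison.
    print(split_scheme("http://www.xyz.com"))
```

    For the HREF in question this yields "url" as the scheme and
    "http://www.xyz.com" as the scheme-specific part.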
    
    Once again, quoting from RFC 2396:
    
     E. Recommendations for Delimiting URI in Context
    
        ...
    
        In practice, URI are delimited in a variety of ways, but usually
        within double-quotes "http://test.com/", angle brackets
        <http://test.com/>, or just using whitespace ...
    
        These wrappers do not form part of the URI.
    
        ...
    
        For robustness, software that accepts user-typed URI should attempt
        to recognize and strip both delimiters and embedded whitespace.
    
    Since the URI bracketing in an HREF attribute value is done using "..." or
    '...', not "URL:..." or 'url:...', I cannot see any requirement on an HTML
    browser to find and strip out the "url:".
    
    Perhaps you might like to modify the validator to spot this form of error?
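
    To make the suggestion concrete, the check could look something like the
    Python sketch below.  This is only an assumption about how such a test
    might be written, not a description of how the validator works
    internally; the names UrlPrefixChecker and find_url_prefixed_hrefs are
    made up for the example, and it uses Python's standard html.parser module
    rather than an SGML parser.

```python
import re
from html.parser import HTMLParser

# Hypothetical rule: flag an HREF whose value starts with a literal "url:"
# wrapper (any letter case), optionally preceded by whitespace.
URL_PREFIX_RE = re.compile(r"^\s*url:", re.IGNORECASE)


class UrlPrefixChecker(HTMLParser):
    """Collect (line, href) pairs for A elements whose HREF starts with "url:"."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and URL_PREFIX_RE.match(value):
                line, _column = self.getpos()
                self.warnings.append((line, value))


def find_url_prefixed_hrefs(html_text):
    """Return a list of (line, href) for every suspicious link in html_text."""
    checker = UrlPrefixChecker()
    checker.feed(html_text)
    checker.close()
    return checker.warnings


if __name__ == "__main__":
    page = (
        '<A HREF="url:http://www.xyz.com">broken</A>\n'
        '<a href="http://www.xyz.com">fine</a>\n'
    )
    for line, href in find_url_prefixed_hrefs(page):
        print("line %d: HREF starts with 'url:': %s" % (line, href))
```

    A warning rather than a hard error might be the safer choice, since the
    value is still syntactically a URI with scheme "url".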
    
    DMA.