
Re: Bug report - no validation of URIs, not at even the most basic level

From: olivier Thereaux <ot@w3.org>
Date: Tue, 7 Aug 2007 14:25:55 +0900
Message-Id: <38F7DCB7-C377-47CB-BEE3-A1BD1125981B@w3.org>
Cc: Cecil Ward <cecil@cecilward.com>
To: www-validator Community <www-validator@w3.org>


On Aug 5, 2007, at 21:52 , Olivier Thereaux wrote:
> On Sun, Aug 05, 2007, Cecil Ward wrote:
>
>> * Where then is the URI Validator? But if it is claimed that the
>> job of validation of URIs does not fall within the remit of the
>> HTML validator or the CSS validator, where is the standalone "W3C URI
>> Validator" deliverable?

> If someone wants to develop it as a component to Unicorn, it would
> indeed be really useful. Would you like to lead such a development?

Actually, it doesn't have to be a Unicorn component; it could be
included directly in the Markup Validator.

Let's see what would be needed:

1) a parser to check that a given string is a proper URI/IRI
This surely already exists, hopefully as open source code or, even
better, as a Perl module. Does anyone want to investigate this?
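
To make point 1 concrete, here is a rough sketch of what such a check
could look like. It is written in Python purely for illustration (the
validator itself is Perl), and it is only a coarse syntax test built on
the component-splitting regular expression from RFC 3986, Appendix B.
The function name looks_like_uri_reference is made up for this example.
It does not validate each component against the full grammar, and it
rejects raw non-ASCII characters, so IRIs (RFC 3987) would need a wider
character set.

```python
import re

# Generic component splitter from RFC 3986, Appendix B (anchors dropped
# because fullmatch() is used below).
URI_PARTS = re.compile(
    r"(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?"
)
# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")
# Characters RFC 3986 allows somewhere in a URI: unreserved, reserved,
# and "%" only when it starts a percent-encoded octet.
ALLOWED = re.compile(
    r"(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})*"
)


def looks_like_uri_reference(value):
    """Return True if value passes a coarse RFC 3986 syntax check."""
    if not ALLOWED.fullmatch(value):
        return False  # e.g. spaces, raw non-ASCII, or a stray "%"
    # Every part of URI_PARTS is optional, so this always matches here.
    scheme = URI_PARTS.fullmatch(value).group(1)
    return scheme is None or bool(SCHEME.fullmatch(scheme))
```

With this sketch, "http://www.w3.org/TR/html4/" and the relative
reference "../images/logo.png" pass, while "my page.html" (space) and
"100%" (incomplete percent-encoding) are rejected.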

2) a list of all attributes whose values are URIs,
at least for HTML, although it would be better to cover SVG, SMIL
and MathML too.
This could be extracted from the DTDs. In HTML 4.01, for example, a
number of attributes are already marked as having a "URI" type
http://www.w3.org/TR/html4/struct/links.html#adef-href
although the URI type is simply an alias for CDATA
http://www.w3.org/TR/html4/sgml/dtd.html#URI
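
As a rough illustration of the extraction idea in point 2, the sketch
below (Python, for illustration only) scans ATTLIST declarations for
attributes typed as %URI;. The DTD text is an abbreviated excerpt in the
style of the HTML 4.01 DTD, not the full file, and the helper name
uri_attributes is an assumption of this example. A real extraction
would also have to handle grouped element names such as (TH|TD),
parameter entities used as element names, and ">" inside comments.

```python
import re

# Abbreviated excerpt in the style of the HTML 4.01 DTD; a real run
# would read the complete DTD file instead.
DTD_EXCERPT = """
<!ATTLIST A
  name        CDATA          #IMPLIED  -- named link end --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  >
<!ATTLIST IMG
  src         %URI;          #REQUIRED -- URI of image to embed --
  alt         %Text;         #REQUIRED -- short description --
  longdesc    %URI;          #IMPLIED  -- link to long description --
  >
"""

# One ATTLIST declaration: element name, then everything up to ">".
ATTLIST = re.compile(r"<!ATTLIST\s+(\S+)(.*?)>", re.DOTALL)
# An attribute line whose declared type is the %URI; parameter entity.
URI_ATTR = re.compile(r"^\s*([A-Za-z][\w-]*)\s+%URI;", re.MULTILINE)


def uri_attributes(dtd_text):
    """Map each element name to the attributes declared as %URI;."""
    table = {}
    for element, body in ATTLIST.findall(dtd_text):
        names = URI_ATTR.findall(body)
        if names:
            table[element.lower()] = {n.lower() for n in names}
    return table
```

Running uri_attributes(DTD_EXCERPT) on the excerpt above gives a table
with "href" for the a element and "src" and "longdesc" for img.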

3) some SAX/OpenSP API code to trigger checking of URI values
This can be done in a modular fashion. SAX and OpenSP are both
event-driven APIs that can perform some action when an event is
encountered. The event that interests us here is start_element.

in pseudo code:

sub start_element() {
   if element_name in list_of_elements_with_uri_types {
     for each attribute element->attribute {
       if (element->attribute->name in list_of_attribute_with_uri_types) {
         check_attribute_value(element->attribute->value)
       }
     }
   }
}

Does anyone want to own translating this to Perl?
You don't have to invent much: we already have something very similar
that tests the xmlns value for the root element, see:
sub W3C::Validator::SAXHandler::start_element
in
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check
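
To show the shape of the event-driven approach in point 3, here is a
minimal runnable sketch using Python's xml.sax module (for illustration
only; the real handler would be Perl on top of OpenSP). The
URI_ATTRIBUTES table, the check_attribute_value placeholder and the
find_bad_uris helper are all hypothetical names for this example, and
the placeholder check only flags whitespace rather than doing real URI
validation.

```python
import xml.sax

# Hypothetical lookup table: element name -> attributes holding URIs.
URI_ATTRIBUTES = {
    "a": {"href"},
    "img": {"src", "longdesc"},
    "link": {"href"},
}


def check_attribute_value(value):
    """Placeholder check: flag whitespace, which a URI may not contain."""
    return not any(ch.isspace() for ch in value)


class URIAttributeHandler(xml.sax.ContentHandler):
    """Collect URI-typed attribute values that fail the check."""

    def __init__(self):
        super().__init__()
        self.problems = []  # (element, attribute, value) triples

    def startElement(self, name, attrs):
        uri_attrs = URI_ATTRIBUTES.get(name.lower())
        if not uri_attrs:
            return  # this element has no URI-typed attributes
        for attr_name in attrs.getNames():
            if attr_name.lower() in uri_attrs:
                value = attrs.getValue(attr_name)
                if not check_attribute_value(value):
                    self.problems.append((name, attr_name, value))


def find_bad_uris(markup):
    """Parse well-formed markup and return the flagged attributes."""
    handler = URIAttributeHandler()
    xml.sax.parseString(markup.encode("utf-8"), handler)
    return handler.problems
```

For example, find_bad_uris('<p><a href="ok.html">x</a><img
src="bad file.png" alt="y"/></p>') reports only the img src value,
since it contains a space.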


I have added an entry in Bugzilla to track this:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=4916

-- 
olivier
Received on Tuesday, 7 August 2007 05:25:15 GMT
