Re: Bug report - no validation of URIs, not at even the most basic level

On Aug 5, 2007, at 21:52 , Olivier Thereaux wrote:
> On Sun, Aug 05, 2007, Cecil Ward wrote:
>
>> * Where then is the URI Validator? But if it is claimed that the
>> job of validation of URIs does not fall within the remit of the
>> HTML validator or the CSS validator, where is the standalone "W3C URI
>> Validator" deliverable?

> If someone wants to develop it as a component to Unicorn, it would
> indeed be really useful. Would you like to lead such a development?

Actually, it doesn't have to be a Unicorn component; it could be
included directly in the Markup Validator.

Let's see what would be needed:

1) a parser to check that a given string is a proper URI/IRI
This surely already exists, hopefully as open source code or, even
better, as a Perl module. Does anyone want to investigate this?
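
To give an idea of the kind of check meant here, a very rough sketch
follows (in Python only to keep it short; a Perl module would have the
same shape). It is not a full RFC 3986 parser: it only rejects
characters that may never appear in a URI reference, malformed percent
escapes, and more than one "#". The function name is made up for this
example.

```python
import re

# Characters that may appear somewhere in a URI reference per RFC 3986:
# unreserved, reserved (gen-delims and sub-delims), and "%" for escapes.
_ALLOWED_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")

# A "%" that is not followed by exactly two hex digits is a broken escape.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def looks_like_uri_reference(value):
    """Coarse, character-level check only (hypothetical helper).

    Returns False for strings that cannot be a URI reference, True
    otherwise. It does not validate the scheme, authority or path
    structure, and it does not handle IRIs.
    """
    if _ALLOWED_CHARS.fullmatch(value) is None:
        return False
    if _BAD_ESCAPE.search(value):
        return False
    # Only one fragment delimiter is allowed.
    if value.count("#") > 1:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("http://www.w3.org/TR/html4/", "my file.html", "a%zz"):
        print(candidate, looks_like_uri_reference(candidate))
```

A real implementation would go further and parse the string into its
components, but even a check this small would catch unescaped spaces
in href values.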

2) a list of all attributes whose values are URIs
… at least for HTML, although it would be better to have it for SVG,
SMIL and MathML too.
This could be extracted from the DTDs. In HTML 4.01, for example, a
number of attributes are already marked as having a "URI" type:
http://www.w3.org/TR/html4/struct/links.html#adef-href
although the URI type is simply an alias for CDATA:
http://www.w3.org/TR/html4/sgml/dtd.html#URI
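
As a sketch of how that extraction might work (again in Python, purely
for illustration), one could scan the ATTLIST declarations for
attributes typed as %URI;. The DTD fragment below is an abbreviated
sample written for this example, not a verbatim copy of the HTML 4.01
DTD, and the function name is hypothetical.

```python
import re

# Abbreviated, illustrative fragment in the style of the HTML 4.01 DTD.
SAMPLE_DTD = """
<!ATTLIST A
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  >
<!ATTLIST IMG
  src         %URI;          #REQUIRED -- URI of image to embed --
  alt         %Text;         #REQUIRED -- short description --
  longdesc    %URI;          #IMPLIED  -- link to long description --
  >
<!ATTLIST (INS|DEL)
  cite        %URI;          #IMPLIED  -- info on reason for change --
  >
"""

COMMENT = re.compile(r"--.*?--", re.S)
# Element name (or a parenthesised group of names), then the declaration body.
ATTLIST = re.compile(r"<!ATTLIST\s+(\([^)]*\)|\S+)(.*?)>", re.S)
# An attribute name at the start of a line whose declared type is %URI;
URI_ATTR = re.compile(r"^\s*([A-Za-z][\w.-]*)\s+%URI;", re.M)


def uri_attributes(dtd_text):
    """Map lowercased element names to the set of their %URI; attributes."""
    result = {}
    # Drop SGML comments first so they cannot confuse the declaration regex.
    for names, body in ATTLIST.findall(COMMENT.sub("", dtd_text)):
        attrs = {a.lower() for a in URI_ATTR.findall(body)}
        if not attrs:
            continue
        for element in names.strip("()").split("|"):
            result.setdefault(element.strip().lower(), set()).update(attrs)
    return result


if __name__ == "__main__":
    for element, attrs in sorted(uri_attributes(SAMPLE_DTD).items()):
        print(element, sorted(attrs))
```

Regular expressions are a shortcut here; attribute lists that are
pulled in through other parameter entities would need real entity
expansion.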

3) some SAX/OpenSP API code to trigger checking of URI values
This can be done in a modular fashion. SAX and OpenSP are both
event-driven APIs: they call handler code whenever an event is
encountered. The event that interests us here is start_element.

In pseudo code:

sub start_element() {
   if element_name in list_of_elements_with_uri_types {
     for each attribute element->attribute {
       if (element->attribute->name in list_of_attribute_with_uri_types) {
         check_attribute_value(element->attribute->value)
       }
     }
   }
}

Does anyone want to own translating this to Perl?
You don't have to invent much: we already have something very similar,
which tests the xmlns value of the root element. See:
sub W3C::Validator::SAXHandler::start_element
in
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check
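
For illustration only, here is how the start_element idea from step 3
could look with a SAX handler. This is a Python sketch, not the
validator's Perl code; the lookup table, the placeholder check and all
names in it are assumptions made for this example.

```python
import xml.sax

# Hypothetical lookup table: element name -> attribute names holding URIs.
URI_ATTRIBUTES = {"a": {"href"}, "img": {"src", "longdesc"}}


def placeholder_check(value):
    """Stand-in for a real URI parser: only rejects whitespace."""
    return not any(ch.isspace() for ch in value)


class URIAttributeHandler(xml.sax.ContentHandler):
    """Collects (element, attribute, value) triples that fail the check."""

    def __init__(self, uri_attributes, check):
        super().__init__()
        self.uri_attributes = uri_attributes
        self.check = check
        self.problems = []

    def startElement(self, name, attrs):
        # Skip elements that have no URI-typed attributes at all.
        wanted = self.uri_attributes.get(name.lower())
        if not wanted:
            return
        for attr_name in attrs.getNames():
            value = attrs.getValue(attr_name)
            if attr_name.lower() in wanted and not self.check(value):
                self.problems.append((name, attr_name, value))


def find_bad_uri_attributes(document):
    """Parse an XML document given as bytes and return the failing triples."""
    handler = URIAttributeHandler(URI_ATTRIBUTES, placeholder_check)
    xml.sax.parseString(document, handler)
    return handler.problems


if __name__ == "__main__":
    doc = b'<p><a href="http://www.w3.org/">ok</a><img src="bad uri.png" alt="x"/></p>'
    print(find_bad_uri_attributes(doc))
```

Passing the table and the check function into the handler keeps the
three pieces (parser, attribute list, event code) independent, so each
can be replaced without touching the others.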


I have added an entry in Bugzilla to track this:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=4916

-- 
olivier

Received on Tuesday, 7 August 2007 05:25:15 UTC