
Re: Bug report - no validation of URIs, not at even the most basic level

From: Cecil Ward <cecil@cecilward.com>
Date: Sun, 5 Aug 2007 13:39:01 +0100
To: <www-validator@w3.org>
Message-ID: <000001c7d75d$9b483700$d1d8a500$@com>

It is of course entirely understandable that someone might maintain that validating URIs falls outside the scope of "validating (X)HTML". This is what I was expecting to hear from someone, and I wasn't disappointed.

But whether or not such a narrow definition of responsibility, as exemplified by the earlier poster, can be agreed to be in some sense 'correct', it is hardly helpful. In fact, I cannot imagine anyone succeeding in explaining to the software development community how such a 'silently unsafe' design approach does not simply let users down. We should be asking "Does this particular design choice work toward the betterment of the web?", and I would expect the answer to be common ground.

A few points.

* "Unicorn", I quote, is "W3C's universal conformance checker" - the W3C's own description. I hardly need say more. URI syntax validation is therefore required, as URI is a W3C standard, and hardly an unimportant one. (I wanted to take a look, but could not get access to the Unicorn tool just now; it appeared to be down for some reason.)

* Secondly, even the earlier poster agrees that having zero-length, whitespace-only or other kinds of trivial URIs in (X)HTML @href is hardly "valid" by any common-sense definition or strict formal interpretation.
 
* Where, then, is the URI Validator? If it is claimed that validating URIs does not fall within the remit of the HTML validator or the CSS validator, where is the standalone "W3C URI Validator" deliverable? Some of the early URI RFCs date from 1994, yet thirteen years on no URI validator is displayed in a prominent position.

* Safe by default: Although a standalone URI validator would be extremely useful, it makes absolutely no sense to prevent it from being invoked by, say, the HTML and CSS validators by default. After all, those validator tools that actually _dereference_ URIs are already doing some validation on them. (Surely they are? Because if not, that's rather worrying.)

* Never more greatly needed: There have been a number of bugs in server software and UAs in the past, and the lack of validation tools certainly cannot have helped. With the challenges of internationalization, definitive URI validation tools are more needed than ever by developers, since many of the software components in use today have URI-handling deficiencies which need to be explored. (In fact, that's one of my jobs for this afternoon. But I'm on my own.)

* Link checker: yes, I know that there is a link-checker tool, but that is not a syntax-checking tool, so it does not address the issue. Aside from the inadequacy of such a tool for this use-case, software developers who wish to test their understanding and interpretation of the specs need a real URI validator that gives a system of detailed warnings and errors together with a description of the tool's _analysis of the URI_. This would allow the developer to experiment and check things such as character-set handling and the interpretation of relative URIs, for example.
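As a rough illustration of the kind of analysis being asked for, here is a minimal sketch in Python using the standard urllib.parse module. It is emphatically not the hoped-for W3C URI Validator - urlsplit() is lenient and does not enforce full RFC 3986 syntax - but it shows the shape of the output a developer would want: a trivial-URI check, a component breakdown, and relative-reference resolution. The function name analyse_uri is my own invention for this sketch.

```python
# Sketch only: urllib.parse is permissive and will not reject many
# URIs that violate RFC 3986 syntax.
from urllib.parse import urlsplit, urljoin

def analyse_uri(uri: str) -> dict:
    """Flag trivially invalid URIs, else break the URI into components."""
    if uri == "" or uri.isspace():
        # The zero-length / whitespace-only case discussed above.
        return {"error": "zero-length or whitespace-only URI"}
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }

print(analyse_uri("   "))  # trivially invalid
print(analyse_uri("http://example.com/a?q=1#f"))
# Relative-reference resolution per RFC 3986 section 5:
print(urljoin("http://example.com/a/b", "../c"))  # -> http://example.com/c
```

A real validator would go much further, of course: percent-encoding checks, IRI-to-URI mapping for internationalized identifiers, and scheme-specific syntax rules, with a warning or error for each deviation.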

Best wishes,

Cecil Ward.
Received on Sunday, 5 August 2007 12:39:12 GMT
