Re: XML, namespaces, extensibility and validation from Tim Berners-Lee on 2007-06-22 (www-tag@w3.org from June 2007)

From: Tim Berners-Lee <timbl@w3.org>
Date: Fri, 22 Jun 2007 09:17:12 -0400
To: www-tag@w3.org
Message-Id: <52A7E35D-243E-4832-9295-98E6C6B24AD0@w3.org>
Oliver, you wrote in
http://lists.w3.org/Archives/Public/www-tag/2007Apr/0084.html :


> In the context of the development of the markup validator, I
> sometimes see questions related to XML validation,  extensibility and
> namespaces, and witness some frustration. I am, by far, not an expert
> on these topics, but I think many of you are: would you mind me
> picking the collective mind of the tag mailing-list to get a better
> idea of the road the markup validator should take to be more useful
> for more people, while obviously remaining true to the specs it's
> checking against?

Good question!  Let me give you my 2¢, not speaking for the TAG as
a whole.


The W3C validator has filled a very important role, being a common
and respected benchmark against which anyone can test a web page.
The problem with it at the moment is, as you allude to, its
relationship to extensibility.

- The 'X' in 'XML' is supposed to stand for 'extensible'.

- The HTML language has always allowed for extension by saying that
unknown tags or attributes should be ignored.

- Therefore, groups like ARIA ought to be able to extend XHTML by  
introducing new elements and attributes.

(- The namespace system allows new elements and attributes to be
grounded in URI space, so that there is no ambiguity, and so that a
person or machine can use the web to determine their appropriate
interpretation.

- The new namespace would have a namespace document (such as an XML
Schema document) which would explain how, syntactically, the new
elements and/or attributes are allowed to fit into HTML. )

This hasn't happened, and one factor has been that developers don't
want to upset the W3C validator. So let us make the validator more
constructive.


- allow people to add new elements and attributes, using namespaces.
- give a list of the extensions used that the validator does not know
about. This is a warning, not an error.
- warn them if they are squatting on a namespace without the group's OK.
- congratulate them if a namespace document gives info about the new  
namespace
- if XSD or RelaxNG from the namespace document etc. can be used to
check that the new items are syntactically correct additions, do so
    (if not, warn them that it can't, or give an error if the syntax can
be demonstrated to be wrong)
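
As a rough illustration of the second item in the list above, here is a
minimal Python sketch that lists the namespaced elements and attributes
a checker does not recognise and reports them as warnings rather than
errors. The set of known namespaces, the function names and the
http://example.org/ext# namespace are all assumptions made up for the
example; this is not how the W3C validator is implemented.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # xml:lang, xml:space, ...

# Assumption: the namespaces this hypothetical checker already understands.
KNOWN_NAMESPACES = {XHTML_NS, XML_NS}


def split_name(qname):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if qname.startswith("{"):
        uri, local = qname[1:].split("}", 1)
        return uri, local
    return "", qname


def unknown_extensions(xml_text, known=KNOWN_NAMESPACES):
    """Return warning strings for names in namespaces we do not know."""
    root = ET.fromstring(xml_text)
    seen = set()
    for elem in root.iter():
        uri, local = split_name(elem.tag)
        if uri and uri not in known:
            seen.add((uri, "element", local))
        for attr in elem.attrib:
            a_uri, a_local = split_name(attr)
            # Unprefixed attributes are in no namespace, so skip them.
            if a_uri and a_uri not in known:
                seen.add((a_uri, "attribute", a_local))
    return [
        f"warning: {kind} '{local}' uses unknown namespace {uri}"
        for uri, kind, local in sorted(seen)
    ]


SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ex="http://example.org/ext#">
  <body><p ex:role="note">Hi</p><ex:widget/></body>
</html>"""

for line in unknown_extensions(SAMPLE):
    print(line)
```

A fuller checker would go on to dereference each unknown namespace URI
and look for a namespace document, which this sketch does not attempt.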


Also on my wish-list would be:

- check the mime type, content-encoding and other HTTP headers  
intelligently
- check CSS linked and inline automatically
- check JavaScript linked and inline for syntax.
- check for common procedural patterns which should be re-done  
declaratively
- give advice about safety (allow it to run in a safe environment,
e.g. email, with scripting off)
- give advice about other things we believe in such as accessibility,  
i18n
- derive RDF data from the page using GRDDL, according to a putative
spec of what the current algorithm is (GRDDL, embedded RDF syntaxes etc.)


I'd like similar things for an RDF validator.

- request application/rdf+xml and text/rdf+n3
- check mime types returned
- for XML, check namespace & name of document element to decide
how/whether to parse for RDF
- understand and check RDF/XML and N3/turtle
- check links from HTML files, GRDDL etc
- check that each class and property used is mentioned in its  
namespace document (else warn)
- check that classes and properties have labels, ideally in multiple
languages (else weak warning)
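
The first three items in this list could be sketched roughly as below:
build an Accept header, look at the media type that comes back, and for
generic XML fall back to the namespace and name of the document element.
The sketch uses the registered RDF/XML media type, application/rdf+xml.
The function names, the q-value and the returned parser labels are
assumptions made up for the example, and no network request is made.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumption: a simple preference order for the request.
ACCEPT_HEADER = "application/rdf+xml, text/rdf+n3;q=0.9"

# Assumption: also accept the later text/n3 and text/turtle media types.
N3_TYPES = {"text/rdf+n3", "text/n3", "text/turtle"}
GENERIC_XML_TYPES = {"application/xml", "text/xml"}


def media_type(content_type_header):
    """Return the bare media type, without parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def looks_like_rdf_xml(xml_text):
    """True if the document element is rdf:RDF in the RDF namespace."""
    root = ET.fromstring(xml_text)
    return root.tag == f"{{{RDF_NS}}}RDF"


def choose_parser(content_type_header, body_text):
    """Return 'rdfxml', 'n3', or None if the body should not be parsed."""
    mt = media_type(content_type_header)
    if mt == "application/rdf+xml":
        return "rdfxml"
    if mt in N3_TYPES:
        return "n3"
    if mt in GENERIC_XML_TYPES:
        # Generic XML: decide from the document element.
        return "rdfxml" if looks_like_rdf_xml(body_text) else None
    return None


RDF_DOC = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
print(choose_parser("application/xml; charset=utf-8", RDF_DOC))
print(choose_parser("application/xml", "<doc/>"))
```

A real checker would also have to handle RDF/XML documents whose root is
a typed node rather than rdf:RDF, which this sketch ignores.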

Lots of work.  I think it would be very valuable work.
I think the key is to check lots of things and give guidance, always  
in a positive helpful direction,
with a balance of severity levels carefully considered.
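
One way to picture the severity levels mentioned in the lists above
(congratulation, weak warning, warning, error) is the following minimal
Python sketch. The names Severity, Finding, is_valid and report, and the
sample messages, are assumptions made up for the example. Only an error
makes a document fail; everything else is guidance.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    PRAISE = 0        # e.g. a namespace document was found
    WEAK_WARNING = 1  # e.g. a class has no label in a second language
    WARNING = 2       # e.g. an extension the checker does not know about
    ERROR = 3         # e.g. syntax demonstrably wrong against a schema


@dataclass(frozen=True)
class Finding:
    severity: Severity
    message: str


def is_valid(findings):
    """A document fails only if at least one finding is an error."""
    return all(f.severity < Severity.ERROR for f in findings)


def report(findings):
    """Return report lines, most severe first."""
    ordered = sorted(findings, key=lambda f: f.severity, reverse=True)
    return [f"{f.severity.name.lower()}: {f.message}" for f in ordered]


findings = [
    Finding(Severity.PRAISE, "namespace document found"),
    Finding(Severity.WARNING, "unknown extension namespace in use"),
]
for line in report(findings):
    print(line)
print("valid" if is_valid(findings) else "invalid")
```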

Tim
Received on Friday, 22 June 2007 13:17:18 UTC