- From: olivier Thereaux <ot@w3.org>
- Date: Tue, 9 Oct 2007 21:54:29 +0900
- To: openjade-devel@lists.sourceforge.net
- Cc: W3C Validator Community <www-validator@w3.org>
Dear opensp developers,

As some of you know, opensp is used as the base SGML/XML parser in the W3C markup validator.
http://validator.w3.org/

One of the issues I am facing, as one of the developers of that application, is the thorny subject of validation (against XML DTDs) and XML namespaces.

The XML namespaces specification came after the first XML recommendation (and long after SGML), and it did not address the question of validating XML with namespaces. As a result, the usual answer to the question "how can one build a document that is valid (with respect to a DTD) and uses namespaces to define foreign elements and attributes?" is "you don't". See e.g.:
http://www.rpbourret.com/xml/NamespacesFAQ.htm#dtd_6

This has been the source of a lot of frustration and is, I assume, one of the reasons why a lot of contemporary XML-based language design doesn't use DTDs but one of the more recent schema languages. It is also a pity, because it means that a number of XML-based languages cannot at the same time retain the concept of validity and be extended with namespaces. This has made the extensibility of XHTML difficult (making its name a painful irony), and made the validation of SVG (commonly used in combination with other languages/namespaces) quasi-impossible.

However much I am told that the solution "simply is to make DTD validation namespace-aware", I still have no clue how that would be done.

Another solution, however, which seems to be favored by e.g. TimBL [1], would be to ignore, while validating a document tree against a DTD, anything not in the namespace of the root element.
[1] http://www.w3.org/DesignIssues/Architecture.html

I am thinking of implementing such a thing in the markup validator. After all, the validator does know what the root namespace is, and uses the opensp API (through Bjoern's excellent Perl wrapper).

The idea I could think of implementing is to just ignore any message from the parser when between a StartElementEvent and EndElementEvent for an element not in the root namespace. Ditto for issues with attributes not in the root namespace. Make that an option in the validator, and people who really want to can extend their XHTML, SVG, etc. documents without forfeiting the possibility of checking the validity of their core document.

However, before I do that, I'm curious about the following. Apologies if these are FAQs, I could not find the info anywhere yet:

* is there a mechanism in opensp to ignore elements in foreign namespaces?
* would there be any interest for such a mechanism?
* were there any other possible solutions to the question of DTD validation / namespaces thought of before, that I should be aware of?

Thank you.

-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status
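
A rough sketch of the filtering idea described above, in Python rather than the validator's Perl. Everything here is hypothetical: the `Event` record and `filter_messages` function are illustrative names, not part of the opensp API or of the Perl wrapper. It also assumes that an error message arrives after the start event of the element it concerns, and that attribute-related errors carry the attribute name; the real parser's event ordering may well differ.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical, simplified parser event (not the opensp event structs).
    kind: str                                   # "start", "end" or "error"
    name: str = ""                              # element name, or attribute name for errors
    attrs: dict = field(default_factory=dict)   # attributes of a start tag
    message: str = ""                           # text of an error message


def prefix_of(qname):
    """Return the namespace prefix of a qualified name ('' if none)."""
    prefix, sep, _ = qname.partition(":")
    return prefix if sep else ""


def filter_messages(events):
    """Keep only the error messages that concern the root namespace."""
    scopes = []            # stack of {prefix: namespace URI} maps
    root_ns = None         # namespace of the root element
    seen_root = False
    foreign_depth = 0      # > 0 while inside a foreign-namespace element
    kept = []

    for ev in events:
        if ev.kind == "start":
            # Inherit the enclosing scope, then apply xmlns declarations.
            scope = dict(scopes[-1]) if scopes else {}
            for attr, value in ev.attrs.items():
                if attr == "xmlns":
                    scope[""] = value
                elif attr.startswith("xmlns:"):
                    scope[attr[len("xmlns:"):]] = value
            scopes.append(scope)

            ns = scope.get(prefix_of(ev.name))
            if not seen_root:
                root_ns, seen_root = ns, True
            # Everything below a foreign element counts as foreign too.
            if foreign_depth or ns != root_ns:
                foreign_depth += 1

        elif ev.kind == "end":
            scopes.pop()
            if foreign_depth:
                foreign_depth -= 1

        elif ev.kind == "error":
            if foreign_depth:
                continue   # inside a foreign element: ignore
            prefix = prefix_of(ev.name)
            if prefix and scopes and scopes[-1].get(prefix) != root_ns:
                continue   # prefixed attribute from a foreign namespace: ignore
            kept.append(ev.message)

    return kept
```

With this in place, a message raised inside an `svg:svg` subtree of an XHTML document would be dropped, while a message about an unknown unprefixed attribute on an XHTML `p` element would still be reported. Making the filter optional would just mean bypassing `filter_messages` and reporting every error event.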
Received on Tuesday, 9 October 2007 12:54:42 UTC