- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 22 Aug 2004 00:26:21 +0200
- To: "Sean B. Palmer" <sean+wv@infomesh.net>
- Cc: www-validator@w3.org
* Sean B. Palmer wrote:
>Though SP doesn't include external identifiers in its ESIS output, it
>does provide access to them through its generic API, making it possible
>to build a tool on top of it in C++ that outputs just the PubID and
>SysID (if present) on two consecutive lines. And I've done just that.

For XML documents, using Perl SAX2 (and XML::SAX::Expat) you could do

  #!perl
  package PrintXmlId;
  use base qw(XML::SAX::Base);

  # Called once for the document type declaration, if there is one.
  sub start_dtd {
    my $self = shift;
    my $dtd = shift;
    printf "PUBLIC: %s\n", $dtd->{PublicId} if exists $dtd->{PublicId};
    printf "SYSTEM: %s\n", $dtd->{SystemId} if exists $dtd->{SystemId};
  }

  package main;
  use XML::SAX::Expat;
  die "Usage: $0 file.xml\n" unless @ARGV;
  XML::SAX::Expat->new(Handler=>PrintXmlId->new)->parse_uri(shift);

Our current plan is to write a Perl wrapper for OpenSP's generic
interface that would be compatible with Perl SAX2; the PrintXmlId
handler would thus work for all SGML/XML documents. Using just
XML::Parser it is even simpler:

  % perl -MXML::Parser -e "XML::Parser->new(Handlers=>{Doctype=>sub{ \
      printf qq(PUBLIC: %s\n), $_[3] if defined $_[3];              \
      printf qq(SYSTEM: %s\n), $_[2] if defined $_[2];              \
    }})->parsefile(shift)" ...

However, whatever special things the Validator should do, it's best to
write an XML::SAX::Base-based handler to do it.

>Note that SP may raise various errors along the way that you have to
>redirect off to /dev/null as shown.

You can avoid that using

  egp->inhibitMessages(true);

>Whatever the tool used, I suggest you simply peek at the PubId and
>SysId, and if they don't match raise some kind of achingly obvious
>"Fatal Warning" to users: I don't believe that defaulting to either ID
>is acceptable given that inconsistency will never be intentional on the
>behalf of the user or their authoring tool.

http://www.w3.org/TR/xhtml1-schema/ does that intentionally (unless you
consider http://www.w3.org/2002/08/xhtml/xhtml1-strict.dtd to match the
XHTML 1.0 Strict document type definition).
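
To make the PubId/SysId comparison concrete, here is a rough sketch in plain Perl of what such a check might look like once the two identifiers have been extracted. Everything in it is an assumption made for illustration: the helper name check_doctype_ids is hypothetical, the lookup table holds a single entry (the XHTML 1.0 Strict pair), and the Validator does not necessarily work this way.

```perl
#!perl
use strict;
use warnings;

# Illustrative table: public identifier => the system identifier it is
# conventionally paired with. One entry only; a real table would have to
# be maintained separately (assumption, not part of the Validator).
my %expected_sysid = (
    '-//W3C//DTD XHTML 1.0 Strict//EN'
        => 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
);

# Hypothetical helper: returns a warning string when the pair looks
# inconsistent, and nothing (undef in scalar context) otherwise.
sub check_doctype_ids {
    my ($pubid, $sysid) = @_;

    # Nothing to compare unless both identifiers are present.
    return unless defined $pubid && defined $sysid;

    # Unknown public identifier: no opinion either way.
    my $expected = $expected_sysid{$pubid};
    return unless defined $expected;

    return if $sysid eq $expected;

    return "PUBLIC '$pubid' is usually paired with SYSTEM '$expected', "
         . "but the document declares '$sysid'";
}

# Try the conventional pair and the pair discussed above.
for my $sysid (
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
    'http://www.w3.org/2002/08/xhtml/xhtml1-strict.dtd',
) {
    my $warning = check_doctype_ids('-//W3C//DTD XHTML 1.0 Strict//EN', $sysid);
    print defined $warning ? "WARNING: $warning\n" : "consistent: $sysid\n";
}
```

A plain string comparison like this would flag the second system identifier, which is exactly the kind of deliberate mismatch mentioned above, so whether the result should be fatal or only a warning remains a policy question.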
Received on Saturday, 21 August 2004 22:27:07 UTC