- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 22 Aug 2004 00:26:21 +0200
- To: "Sean B. Palmer" <sean+wv@infomesh.net>
- Cc: www-validator@w3.org
* Sean B. Palmer wrote:
>Though SP doesn't include external identifiers in its ESIS output, it
>does provide access to them through its generic API, making it possible
>to build a tool on top of it in C++ that outputs just the PubID and
>SysID (if present) on two consecutive lines. And I've done just that.
Using Perl SAX2 (and XML::SAX::Expat) you could do
  #!perl
  use strict;
  use warnings;

  package PrintXmlId;
  use base qw(XML::SAX::Base);

  # Called once per document type declaration.
  sub start_dtd
  {
    my $self = shift;
    my $dtd  = shift;
    # Test with defined(): a key may be present but undefined.
    printf "PUBLIC: %s\n", $dtd->{PublicId}
      if defined $dtd->{PublicId};
    printf "SYSTEM: %s\n", $dtd->{SystemId}
      if defined $dtd->{SystemId};
  }

  package main;
  use XML::SAX::Expat;

  die "Usage: $0 file.xml\n" unless @ARGV;
  XML::SAX::Expat->new(Handler=>PrintXmlId->new)->parse_uri(shift);
for XML documents. Our current plan is to write a Perl wrapper for
OpenSP's generic interface that is compatible with Perl SAX2; the
PrintXmlId handler would then work for all SGML/XML documents. Using
just XML::Parser it is even simpler:
  % perl -MXML::Parser -e 'XML::Parser->new(Handlers=>{Doctype=>sub{
      printf qq(PUBLIC: %s\n), $_[3] if defined $_[3];
      printf qq(SYSTEM: %s\n), $_[2] if defined $_[2];
    }})->parsefile(shift)' ...
However, whatever special things the Validator should do, it's best to
write an XML::SAX::Base-based handler to do them.
>Note that SP may raise various errors along the way that you have to
>redirect off to /dev/null as shown.
You can avoid that using egp->inhibitMessages(true);
>Whatever the tool used, I suggest you simply peek at the PubId and
>SysId, and if they don't match raise some kind of achingly obvious
>"Fatal Warning" to users: I don't believe that defaulting to either ID
>is acceptable given that inconsistency will never be intentional on the
>behalf of the user or their authoring tool.
http://www.w3.org/TR/xhtml1-schema/ does that intentionally (unless you
consider http://www.w3.org/2002/08/xhtml/xhtml1-strict.dtd to match the
XHTML 1.0 Strict document type definition).
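To illustrate why a strict "must match" check is tricky, here is a rough
sketch of one. It is not what the Validator does; the %expected_sysid
table and the check_ids name are made up for illustration, and I am
assuming the note pairs the XHTML 1.0 Strict public identifier with the
system identifier quoted above.

```perl
#!perl
use strict;
use warnings;

# Hypothetical lookup table: public identifier => expected system identifier.
# Only one entry is listed; a real table would need many more.
my %expected_sysid = (
    '-//W3C//DTD XHTML 1.0 Strict//EN'
        => 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
);

# Hypothetical helper. Returns one of:
#   'no-public-id'  no public identifier was given
#   'unknown'       the public identifier is not in the table
#   'match'         the system identifier is the expected one
#   'mismatch'      the system identifier is missing or different
sub check_ids {
    my ($pubid, $sysid) = @_;
    return 'no-public-id' unless defined $pubid;
    my $want = $expected_sysid{$pubid};
    return 'unknown' unless defined $want;
    return 'match' if defined $sysid && $sysid eq $want;
    return 'mismatch';
}

# The pair assumed for the schema note is flagged although it is intentional.
print check_ids('-//W3C//DTD XHTML 1.0 Strict//EN',
                'http://www.w3.org/2002/08/xhtml/xhtml1-strict.dtd'), "\n";
```

A check like this reports "mismatch" for that pair, so an exact-match
rule would have to treat it as a warning rather than a fatal error, or
carry a list of accepted alternatives.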
Received on Saturday, 21 August 2004 22:27:07 UTC