- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 03 Nov 2002 03:55:30 +0100
- To: www-validator@w3.org
Hi,

two documents,

  <?xml version='1.0' encoding='iso-8859-1'?>
  <!DOCTYPE foo SYSTEM "http://www.bjoernsworld.de/cgi-bin/dtd.pl?björn">
  <foo>
  <bar/>
  </foo>

and

  <?xml version='1.0' encoding='iso-8859-1'?>
  <!DOCTYPE foo SYSTEM "http://www.bjoernsworld.de/cgi-bin/dtd.pl?bj%c3%b6rn">
  <foo>
  <bar/>
  </foo>

As per XML 1.0 Second Edition section 4.2.2, XML processors must treat
these documents as equivalent. The Validator, however, does not: it
reports the second document as valid and the first as invalid. It is
somehow getting confused by the system identifier in the first example.

Typically, XML processors get it "right" but request

  ...?bj\xF6rn or ...?bj\xC3\xB6rn or ...?bj%f6rn

instead of

  ...?bj%c3%b6rn, ...?bj%C3%b6rn or ...?bj%C3%B6rn

(which are all equivalent).

dtd.pl is a CGI script that outputs different DTDs depending on whether
the processor is behaving correctly:

  #!/usr/local/bin/perl -w
  print "Content-Type: application/xml-dtd;charset=us-ascii\n\n";
  # foo is always declared
  print "<!ELEMENT foo (bar)>\n";
  # bar is declared only if the system identifier was escaped correctly
  if ($ENV{'QUERY_STRING'} eq "bj%c3%b6rn" or
      $ENV{'QUERY_STRING'} eq "bj%C3%b6rn" or
      $ENV{'QUERY_STRING'} eq "bj%C3%B6rn") {
    print "<!ELEMENT bar EMPTY>\n"
  }

I.e. the document is valid for conforming processors, invalid for
non-conforming processors.

regards.
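For illustration, the escaping that section 4.2.2 asks for can be sketched roughly as follows. This is a minimal Python sketch, not what the Validator or any particular XML processor does: the helper name `escape_system_identifier` is hypothetical, and it only handles non-ASCII characters (converting each to UTF-8 and writing every byte as %HH). Any ASCII characters that the specification also treats as disallowed are left alone here.

```python
def escape_system_identifier(system_id: str) -> str:
    """Percent-escape non-ASCII characters as UTF-8 bytes (hypothetical helper)."""
    escaped = []
    for ch in system_id:
        if ord(ch) < 0x80:
            # ASCII characters are passed through unchanged in this sketch.
            escaped.append(ch)
        else:
            # Convert the character to UTF-8, then escape each byte as %HH.
            escaped.extend("%{:02X}".format(byte) for byte in ch.encode("utf-8"))
    return "".join(escaped)


# Query strings that dtd.pl treats as conforming (copied from the script).
ACCEPTED = {"bj%c3%b6rn", "bj%C3%b6rn", "bj%C3%B6rn"}

# System identifier from the first document; \u00f6 is the "o with diaeresis".
system_id = "http://www.bjoernsworld.de/cgi-bin/dtd.pl?bj\u00f6rn"
query = escape_system_identifier(system_id).split("?", 1)[1]
```

With this input, `query` ends up as one of the strings in `ACCEPTED`, so dtd.pl would emit the declaration for bar. Escaping the ISO-8859-1 byte instead (giving %F6) would not match.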
Received on Saturday, 2 November 2002 21:55:25 UTC