RE: XHTML DOC Type DTD

From: Jelks Cabaniss <jelks@jelks.nu>
Date: Thu, 22 Jun 2000 17:25:23 -0400
To: <ODWORLD@aol.com>, <html-tidy@w3.org>
Message-ID: <NBBBICMNIPCICMKJECCBOEJPDKAA.jelks@jelks.nu>
ODWORLD@aol.com wrote:

> <!DOCTYPE html
>                        PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>                        "DTD/xhtml1-transitional.dtd">
>
> but when I use HTML Tidy (as part of HTML-Kit) to convert a document to XHTML
> 1.0 Transitional, it shows the DOC Type should be:
>
> <?xml version="1.0"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>
> Which is correct?

The latter.  (Though you don't need the preceding XML declaration unless your
encoding is something other than UTF-8 or UTF-16.  The XML declaration has
nothing to do with the DOCTYPE.)

[There are some errors in the XHTML 1.0 specification.  Another one, to give an
example, is that it says to treat form feeds (ASCII 12, or CTRL+L) as
whitespace.  The problem: form feeds aren't allowed in XML documents! :) ]
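The form feed point is easy to check with any XML parser.  Here is a minimal
sketch using Python's standard library; the helper name is_well_formed and the
two sample strings are made up for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample content: the same element, once with an ordinary
# space and once with a form feed (ASCII 12) between the words.
with_space = "<p>one two</p>"
with_form_feed = "<p>one\x0ctwo</p>"

print(is_well_formed(with_space))
print(is_well_formed(with_form_feed))
```

The second string is rejected because character 12 falls outside the set of
characters XML 1.0 permits in a document.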

Back to the DOCTYPE... The relative SYSTEM identifier of
"DTD/xhtml1-transitional.dtd" will work fine with an XML-based validator *if*
(and only if) you have a subdirectory called DTD with that particular DTD
(and the three external entities it references for Latin 1, Symbol, and Special
characters) in that subdirectory.
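To see why the subdirectory matters, here is a rough sketch of how a relative
SYSTEM identifier gets resolved against the document's own address.  The
function name resolve_system_id and the example.org URL are assumptions for
illustration, not anything a particular validator exposes:

```python
from urllib.parse import urljoin

RELATIVE_SYSTEM_ID = "DTD/xhtml1-transitional.dtd"
ABSOLUTE_SYSTEM_ID = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"


def resolve_system_id(document_url: str, system_id: str) -> str:
    """Resolve a SYSTEM identifier relative to the document that declares it."""
    return urljoin(document_url, system_id)


# Hypothetical document location.
page = "http://example.org/site/page.html"

# The relative form points at a DTD directory next to the page itself.
print(resolve_system_id(page, RELATIVE_SYSTEM_ID))

# The absolute form resolves to the W3C copy regardless of where the page is.
print(resolve_system_id(page, ABSOLUTE_SYSTEM_ID))
```

With the relative form, the parser ends up asking for
http://example.org/site/DTD/xhtml1-transitional.dtd, which exists only if you
put a copy there yourself.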

With an SGML-based validator like James Clark's SP (which is used by the
validator.w3.org and the HTMLhelp online validators), the PUBLIC identifier is
all that's needed -- the validator looks up "-//W3C//DTD XHTML 1.0
Transitional//EN" (or whatever you requested) in a local "catalog" file,
retrieves a local copy of the corresponding DTD, then validates the document
against it.
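The catalog lookup amounts to a table from PUBLIC identifiers to local files.
The following is only a sketch of the idea, not SP's actual catalog format or
code; the CATALOG entries, the local paths, and the function names are all
hypothetical:

```python
import re
from typing import Optional

# Hypothetical catalog: Formal Public Identifier -> local copy of the DTD.
CATALOG = {
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "dtds/xhtml1-transitional.dtd",
    "-//W3C//DTD XHTML 1.0 Strict//EN": "dtds/xhtml1-strict.dtd",
}

# Matches <!DOCTYPE name PUBLIC "fpi" "optional system id">.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+\w+\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>'
)


def public_identifier(document: str) -> Optional[str]:
    """Extract the PUBLIC identifier from a DOCTYPE declaration, if any."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        return None
    # Collapse runs of whitespace so line-wrapped identifiers still match.
    return " ".join(match.group(1).split())


def local_dtd(document: str) -> Optional[str]:
    """Look up the local DTD path for a document; None if not in the catalog."""
    fpi = public_identifier(document)
    return CATALOG.get(fpi) if fpi is not None else None


sample = '''<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "DTD/xhtml1-transitional.dtd">
<html></html>'''

print(local_dtd(sample))
```

Note that the SYSTEM identifier in the sample is never consulted: the lookup
succeeds on the PUBLIC identifier alone, which is why the relative path is
harmless to an SGML-based validator.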

There's a big need for something like the "Formal Public Identifier" (the PUBLIC
"..." part, a carryover from SGML) for XML.  Right now, XML glosses over FPIs in
favor of SYSTEM identifiers -- the actual physical addresses of the DTDs.  That
means an XML-based validator has to go to the W3C site to download the DTD each
time.  The alternative would be some sort of cache mechanism, but nothing about
that is standardized at this point (for example, how do you tell it when to
update the cache?).

In general, use the latter version, the one with the full SYSTEM URL to the
DTD -- the one created by Tidy -- for now.  But be prepared for XML validators
to take longer than SGML validators to validate your documents, since the XML
validators will have to actually fetch the DTDs from the W3C site.

/Jelks
Received on Thursday, 22 June 2000 17:27:26 GMT
