Re: Using your own DTD (was Re: %flow and headers and address)

Earl Hood (ehood@isogen.com)
Mon, 30 Sep 1996 15:15:21 -0500


Message-Id: <199609302015.PAA09883@bonk.isogen.com>
To: galactus@htmlhelp.com (Arnoud "Galactus" Engelfriet)
cc: www-html@w3.org
Subject: Re: Using your own DTD (was Re: %flow and headers and address) 
In-reply-to: Your message of "Mon, 30 Sep 1996 20:00:12 +0200."
             <sqAUy4uYO98S089yn@htmlhelp.com> 
Date: Mon, 30 Sep 1996 15:15:21 -0500
From: Earl Hood <ehood@isogen.com>

> > if it can, such that any other client could implement any or all
> > of it rationally, and any provider could include any combination
> > in a document instance and pass that to a validator.
> 
> Interesting side-note: suppose I write my documents to adhere to
> such a non-official DTD. How can I pass them to a validator if
> that validator does not have that DTD available? Can I use the
> DOCTYPE declaration to point to the DTD (assuming I put it on the Web)?

Yes.  The public identifier (and/or system identifier) identifies
the external subset of the document type declaration.

> 
> I've seen many documents with
> <!DOCTYPE HTML PUBLIC "-//IETF/DTD HTML 3.0//" "html.dtd">
> which is obvious incorrect,

No, it is not.  The declaration is perfectly valid.

> but does the last bit imply you can
> provide an URL to your own DTD to be used?

Yes.

With SGML, an external entity can be identified with a public
identifier, a system identifier, or both.  With a public identifier,
the processing system has to map the public identifier to a system
identifier (i.e., the actual storage object).  The system id can be
a file, a URL, an SQL query, etc.  The mapping is normally defined by
an entity catalog, which tells the system which system identifier
corresponds to each public identifier.
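
A rough Python sketch of the catalog lookup described above.  The
one-entry-per-line catalog syntax, the identifiers, and the function
names are all made up for illustration; a real processing system
defines its own catalog format.

```python
import shlex

# Hypothetical catalog text: each entry maps a public identifier
# to a system identifier (the storage location).
CATALOG_TEXT = '''
PUBLIC "-//Example Org//DTD Sample HTML//EN" "http://example.org/sample.dtd"
PUBLIC "-//Example Org//ENTITIES Extra Symbols//EN" "/usr/local/sgml/extra.ent"
'''

def parse_catalog(text):
    """Return a dict of public identifier -> system identifier."""
    mapping = {}
    for line in text.splitlines():
        tokens = shlex.split(line)  # honours the double-quoted literals
        if len(tokens) == 3 and tokens[0].upper() == "PUBLIC":
            mapping[tokens[1]] = tokens[2]
    return mapping

def resolve_public(public_id, catalog):
    """Map a public identifier to a system identifier, or None if unmapped."""
    return catalog.get(public_id)
```

A public identifier with no catalog entry simply yields None here;
what a real system does in that case is up to the implementation.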

If just a system id is specified, then the processing system uses
it to find the entity.  Since this leads to portability problems,
some systems allow the mapping of a system id to another system id.
BTW, public identifiers should be used when a document is to be
transmitted to, or processed by, other systems.

When both are defined, which one takes precedence is implementation
defined.  SP allows you to specify which identifier takes precedence.
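
To make the precedence question concrete, here is a small Python
sketch.  It is not how SP or any other parser does it; the function
name, the prefer_public flag, and the two mapping tables are
assumptions chosen for illustration.

```python
def resolve_entity(public_id=None, system_id=None, catalog=None,
                   system_map=None, prefer_public=True):
    """Pick a storage location for an external entity.

    catalog     maps public identifiers to system identifiers.
    system_map  optionally remaps one system identifier to another.
    Both tables and the prefer_public switch are hypothetical.
    """
    catalog = catalog or {}
    system_map = system_map or {}
    # Location obtained through the public identifier, if it is mapped.
    mapped_public = catalog.get(public_id) if public_id else None
    # Location obtained through the system identifier, remapped if listed.
    mapped_system = system_map.get(system_id, system_id) if system_id else None
    if prefer_public:
        return mapped_public or mapped_system
    return mapped_system or mapped_public
```

With prefer_public set, a catalog entry wins over the system
identifier written in the document; with it cleared, the system
identifier is used and the catalog is only a fallback.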

Hence, if you publish a document on the web that does not adhere
to a standard DTD, you can use something like one of the following
doctype declarations:

<!DOCTYPE HTML SYSTEM "http://myorg.org/html.dtd">
<!DOCTYPE HTML PUBLIC "-//MyOrg//DTD HTML 3.0 Variant//EN">
<!DOCTYPE HTML PUBLIC "-//MyOrg//DTD HTML 3.0 Variant//EN"
		      "http://myorg.org/html.dtd">

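For anyone who wants to pull the identifiers out of declarations
like these, the following Python sketch handles the three forms shown
above.  It is a simplification: it assumes double-quoted literals and
no internal subset, and the function name is made up.

```python
import re

# Matches SYSTEM "sysid", PUBLIC "pubid", or PUBLIC "pubid" "sysid".
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<name>\S+)\s+'
    r'(?:PUBLIC\s+"(?P<public>[^"]*)"(?:\s+"(?P<system1>[^"]*)")?'
    r'|SYSTEM\s+"(?P<system2>[^"]*)")'
    r'\s*>',
    re.IGNORECASE,
)

def parse_doctype(decl):
    """Return the document type name and identifiers, or None if no match."""
    m = DOCTYPE_RE.search(decl)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "public": m.group("public"),
        # The system identifier may follow either keyword.
        "system": m.group("system1") or m.group("system2"),
    }
```
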
Note: Formal system identifiers should be used for system identifiers
      so the processing system can adequately determine how to
      resolve the system identifier.  Example:

    <!DOCTYPE HTML SYSTEM "<url>http://myorg.org/html.dtd">

      The "<url>" tag is added to signify to the parser that the
      system identifier is to be treated as a URL.
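
      As an illustration only, a processing system could separate
      the leading tag from the rest of the identifier roughly as in
      this Python sketch.  It handles just a single bare tag such as
      "<url>"; the function name is hypothetical, and the full formal
      system identifier syntax is richer than this.

```python
import re

# A leading "<name>" tag followed by the location string.
FSI_TAG_RE = re.compile(r'^<(?P<tag>[A-Za-z]+)>(?P<rest>.*)$', re.DOTALL)

def split_system_id(system_id):
    """Return (storage type, location); storage type is None if untagged."""
    m = FSI_TAG_RE.match(system_id)
    if m is None:
        return (None, system_id)
    return (m.group("tag").lower(), m.group("rest"))
```

      An untagged identifier comes back with a storage type of None,
      leaving the system to guess how to interpret it.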


In the long run, you would like to avoid specifying system
identifiers in your document, because when the external entities it
references move, you must update the document.  Since entity mappings
can be specified outside the document source, ideally you only need
to update the mapping.  With respect to the WWW, when you serve your
document to a client, you would also serve your entity catalog so the
client will know how to resolve any external entities.

Groups have already been working on this issue (for general SGML
document delivery on the WWW), so hopefully something will become
available to the masses in the near future.

	--ewh