
Re: [XHTML] document type

From: Roland Merrick <roland_merrick@uk.ibm.com>
Date: Thu, 31 Jul 2008 16:35:44 +0100
To: Shane McCarron <shane@aptest.com>
Cc: public-xhtml2@w3.org
Message-ID: <OF25E5C5F1.FDA0F449-ON80257497.00534CF4-80257497.0055AD1F@uk.ibm.com>
Greetings Shane, perhaps we can kill more than one bird with one stone 
here.  It would be really nice to simplify all the gorp at the front of a 
document that an author has to create as well as adding some features.

I am taken by your suggestion of using @version and enhancing it to allow
CURIEs. I rather like the idea of being able to specify:

<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML11">

rather than:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

The CURIE XHTML11 would allow the original gorp, and anything else that 
might be of interest to processors to be determined.

It is a pity we cannot make it simpler still, unless someone has a good 
idea. . .

I know you have told me that at present the value of @version is fixed,
but I am trying to explore the possibilities. Perhaps it could only be
introduced with a future version of XHTML.

@version could also take a list of CURIEs which would satisfy one of my 
goals stated at the start of this thread.
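
To make the idea concrete, here is a minimal Python sketch of how a processor might recover the original DOCTYPE identifiers from a version attribute that holds a whitespace-separated list of tokens. The token names (XHTML11, XHTML10T) and the lookup table are assumptions made for illustration; no XHTML specification defines them.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: short token -> (public identifier, system identifier).
KNOWN_VERSIONS = {
    "XHTML11": ("-//W3C//DTD XHTML 1.1//EN",
                "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"),
    "XHTML10T": ("-//W3C//DTD XHTML 1.0 Transitional//EN",
                 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"),
}

def declared_types(document: str):
    """Return the DOCTYPE identifiers for every recognised token in version."""
    root = ET.fromstring(document)
    # A missing attribute is treated as an empty list of tokens.
    tokens = root.get("version", "").split()
    return [KNOWN_VERSIONS[t] for t in tokens if t in KNOWN_VERSIONS]

doc = '<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML11 XHTML10T"/>'
for public_id, system_id in declared_types(doc):
    print(public_id, system_id)
```

Unrecognised tokens are skipped rather than rejected, so a document could list more languages than any one processor knows about.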

Regards, Roland

From: Shane McCarron <shane@aptest.com>
To: Roland Merrick/UK/IBM@IBMGB
Date: 26/06/2008 15:33
Subject: Re: [XHTML] document type

Roland started this thread, but I had been typing for some time already.
As I see it, there are a variety of requirements for supporting document
type announcement:

1. Authors want to declare what language (or language version) they
   wrote their document in, to help ensure portability and consistency.
   Roland points out there is an ancillary requirement to say that a
   document is valid in a variety of (presumably similar) languages.
2. Tool vendors need to know what version was used so they can
   correctly process the contents in the face of evolving languages.
3. User agents use the announcement as a way of deciding which parser /
   rendering engine to use.
4. Validation technologies cannot validate a document without knowing
   its type.
5. Servers use a combination of request headers and document types to
   determine what version of a document to send to a requestor.

In some of our most recent specifications, we have stepped away from
requiring a DOCTYPE statement at the top of every document instance.  The
reasons for this are confusing, but seem to be based on the assumption
that the traditional use of DOCTYPE is tied to DTDs, and DTDs are bad.

I could argue that the DOCTYPE declaration is just that, a declaration (as 
in positive assertive statement), and in the real world satisfies all of 
the above requirements.  But let's take a step back.  Is there a way we 
can satisfy these requirements using some other mechanism?  And is it 
possible to do it at the protocol level, the document level, and possibly 
the DOM level?  Let's explore some possibilities:

Use DOCTYPE as it stands
DOCTYPE is a declaration mechanism, and it clearly conveys exactly what 
grammar was used when creating a document.  However, it is somewhat 
archaic, and there is a perception that it is tied to DTDs.  Is this a bad 
thing?  Maybe.  Also, if your markup language doesn't have a DTD (XHTML 2 
might not) then it is misleading at best.

Use DOCTYPE but do something clever
We could continue to use the DOCTYPE, but have the SYSTEM portion point to
a generic URI instead of to a URI that maps directly to a DTD.  Then a
client that actually CARED about the SYSTEM portion (validators are the
only ones I know of) could use content negotiation to get a schema they
understand (e.g., XML DTD, XML Schema, RelaxNG, NVDL... whatever).
Alternatively, we could use DOCTYPE but tell people who care to rely upon
an RDF document to map the PUBLIC identifier to an appropriate schema.
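
As a rough illustration of the content-negotiation variant, the Python sketch below builds (but does not send) an HTTP request for a schema, listing the formats a client understands in an Accept header. The generic URI is hypothetical, and the media types chosen for each schema language are assumptions of this sketch, not something the proposal specifies.

```python
import urllib.request

# Hypothetical generic URI that the SYSTEM identifier would point at.
GENERIC_SYSTEM_URI = "http://example.org/MarkUp/xhtml11"

# Schema formats this client understands, with a preference weight each.
# The media type used for each format is an assumption of this sketch.
PREFERRED = [
    ("application/relax-ng-compact-syntax", 1.0),
    ("application/xml-dtd", 0.5),
]

def build_schema_request(uri, preferred):
    """Build a request whose Accept header ranks the schema formats."""
    accept = ", ".join(f"{media_type};q={q}" for media_type, q in preferred)
    return urllib.request.Request(uri, headers={"Accept": accept})

req = build_schema_request(GENERIC_SYSTEM_URI, PREFERRED)
print(req.get_header("Accept"))
```

A server behind the generic URI would then choose which representation to return; clients that never dereference the SYSTEM identifier are unaffected.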

Use attributes on the root element
It is possible that a parser could drill down into the root element of a
document to ascertain its document type.  All XHTML family document types
(except XHTML 1.0) have a version attribute on the html element.  Each has
a unique value.  The values are sort of arbitrary right now, but they are
unique.  A clever combination of the default namespace declaration and the
version attribute could help a processor know exactly what it was
processing.  We could even add @version to XHTML 1.0 when we update it in
the near future.  Finally, I suppose we could extend the definition of
@version to address some of the other requirements.
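
A minimal Python sketch of that drill-down, assuming the document is well-formed XML: it reads the namespace and the version attribute from the root element and returns them as a pair. The function name and the sample version value are illustrative only.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def identify(document: str):
    """Return (namespace, version) taken from the root element."""
    root = ET.fromstring(document)
    # ElementTree spells a namespaced tag as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    else:
        namespace = None
    return namespace, root.get("version")

doc = ('<html xmlns="http://www.w3.org/1999/xhtml" '
       'version="-//W3C//DTD XHTML 1.1//EN"><head><title>t</title></head>'
       '<body/></html>')
print(identify(doc))
```

A processor could key its choice of parser or validator on the returned pair, falling back to other hints when the version is missing.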

Use attributes and the magic of RDF
Permit CURIEs in the @version attribute on the root element, and define a
mapping from some specific CURIE vocabulary space we control to grammars
we define.  Allow others to define their own mappings. This might be a
better solution than just using @version as it stands.
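
The CURIE mapping could work roughly as in the Python sketch below, which expands each prefix:reference token into a full URI by concatenation. The "xv" prefix, its URI, and the reference names are invented for this example; a real vocabulary would have to be defined by whoever controls the mapping.

```python
# Hypothetical prefix mapping; the "xv" prefix and its URI are invented here.
PREFIXES = {"xv": "http://example.org/xhtml/versions#"}

def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand prefix:reference into a full URI by simple concatenation."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand CURIE: {curie!r}")
    return prefixes[prefix] + reference

def expand_version(value: str, prefixes: dict):
    """Expand every whitespace-separated CURIE in a version attribute value."""
    return [expand_curie(token, prefixes) for token in value.split()]

print(expand_version("xv:xhtml11 xv:xhtml-rdfa", PREFIXES))
```

Each resulting URI could then identify an RDF description that points at the matching grammar, which is where third parties could plug in their own mappings.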

Use media type / meta to indicate type and subtype via profile parameter
The application/xhtml+xml media type can have a profile parameter.  We
could use that parameter to specify one or more grammars that a document
adheres to.  We could use the same values in the meta http-equiv in the
head so that documents not coming via HTTP could still be interpreted.  I
think this is a mistake because it defers the decision to a higher level
protocol, and lots of our customers won't have access to that.
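
For comparison, a receiver could read the profile parameter along the lines of the Python sketch below, which pulls a space-separated list of profile URIs out of a Content-Type value. The profile URIs are placeholders, and treating the parameter as a space-separated list is an assumption of the sketch.

```python
from email.message import Message

def profiles_from_content_type(header_value: str):
    """Return the profile URIs carried by an application/xhtml+xml header."""
    msg = Message()
    msg["Content-Type"] = header_value
    if msg.get_content_type() != "application/xhtml+xml":
        return []
    # get_param returns None when the parameter is absent.
    profile = msg.get_param("profile")
    return profile.split() if profile else []

header = ('application/xhtml+xml; '
          'profile="http://example.org/profile/a http://example.org/profile/b"')
print(profiles_from_content_type(header))
```

The same function could be fed the content of a meta http-equiv element, which is the fallback suggested for documents that do not arrive over HTTP.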
Shane P. McCarron                          Phone: +1 763 786-8160 x120
Managing Director                            Fax: +1 763 786-8180
ApTest Minnesota                            Inet: shane@aptest.com

Received on Thursday, 31 July 2008 15:36:41 UTC
