- From: David Megginson <david@megginson.com>
- Date: Mon, 1 Mar 1999 22:03:17 -0500 (EST)
- To: www-html-editor@w3.org
I would advise strongly against moving XHTML to PR in its current
form, because the WD seems to have fundamentally missed the potential
advantages of namespaces and self-identifying resources.
There are three fundamental flaws that need to be fixed before the
Director should reasonably consider moving this specification to PR:
1. The DOCTYPE declaration should not be required (Section 3.1, item
4), and if present, should be used only for (optional) DTD
validation -- the public ID in the DOCTYPE declaration is simply an
indirect pointer to an external DTD subset, not a unique identifier
for a document type. It was required for HTML 4.0 as a messy
kludge only because there was nothing better available; we have
namespaces now.
2. There should not be three separate HTML namespaces (Section 3.1,
item 3) -- the namespace is part of a unique identifier, not a DTD.
Unless the semantics (not the content model) change in the future,
an HTML 'cite' element should always have the same namespace, no
matter what particular DTD is in use.
3. Strict DTD conformance should be allowed but not required (Section
3.1, item 1); the use of namespaces will allow a processor to
determine what is and is not part of XHTML -- requiring strict DTD
conformance simply makes it impossible to add extensibility later,
and clean extensibility is really the only justification for XHTML
in the first place.
Further Discussion
------------------
For #1, I recommend making the DOCTYPE declaration optional because it
is really not necessary for non-validating XML processing and only
complicates things unnecessarily.
For #2, consider the following use case: I want to ask a search engine
to find every instance of the word "wind" within an HTML <cite>
element (note that I don't want to find "wind" within other <cite>
elements, only within HTML's). With your current setup, I would have
to instruct the search engine to find every instance of "wind" within
{http://www.w3.org/Profiles/xhtml1-strict.dtd}cite *or*
{http://www.w3.org/Profiles/xhtml1-transitional.dtd}cite *or*
{http://www.w3.org/Profiles/xhtml1-frameset.dtd}cite, and when the next
version comes out, I will need to add three (or more) other namespaces
to the list, etc.
There should be a single, unique namespace for HTML that is persistent
across versions to enable search engines and other similar software to
function efficiently. To avoid confusion, it would also be best not
to include DTD files as part of the namespace URLs, since namespaces
are not schemas.
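To make the cost concrete, here is a rough sketch (not a real search engine) of the lookup described above, using Python's standard xml.etree.ElementTree. The three namespace names are the ones quoted above; the sample document, the "http://example.org/other" namespace, and the function name find_in_cite are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The three namespace names quoted above from the WD.
WD_NAMESPACES = [
    "http://www.w3.org/Profiles/xhtml1-strict.dtd",
    "http://www.w3.org/Profiles/xhtml1-transitional.dtd",
    "http://www.w3.org/Profiles/xhtml1-frameset.dtd",
]

def find_in_cite(xml_text, word, namespaces):
    """Return the text of every 'cite' element, in any of the given
    namespaces, whose text content contains `word`."""
    root = ET.fromstring(xml_text)
    hits = []
    for ns in namespaces:
        # ElementTree spells a qualified name as "{namespace}localname".
        for cite in root.iter("{%s}cite" % ns):
            text = "".join(cite.itertext())
            if word in text:
                hits.append(text)
    return hits

# Hypothetical sample: one HTML <cite> and one <cite> from an unrelated
# (invented) vocabulary that must not match.
SAMPLE = """<doc xmlns:h="http://www.w3.org/Profiles/xhtml1-strict.dtd"
     xmlns:o="http://example.org/other">
  <h:cite>Gone with the wind</h:cite>
  <o:cite>wind tunnel report</o:cite>
</doc>"""

matches = find_in_cite(SAMPLE, "wind", WD_NAMESPACES)
```

The caller has to pass the whole list, and the list grows with every new DTD; with a single persistent HTML namespace, the list would have exactly one entry and would never need to change.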
For #3, there is no reason to be arbitrary and restrictive; instead,
you simply need a set of rules governing how processors should react
to non-HTML (or unrecognised) element and attribute types. One
(hastily-thought-out) suggestion follows.
Suggestion
----------
Current HTML processor behaviour is as follows:
- for unrecognised attribute types, the attribute should be ignored;
- for unrecognised element types, the element's contents are processed
as part of the parent element.
The attribute rule is fine for XHTML as well; you could refine the
element rule by adding a special attribute (say, 'html:default') with
allowed values along the lines of 'skip' or 'process' ('process' would
be the default) specifying what a processor should do if it does not
recognise an element type. In other words, if the processor found
<p>aaa <x:y html:default="process">zzz</x:y> bbb</p>
and it didn't recognise <x:y>, it would treat this as if it read
<p>aaa zzz bbb</p>
On the other hand, if it found
<p>aaa <x:y html:default="skip">zzz</x:y> bbb</p>
and it didn't recognise <x:y>, it would treat this as if it read
<p>aaa bbb</p>
The default value would be 'process', so the first case would also
apply for
<p>aaa <x:y>zzz</x:y> bbb</p>
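The rule above could be sketched roughly as follows, using Python's standard xml.etree.ElementTree. This is only an illustration under several assumptions: the namespace names (HTML_NS and "http://example.org/x") are invented, the set of recognised element types contains only 'p', the function returns plain text rather than a processed tree, and any value other than 'skip' is treated as 'process'.

```python
import xml.etree.ElementTree as ET

# Assumption: a single, made-up HTML namespace name for illustration.
HTML_NS = "http://example.org/hypothetical-html"
DEFAULT_ATTR = "{%s}default" % HTML_NS

# Assumption: this toy processor recognises only the HTML 'p' element.
KNOWN = {"{%s}p" % HTML_NS}

def flatten(elem, known=KNOWN):
    """Return the text a processor would see inside `elem`, applying the
    suggested 'html:default' rule to unrecognised child elements."""
    parts = [elem.text or ""]
    for child in elem:
        mode = child.get(DEFAULT_ATTR, "process")  # 'process' is the default
        if child.tag in known or mode != "skip":
            # Recognised, or unrecognised with 'process': contents are
            # handled as part of the parent element.
            parts.append(flatten(child, known))
        # Unrecognised with 'skip': the element and its contents are dropped.
        parts.append(child.tail or "")
    return "".join(parts)

def sample(attr):
    """Build one of the three example paragraphs; `attr` is the attribute
    text to place on <x:y> (possibly empty)."""
    return ('<p xmlns="%s" xmlns:html="%s" xmlns:x="http://example.org/x">'
            'aaa <x:y%s>zzz</x:y> bbb</p>' % (HTML_NS, HTML_NS, attr))

processed = flatten(ET.fromstring(sample(' html:default="process"')))
skipped = flatten(ET.fromstring(sample(' html:default="skip"')))
defaulted = flatten(ET.fromstring(sample('')))
```

Note that in the 'skip' case the white space on both sides of the dropped element survives, so a literal implementation yields two spaces between "aaa" and "bbb"; a real rule would need to say whether that is normalised.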
All the best,
David
--
David Megginson david@megginson.com
http://www.megginson.com/
Received on Monday, 1 March 1999 22:04:37 UTC