[whatwg] HTML 5 vs. XHTML 2.0 from Laurens Holst on 2004-11-14 (public-whatwg-archive@w3.org from November 2004)

From: Laurens Holst <lholst@students.cs.uu.nl>
Date: Mon, 15 Nov 2004 00:58:53 +0100
Message-ID: <4197F13D.4010103@students.cs.uu.nl>
Matthew Thomas wrote:
> What Henri said. If long-term fidelity is important, HTML should be  
> something you convert to, not your native format.  
> <http://diveintomark.org/archives/2003/01/13/semantic_obsolescence>

I don't see why I can't use HTML (+ XHTML 2.0) as the base for my 
documents. It is a well-known, well-defined markup standard that 
everybody knows, and it can easily be viewed by everyone using a tool 
they all have: a browser. I have never heard of DocBook.

Of course stuff like XHTML 2.0 links won't work in currently existing 
browsers (the a tag is out in XHTML 2.0 in favour of an attribute which 
can turn any element into a link), but that's where the use of HTML 5 
would come in; with HTML 5, I could still use 'a' tags for them.


>> ...
>> I don't think this is a spec just for 'the ignorant mass'. A spec  
>> aimed at them can hardly be taken seriously, because it will take a  
>> lot to make them learn.
> 
> There are more of them than there are of you and me, and we benefit  
> from their documents on the Web. (You might reminisce about the days  
> before GeoCities and Xanga and Time Cube, but those were also the days  
> before Google and eBay and Wikipedia.)

Do you think 'they' will notice any spec at all? I very much doubt it.

No, they will just go on using the <font> tags they learned about 10 
years ago.


>> ...
>> Anyway, let me stress again that for 'HTML 5' I am highly in favour 
>> of  adopting XHTML 2.0 with the unused HTML 4.01 tags marked 
>> 'deprecated'  (this is an important difference from XHTML 2.0 which 
>> removes them  altogether), and perhaps some additions. Because XHTML 
>> 2.0 is  definitely a more serious markup language.
> 
> The Web is not, and since about 1995 has not been, a serious medium. It  
> is much more often used for selling books than for publishing them, for  
> simulating sex than for discussing it, and for posting opinions than  
> for posting facts. For the Web's pockets of seriousness you might use  
> XHTML 2.0, but XHTML 2.0 is rather primitive; why not use TEI P4 instead?

I have never heard of TEI P4, and I doubt it will ever get UI support.

All I see is that XHTML 2.0 is an existing standard with some nice 
improvements over HTML 4.01, and a lot of overlap with issues discussed 
for 'HTML 5'.


>> And I'd say HTML 5 being compatible with XHTML 2.0 is a great merit  
>> for both.
> 
> Maybe, but backward compatibility is expressly a design goal of HTML 5  
> <http://www.whatwg.org/charter#back-compat>, while it is expressly not  
> a design goal of XHTML 2.0  
> <http://www.w3.org/TR/2004/WD-xhtml2-20040722/ 
> introduction.html#backCompat>. Such divergent processes are unlikely to  
> produce the same result.

I think that by including XHTML 2.0, and deprecating the tags not in it, 
you can already get a long way. One would have to look a little more 
into this, but I don't think there are many conflicts.

Backwards compatibility can be achieved by using a number of the 
deprecated tags, and styling some of the new tags. However, a new spec 
inevitably adds new tags and new ideas. The page author then has a 
choice: use the new tags, or be 100% backwards compatible, or do 
something in between which works satisfactorily (the last one is the 
most logical to me).


>> I'm getting the impression that we are here discussing much that has  
>> already been through thoroughly on the XHTML 2.0 working group.
> 
> Probably, though for the compatibility reason given above, our  
> conclusions may often be different.

In the case of quote I see a similar conclusion. Or well - it may not be 
a conclusion yet, but I think using a new <quote> tag instead of <q> is 
probably the best solution :).


>> For example the quotes thing - in XHTML there's no <q> anymore but  
>> there's <quote>, a choice very likely made because of the exact same  
>> concerns raised overhere (being inconsistency between <q>  
>> functionality, which a new tag would solve). Or removing <acronym>,  
>> <big> and <small>. The accesskey functional choice they made sounds  
>> pretty decent, from what I hear here. <var> is used (just maybe not 
>> by  you, but I have several times),
> 
> I use <var> whenever appropriate, which is about once a year, but I  
> recognize that it is unlikely ever to have any semantic usefulness  
> (because variable names aren't unique enough).

Perhaps... In any case, if you're going to use XHTML 2.0 as a basis you 
would have to copy that tag as well. It won't really hurt anyone to 
leave it in (even though you may perhaps hardly have used it), and I 
hardly think it's an argument against.


> I use <q> much more  
> often, and I will weep hot tears if/when it is abolished, but I  
> recognize it is a poorly-supported, backward-incompatibly-confusing  
> element, with hardly any semantic usefulness, and an uneasy  
> relationship with English punctuation (except in en-GB-hixie and  
> similar dialects).

I don't see anything complex in the way <q> works. Creating a new 
<quote> tag... ah, well, we've been through this before.


>> and the functionality of <cite> is greatly enhanced, making it a much  
>> more useful tool,
> 
> My list of deprecable items included cite=, not <cite>. cite= is mostly  
> useless for three reasons. First, it's invisible, so authors don't use  
> it, so it can't be relied on or aggregated usefully. Second, it expects  
> a URI, but the cited material isn't necessarily represented online  
> <http://lists.w3.org/Archives/Public/www-html/2003May/0214.html>.

I may be mistaken, but can't a URI be anything, as long as there's a 
scheme for it? Isn't there a URI for ISBN? (If not, I think there 
should be.) And there is that ISSN thing as well, although I don't 
really know how that works :).
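
To illustrate the point about schemes, here is a minimal sketch in Python 
(my choice of language, nothing the spec prescribes). It only shows that a 
generic URI parser accepts a non-network scheme such as urn:isbn: just as 
readily as http:. The function name describe_citation, the list of 
"online" schemes and the ISBN value are all assumptions made up for the 
example.

```python
from urllib.parse import urlparse

# Schemes treated as "retrievable online" for this sketch only (an assumption).
ONLINE_SCHEMES = ("http", "https", "ftp")


def describe_citation(uri):
    """Split a cite= value into its scheme and the rest of the identifier."""
    parts = urlparse(uri)
    # For urn:isbn:... there is no network location, so everything after
    # the scheme ends up in .path; for http:... it is netloc + path.
    identifier = parts.netloc + parts.path
    return {
        "scheme": parts.scheme,
        "online": parts.scheme in ONLINE_SCHEMES,
        "identifier": identifier,
    }


# Hypothetical values: a book cited by ISBN, and an ordinary Web page.
book = describe_citation("urn:isbn:0-395-36341-1")
page = describe_citation("http://www.w3.org/TR/xhtml2/")
print(book)
print(page)
```

So a cite value can name a printed book without the book being 
represented online; whether a UA can do anything useful with it is a 
separate question.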


> Third, it doesn't relieve authors from having to provide text citations  
> before/after the quote as well (if they didn't, the text would be  
> nonsensical in hypertext-less media such as printouts or telephone  
> conversations).

I have yet to see the first person to have a telephone conversation in 
HTML :).

And the <quote> element is exactly for that purpose. Providing text 
citations(/quotes).

The specification of cite as an attribute, which can be put on many 
elements (not just <cite>), looks pretty ok:
http://www.w3.org/TR/xhtml2/mod-hyperAttributes.html#adef_hyperAttributes_cite

UAs could implement it by underlining it like abbr and offering a 
'follow citation link' option in their context menu. Or web site 
designers could create/use a small piece of JavaScript which displays a 
'follow link' box, perhaps even a box with the link and a description 
which is shown when hovering over the quote.
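
Either approach starts by finding the elements that carry a cite 
attribute. As a rough stand-in outside the browser, the sketch below does 
that step in Python with the standard html.parser module. It is not how 
any UA implements it; the class name CiteCollector, the helper 
collect_citations and the sample markup are invented for the example.

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Collect (tag name, cite value) pairs from any element with cite=."""

    def __init__(self):
        super().__init__()
        self.citations = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "cite" and value:
                self.citations.append((tag, value))


def collect_citations(markup):
    """Return every cite= value found in the markup, in document order."""
    parser = CiteCollector()
    parser.feed(markup)
    parser.close()
    return parser.citations


# Hypothetical sample markup, not taken from a real page.
sample = (
    '<p>As <quote cite="urn:isbn:0-395-36341-1">someone wrote</quote>, '
    'and as <q cite="http://example.org/post">was said elsewhere</q>.</p>'
)
found = collect_citations(sample)
for tag, target in found:
    print(tag, "->", target)
```

A context menu entry or a hover box would then only have to present the 
collected target for the element under the pointer.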


>> which can also be employed for data mining (I've seen a similar thing  
>> on dive into mark's blog once iirc).
>> ...
> 
> You mean posts by citation  
> <http://diveintomark.org/archives/2002/12/27/pushing_the_envelope>. I  
> hope "Hixie said I was using [<cite>] correctly"  
> <http://diveintomark.org/archives/2003/01/19/influences> was an  
> over-broad interpretation of Ian's words, because (a) Ian has mentioned  
> "'clarifying' the definition of <cite>"  
> <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2004-November/ 
> 002329.html>, and (b) while Mark's uses of <cite> matched the example  
> given in the HTML 4.01 spec  
> <http://www.w3.org/TR/REC-html40/struct/text.html#edef-CITE>, they did  
> not match the default presentation in all visual UAs, nor the resultant  
> use by most Web authors.
> 
> (Specifically, I think the most coherent and backward-compatible  
> "clarification" would be to restrict <cite> to titles of works, because  
> inviting authors to use it for names of people as suggested in the HTML  
> 4.01 example would require authors to override <cite>'s italic-ness  
> frequently, making them more likely to abandon the element completely.)

Actually, in the cases where I used cite for that purpose, italics were 
exactly how I intended them to be rendered.

Example:
"<p>On a side note, it seems that <cite>fantasai</cite> is getting
really busy with the alternate style sheet switcher (at least I'm
seeing a fair lot of activity on the bugs involved), so hopefully by
the time Firefox 1.0 gets released it will be back in. And perhaps we
will even see persistent style switching, though I wouldn't count on
it.</p>"

Did just what I wanted it to.


Another thing I just spotted in the XHTML 2.0 spec is xml:base. 
http://www.w3.org/TR/xhtml2/mod-hyperAttributes.html#adef_hyperAttributes_xml:base 
That's a pretty nice thing as well. Though I wouldn't know how to port 
an attribute from another namespace to HTML (can SGML have colons in 
its tag names?). Maybe just skip it for the HTML version of 'HTML 5' 
and only have it in the XML version. Ah well.


~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!
Received on Sunday, 14 November 2004 15:58:53 UTC