- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Sun, 15 Feb 2004 18:58:15 +0200 (EET)
- To: www-html@w3.org
On Fri, 13 Feb 2004, Christoph Päper wrote:

> *Jukka K. Korpela*:
> > On Thu, 12 Feb 2004, Ernest Cline wrote:
> >
> >> The problem is tho, support for transclusion is extremely limited at
> >> present.
> >
> > Yes and no - there's the SGML way that always was formally part
> > of HTML but was never supported,
>
> Are you speaking about entities?

Yes. The point is that if browser vendors had wanted to implement it,
they could have done so and for once claimed conformance to
specifications. Remembering that useful features even simpler than that
have remained unimplemented, I'm sceptical about any new features that
are essentially more complex.

In fact, I think HTML specifications should be augmented with what so
many people want: simple include. It can be handled at a different level
(preprocessing or server processing), but when added to HTML in a simple
way, it would be useful in many cases, harmless in others:

<include src="...">
alternate content for user agents that do not support include,
such as a link
</include>

(defined as simple, "seamless" inclusion).

> I wish they could be defined language dependent, thus
>
>   <q lang="en">&sq;Life's a bitch. And then you die.&eq;</q>
>
> would be rendered with high-66 and high-99, whereas in
>
>   <q lang="fr">&sq;L'État, c'est moi.&eq;</q>
>
> the entity references are computed to « + thin space and thin space + ».

I think that is very descriptive of the problems. Very knowledgeable
people seem to think that quotation style should depend on the language
of the quoted text, not on the language of the content.

Moreover, the browser would need to have support for over 7 000
languages, or it would discriminate against some (most) languages and
support just some of them. The first alternative is hardly realistic.
The quotation styles have not even been _described_ adequately. Even
version 3 of the Unicode standard presented wrong examples - it showed
a French quotation without those thin spaces.
So can you expect the programmers of a normal browser to create
_correct_ support for the languages of the world?

Language support is nice to have, when we get it, if we get it
reasonably correct. But basic rendering, such as the presence of
quotation marks, should not depend on such support.

> > If you write any software that tries to recognize quotations from
> > Web pages, it would be just a theoretical exercise to play with
> > <q> or <blockquote>, and the latter would give you wrong results
> > far more often than not. Recognizing "..." would be much more
> > relevant.
>
> That's not as simple as you make it sound here, though, realizing the
> very different pairings of quotation marks throughout the Latin
> alphabet world (e.g. »...« vs. «...» vs. »...»).

I'm pretty sure that Ascii quotation marks "..." dominate over all other
quotation marks, despite not being correct punctuation in any language,
as far as I know. They are simply the de facto surrogate. Even guillemets
are not used much, although they are technically almost as safe as Ascii
quotation marks. Well, we have the IE misbehavior that it may split a
line between a right-pointing guillemet and an immediately (i.e., no
space) following word. Until leading browsers get such simple
character-level things right, I wouldn't expect them to implement
correctly anything i18n related that is essentially more complex.

My point was that at present, by recognizing "..." as a quotation, you
guess right far more often than by looking at <blockquote> and <q>. So
whatever various programs _might_ do, it's not realistic to think that
they will take any markup for quotations any more seriously than authors
have done. In special applications, such as site-specific indexing in a
well-managed authoring environment, things can be very different, but
then again, the management can tell authors to write the quotations in
a particular style at character level.

> > (The whole block vs. inline distinction is a mess, and should not be
> > carried over to any new markup elements.)
>
> <del>element</del><ins>language</ins>

Maybe so, but I was thinking about XHTML 2.0, which I would classify as
a dialect of HTML (as currently sketched, it's in practice closer to
HTML 4 than HTML 4 was to HTML 3.2, though some of the differences imply
that XHTML 2.0 documents would not display correctly on current
browsers).

Introducing <quote> as a text-level counterpart of <blockquote> means
carrying the distinction to new elements. So would <blockcode> - which
sounds like something we need, but it is simpler to allow <code> to
contain block elements.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
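
Editorial sketch: the message says that a simple include "can be handled
at a different level (preprocessing or server processing)". The Python
fragment below is one minimal way such a preprocessing step might look.
It is not part of the original message or of any HTML specification. The
function name expand_includes, the load callback and the regular
expression are assumptions made for illustration, and nested <include>
elements are not handled.

```python
import re
from typing import Callable, Optional

# Matches the hypothetical <include src="...">fallback</include> element
# proposed in the message. Nested includes are not supported by this sketch.
INCLUDE_RE = re.compile(
    r'<include\s+src="([^"]*)"\s*>(.*?)</include>',
    re.DOTALL | re.IGNORECASE,
)


def expand_includes(html: str, load: Callable[[str], Optional[str]]) -> str:
    """Replace each <include> element with the text returned by load(src).

    load is a caller-supplied function (an assumption of this sketch) that
    returns the text to include, or None when the resource is unavailable.
    In the None case the element's own alternate content is kept instead.
    """
    def substitute(match):
        src, fallback = match.group(1), match.group(2)
        included = load(src)
        return included if included is not None else fallback

    return INCLUDE_RE.sub(substitute, html)


if __name__ == "__main__":
    # A dict stands in for the file system or server in this example.
    parts = {"footer.html": "<p>Footer</p>"}
    page = ('<body><include src="footer.html">'
            '<a href="footer.html">Footer</a></include></body>')
    print(expand_includes(page, parts.get))
```

A user agent or preprocessor that does not know the element would simply
leave the alternate content (such as a link) in place, which is the
"harmless in others" case described in the message.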
Received on Sunday, 15 February 2004 11:58:18 UTC