- From: Jonny Axelsson <jonny@metastasis.net>
- Date: Tue, 04 Apr 2000 22:11:46 +0200
- To: www-html@w3.org
- Cc: www-html@w3.org
We all agree that BLINK and FONT are bad, but if XHTML is to be a rethink of HTML, we should go back to the start. A lot that made sense a decade ago isn't how it would be done today.

(*) I and B are imperfect markup, but having them is better than not having them. They do have a use.

(*) If we imagined that I and B didn't exist, EM might have been created, STRONG wouldn't. STRONG is there because B is there (I said with perfect telepathic sense. Correct me if I'm wrong, but I think Dan Connolly is the inventor of STRONG. In which case this wasn't the best of his creations).

<offtopic>
<? if you want to react to this, do it in a separate message ?>

The original messages also had these two points:

(-) Move CODE, VAR, KBD and SAMP into a separate module. While the above two can be considered polite conversation under the barrage of device upload, this one is serious and realistic. These four elements are admirably clear *for a programmer*; they are also clearly a separate module. For fairness and regularity, it probably should be an optional module. For backward compatibility and cowardice it possibly should be an obligatory module.

(-) Some *structural* markup is currently as imperfect as I and B, probably worse.
</offtopic>

As for structural vs presentational, it might be clearer to talk about device independent and device dependent markup; these concepts are roughly equivalent. The crux of your argument is that there is some necessity that I/B must be italic/boldface and italic/boldface only. It is natural to use italic/boldface in printed text, but if they are not available or proper, some other effect (or none) could be used. A CRT could use inverse text, a teletype underline, the Teletext system another colour and so on. "I" could have been short for INTENSE, and "B" for BRUTAL, and you could've used "X" and "Y" for all that I care.

The main point is that italic is largely used to represent stress in spoken language, especially emphasis of course, but not only.
Experiment: Read out loud a text with italics, then read it without italics. It will be read differently.

If the rules had been just a little more consistent than they are, I would seriously have suggested <i type="emphasis">, <i type="title">, <i type="citation"> (many structural tags were made by deconstructing italic; this scheme would have had the advantage of the "catch-all" I's without a type).

At 15:24 03.04.00 -0700, Tantek Çelik wrote:
>From: Jonny Axelsson <jonny@metastasis.net>
>Date: Mon, Apr 3, 2000, 1:55 PM

>Here are some of my tenets (working assumptions):

I should have seen this coming, as I grabbed two numbering schemes for myself. I renumber the points I'll reuse like this:

[JA:1] There are relatively clear typographical rules for when to use I[TALIC] (in languages using italic)
[JA:3] Underline is primarily "poor man's italic" (from the age of the typewriter), but is also used for special effects (like hypertext)
[JA:B] It is important to discern between representation and presentation. EM /represents/ an emphasis; it might be /presented/ using an italic font, or by having "/" on each side of the content.
[JA:D] People are inconsistent coders. No matter how structured XML becomes, you can't avoid this.
[JA:E] Automated translation to/from XML is desirable, and so is minimization of information loss in the process.

>[1]. If something is described as "typographic" or "typographical", it is
>likely to be presentational, rather than semantic or structural.

Rarely. Usually typographical rules are there to convey an idea in a regular way. The look is presentational (like which quotes to use), but the idea is structural (short story titles should be in quotes). Typographical rules are not standardized (Norwegian typographical rules are similar to but different from English rules, and the further away the language, the more different the rules) and they are not one-to-one. Still they give valuable metainformation.
And you would want the final (print) result to be presented according to typographic rules.

>[2]-[6]. [on the rottenness of word processors, and the greatness of the HTML4 + CSS combo]

I have no beef with these ones.

>And I'll use these statements in my arguments.
>[A]. It has been clearly established by W3C Recommendations that B/"bold"
>[B]. It has been clearly established by W3C Recommendations that I/"italic"
>[C]. It has been clearly established by W3C Recommendations that U/"underline"

This is the orthodoxy, and for HTML 4.0x the rules. I don't fully agree; I hope it is clear where I agree and where I disagree.

>[E]. Automatic translation of presentational documents (such as typical word
>processor documents) to/from XML documents is best done using inline styles on
>the spans of text that are styled.

A word processed document is semi-structural. I'd like to take care of the "semi", but I know I can cop out with SPAN/CLASSes as needed.

>> C. Non-HTML documents are semi-structured (as are HTML/XML documents).
>Semi-structured might as well mean unstructured. This semi-structure is
>typically ascertained by white space and styling, which can only be said to be
>presentational, and certainly not necessarily structural [2].

There *is* structure in thar documents, enough that it is worth keeping.

>> D. People are inconsistent coders. No matter how structured XML becomes,
>> you can't avoid this.
>Agreed. But it is much harder to code "tag soup" when your code must be well
>formed.

Wellformedness is immensely valuable for interoperability, but if you want your ADDRESS to have the same meaning as my ADDRESS, more than wellformedness is needed. Almost dredging up another age-old discussion, structural elements with no consensus of meaning are worse than the "presentational" elements.

>> I is used to represent a half-dozen meanings [JA:1], one of which is emphasis.
>This is backwards. "italic" is one way of styling emphasis.

Call it reverse engineering if you want.
When in a normal text a word or a phrase is italicized, it is so for a reason. Some people overuse italic, but if they are in a publishing company or in a similar role, house rules will encourage them to stick to the standards, or editors may proof the formatting.

>A better approach is to avoid presentational media-dependent tags, and to add
>new semantic tags instead, e.g. use <shiptitle> in your DTD for the above
>example, and then style them as appropriate for the audience, e.g.

It is the best alternative, and also in the general context the least realistic one (a shipping company might). It would be nice to have coding like

<person class="politician"><firstname>...</firstname> <lastname>...</lastname></person>

in a free text, but I don't know if it is possible.

>> Even the catch-all is useful, and often at the limit of what
>> authors can handle (if they don't understand when to use italic, they won't
>> understand how to use any other markup) [1CD].
>But where does it end? Do we replicate all presentational styling as markup?
>Do you propose the FONT tag mess all over again?

Some nineteenth century texts used gothic and roman typefaces for different kinds of texts, and <g> and <r> might have been suggested if HTML had been defined then. Otherwise typefaces have no semantic value, and neither does big/small (which I'm happy to see dropped from the XHTML 1.1 proposal).

>Yes, and authors should only ever use a single exclamation point (!), but
>there are certainly plenty of examples of people using double exclamation
>points (!!) or more. The reality is that there are more than just two levels
>of "em"phasis (none or some), and allowing EM EM acknowledges that.

And it gives me the opportunity to "stylesheet away" this kind of overemphasizing. Now, if there were a way to remove superfluous exclamation marks using CSS...
>HTML4 *by itself* is very poor at representing even what simple ten-year old

HTML4 by itself (with class and span) is very good at representing, but poor at presenting.

At 17:55 03.04.00 -0400, Jelks Cabaniss wrote:
>Jonny Axelsson wrote:
>> BI(TT) are in a different category. My thought about this here:
>> <http://lists.w3.org/Archives/Public/www-html/2000Feb/0250.html>
>
>I read that when you first posted it and was mystified. I just re-read it, with
>a similar reaction. Your summary:
>> ... Of all HTML elements, I and B are the only truly universal ones.
>is astounding. Terms such as Italic, Bold, Underline, Strike-through, and Font
>apply to _visual_ media, such as a printed page or your PC's screen; to braille
>and audio devices they are *meaningless*.

Non-visual presentation benefits from clear and well understood rules too. Old typewriters couldn't represent italic (they could however represent bold by overtyping; fortunately this was rarely done), and they used _underline_ instead. [JA:3]

Your argument says that since typewriters can't represent italic, italic shouldn't be used. But the mapping italic <--> underline is unambiguous, as may be any other mapping, like italic <--> female voice.
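
The per-device mapping argued for above (italic <--> underline on a typewriter, inverse text on a CRT, another voice in speech) can be sketched in a few lines of Python. This is only an illustration of the idea, not anything proposed on the list: the device names, the marker strings and the `render` function are all assumptions made up for the example.

```python
# Hypothetical table: how each output device might present I and B.
# Device names and marker strings are illustrative assumptions only.
RENDERERS = {
    "teletype": {"i": ("_", "_"), "b": ("*", "*")},
    "crt":      {"i": ("[inverse]", "[/inverse]"), "b": ("[bright]", "[/bright]")},
    "speech":   {"i": ("[female voice]", "[/female voice]"),
                 "b": ("[louder]", "[/louder]")},
}


def render(spans, device):
    """Render (tag, text) pairs for one device.

    tag is "i", "b" or None (plain text). The markup stays the same;
    only the device table decides how it is presented.
    """
    table = RENDERERS[device]
    parts = []
    for tag, text in spans:
        # Untagged text, or a tag the device has no mapping for, passes through.
        opening, closing = table.get(tag, ("", ""))
        parts.append(f"{opening}{text}{closing}")
    return "".join(parts)


sentence = [(None, "Read "), ("i", "this"), (None, " aloud")]
print(render(sentence, "teletype"))  # Read _this_ aloud
print(render(sentence, "speech"))    # Read [female voice]this[/female voice] aloud
```

The same list of spans feeds every device, which is the point: as long as each mapping is unambiguous, I and B carry over to non-italic and non-visual media.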
Received on Tuesday, 4 April 2000 16:17:25 UTC