- From: Christoph Päper <christoph.paeper@crissov.de>
- Date: Thu, 26 Apr 2007 21:20:13 +0200
- To: www-style <www-style@w3.org>
Mike Bremford:

> 1. It's generally felt that generated quotes should be copied in
>    most circumstances.
> 2. Whether list bullets should be copied is debatable, but
>    certainly depends on the receiving application. Plain text, yes. MS
>    Word or HTML, no, because what we should be doing is copying rich
>    text with styling information that says "bullet this item".
> 3. Generated newlines should generally be copied, although see (2).
> 4. No word on other generated content yet - eg h1::before
>    { content:"Chapter" counter(chapter) } - which I hope we're going
>    to see more of as CSS3 rolls out.

The semantics should be copied, not the style. How this works in practice depends on the kind of application being pasted into.

Plain text comes in two kinds, because semantics may be encoded to different degrees; e.g. sentences and their basic types are encoded in most written languages today. Hypertext is a superset of plain text and therefore inherits its micro-semantic features.

Even the simplest forms of plain text allow short quotes to be marked with leading and trailing glyphs. Thus HTML 4's |q| element type was a mistake (and perhaps the related CSS properties and values were, too).

"Advanced plain text", e.g. as used on Usenet, in e-mail and in certain wiki languages, allows for a few more character-based semantic hints which, as in traditional print, are indistinguishable from stylistic marks, e.g. /italic/, *bold* and list bullets. Clients for this kind of text should convert HTML content arriving from the clipboard, or through other methods of import, to use said marks. In most circumstances there is no commonly accepted standard for them, though. By the same means such applications might choose to substitute certain glyphs, depending on the capabilities of the technical environment.
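As a rough illustration of such a conversion, here is a minimal Python sketch built on the standard library's html.parser. The names INLINE_MARKS, PlainTextPaster and html_to_plain are hypothetical, and the choice of marks (ASCII double quotes for |q|, "- " for list items) is only an assumption, since no accepted standard exists. Note that it emits the receiving application's own quote glyphs rather than any specified by CSS.

```python
import re
from html.parser import HTMLParser

# Hypothetical mapping from inline HTML elements to plain-text marks.
# These marks are conventions, not a standard; a client may substitute others.
INLINE_MARKS = {
    "q": ('"', '"'),
    "em": ("/", "/"),
    "i": ("/", "/"),
    "strong": ("*", "*"),
    "b": ("*", "*"),
}


class PlainTextPaster(HTMLParser):
    """Collects plain-text fragments that encode semantics, not style."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_MARKS:
            self.parts.append(INLINE_MARKS[tag][0])
        elif tag == "li":
            self.parts.append("\n- ")   # list bullet as a character mark
        elif tag == "p":
            self.parts.append("\n\n")   # one empty line between paragraphs

    def handle_endtag(self, tag):
        if tag in INLINE_MARKS:
            self.parts.append(INLINE_MARKS[tag][1])
        elif tag in ("p", "ul", "ol"):
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Collapse runs of whitespace, as an HTML renderer would.
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_plain(html):
    parser = PlainTextPaster()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)  # trim spaces around breaks
    text = re.sub(r"\n{3,}", "\n\n", text)        # at most one empty line
    return text.strip()


print(html_to_plain("<p>He said <q>hello</q> in <em>that</em> tone.</p><p>Next.</p>"))
```

A real client would have to handle nesting, block quotes and elements not listed here; the sketch only covers the marks named above.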
To phrase it differently, if I pasted some HTML containing an instance of |q| into the e-mail editing window I am using right now, I would expect quote marks, but not (necessarily) those specified by some CSS, just as I would expect two line breaks, i.e. one empty line, between |p|aragraphs. (Yes, indentation would be fine, too, but it is much less common.)

I cannot deny that these observations probably have a Western bias. Btw., I never understood why there are two Kanas on the character encoding level, because it seems Hiragana is comparable to the European use of upright type, with Katakana being the equivalent of italic or any other font variant (although the looks are reversed, Hiragana being the cursive form).
Received on Thursday, 26 April 2007 19:20:17 UTC