- From: Christoph Päper <christoph.paeper@crissov.de>
- Date: Thu, 26 Apr 2007 21:20:13 +0200
- To: www-style <www-style@w3.org>
Mike Bremford:
>
> 1. It's generally felt that generated quotes should be copied in
> most circumstances.
> 2. Whether list bullets should be copied is debatable, but
> certainly depends on the receiving application. Plain text, yes. MS
> Word or HTML, no, because what we should be doing is copying rich
> text with styling information that says "bullet this item".
> 3. Generated newlines should generally be copied, although see (2).
> 4. No word on other generated content yet - eg h1::before
> { content:"Chapter" counter(chapter) } - which I hope we're going
> to see more of as CSS3 rolls out.
The semantics should be copied, not the style. How this works in
practice depends on the kind of application being pasted into.
Plaintext is twofold, because semantics may be encoded in it to
different degrees; e.g. sentences and their basic types are encoded in
most languages today. Hypertext is a superset of plaintext, therefore
it inherits its micro-semantic features. Even the simplest forms of
plain text allow short quotes to be marked with leading and trailing
glyphs. Thus HTML 4's |q| element type was a mistake (and perhaps
related CSS properties and values, too).
"Advanced plaintext", e.g. as used on Usenet, in e-mail and certain
wiki-languages, allows for a few more character-based semantic hints,
which, like in traditional print, are indistinguishable from
stylistic marks, e.g. /italic/, *bold* and list bullets. Clients for
this kind of text should convert HTML content from the clipboard or
other methods of import to use said marks. In most circumstances
there is no commonly accepted standard for them, though. By the same
means such applications might choose to substitute certain glyphs,
depending on the capabilities of the technical environment.
To phrase it differently: if I pasted some HTML containing an
instance of |q| into the e-mail editing window I am using right now,
I would expect quote marks, but not (necessarily) those specified by
some CSS, just as I expect two line-breaks, i.e. one empty line,
between |p|aragraphs. (Yes, indentation would be fine, too, but is
much less common.)
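To make the conversion concrete, here is a minimal sketch of what
such a client might do on paste, using Python's standard html.parser.
The class and function names are hypothetical, and the marks chosen
(/.../ for emphasis, *...* for strong, "- " for list items, straight
double quotes for |q|) are assumptions, since there is no commonly
accepted standard for them.

```python
from html.parser import HTMLParser


class PlaintextMarks(HTMLParser):
    """Hypothetical converter: HTML fragment -> Usenet-style plain text."""

    # Inline element -> (opening mark, closing mark). These choices are
    # assumptions; the stylesheet of the source document is ignored.
    INLINE = {
        "em": ("/", "/"), "i": ("/", "/"),
        "strong": ("*", "*"), "b": ("*", "*"),
        "q": ('"', '"'),
    }

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag][0])
        elif tag == "li":
            self.out.append("- ")  # plain-text list bullet

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag][1])
        elif tag == "p":
            self.out.append("\n\n")  # one empty line between paragraphs
        elif tag == "li":
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)

    def text(self):
        return "".join(self.out).strip()


def to_plaintext(html):
    """Return the plain-text rendering of an HTML fragment."""
    parser = PlaintextMarks()
    parser.feed(html)
    parser.close()
    return parser.text()


print(to_plaintext("<p>She said <q>no</q>, <em>twice</em>.</p><p>Next.</p>"))
# She said "no", /twice/.
#
# Next.
```

A real client might substitute other glyphs (e.g. typographic quotes)
where the environment supports them; the table of marks is the only
part that would need to change.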
I cannot deny that these observations probably have a Western bias.
Btw., I never understood why there are two Kanas on the character
encoding level, because it seems Hiragana is comparable to the
European use of upright, with Katakana being the equivalent of italic
or any other font variant (although the looks are reversed, Hiragana
being the cursive form).
Received on Thursday, 26 April 2007 19:20:17 UTC