Re: CSS Generated content selection

Mike Bremford:
> 1. It's generally felt that generated quotes should be copied in  
> most circumstances.
> 2. Whether list bullets should be copied is debatable, but  
> certainly depends on the receiving application. Plain text, yes. MS  
> Word or HTML, no, because what we should be doing is copying rich  
> text with styling information that says "bullet this item".
> 3. Generated newlines should generally be copied, although see (2).
> 4. No word on other generated content yet - eg h1::before  
> { content:"Chapter" counter(chapter) } - which I hope we're going  
> to see more of as CSS3 rolls out.

The semantics should be copied, not the style. How this works in  
practice depends on the kind of application pasted into.

Plaintext comes in two kinds, because semantics may be encoded to
different degrees. For example, sentences and their basic types are
encoded in most written languages today. Hypertext is a superset of
plaintext and therefore inherits its micro-semantic features. Even the
simplest forms of plain text allow short quotes to be marked with
leading and trailing glyphs. Thus HTML 4's |q| element type was a
mistake (and perhaps the related CSS properties and values, too).

"Advanced plaintext", e.g. as used on Usenet, in e-mail and certain  
wiki-languages, allows for a few more character-based semantic hints,  
which, like in traditional print, are indistinguishable from  
stylistic marks, e.g. /italic/, *bold* and list bullets. Clients for  
this kind of text should convert HTML content from the clipboard or  
other methods of import to use said marks. In most circumstances  
there is no commonly accepted standard for them, though. By the same
token, such applications might choose to substitute certain glyphs,
depending on the capabilities of the technical environment.
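
To make the idea concrete, here is a minimal sketch of such a
conversion in Python, using only the standard library's html.parser.
The class name, the function name, the ascii_only switch and the choice
of marks are all my own assumptions for illustration; as said, there is
no accepted standard for them, and a real client would have to handle
far more of HTML than this does.

```python
from html.parser import HTMLParser

# Hypothetical mapping of HTML elements to plain-text marks. These marks
# are an assumption; no commonly accepted standard defines them.
INLINE_MARKS = {"em": "/", "i": "/", "strong": "*", "b": "*"}


class PlainTextConverter(HTMLParser):
    """Turns a small subset of HTML into 'advanced plaintext'."""

    def __init__(self, ascii_only=False):
        super().__init__(convert_charrefs=True)
        self.parts = []
        # Substitute glyphs depending on what the environment can show.
        if ascii_only:
            self.open_quote, self.close_quote, self.bullet = '"', '"', "*"
        else:
            self.open_quote, self.close_quote = "\u201c", "\u201d"
            self.bullet = "\u2022"

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_MARKS:
            self.parts.append(INLINE_MARKS[tag])
        elif tag == "q":
            # Quote marks come from the semantics, not from any CSS.
            self.parts.append(self.open_quote)
        elif tag == "li":
            self.parts.append("\n" + self.bullet + " ")
        elif tag == "p":
            # Two line breaks, i.e. one empty line, between paragraphs.
            self.parts.append("\n\n")

    def handle_endtag(self, tag):
        if tag in INLINE_MARKS:
            self.parts.append(INLINE_MARKS[tag])
        elif tag == "q":
            self.parts.append(self.close_quote)
        elif tag in ("ul", "ol"):
            self.parts.append("\n")

    def handle_data(self, data):
        # Text is passed through unchanged in this sketch.
        self.parts.append(data)


def html_to_plaintext(html, ascii_only=False):
    converter = PlainTextConverter(ascii_only=ascii_only)
    converter.feed(html)
    converter.close()
    return "".join(converter.parts).strip()
```

Calling html_to_plaintext() with ascii_only=True on a fragment such as
<p>He said <q>use <em>plain</em> text</q>.</p> yields straight double
quotes around the quotation and /plain/ for the emphasis, regardless of
what any style sheet says about quotes.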

To phrase it differently, if I pasted some HTML containing an
instance of |q| into the e-mail editing window I am using right now,
I would expect quote marks, but not (necessarily) those specified by
some CSS, just as I expect two line breaks, i.e. one empty line,
between |p|aragraphs. (Yes, indentation would be fine, too, but is much less

I cannot deny that these observations probably have a Western bias.  
Btw., I never understood why there are two Kanas on the character  
encoding level, because it seems Hiragana is comparable to the  
European use of upright, with Katakana being the equivalent of italic
or any other font variant (although the looks are reversed, Hiragana  
being the cursive form).

Received on Thursday, 26 April 2007 19:20:17 UTC