- From: Peter Moulder <peter.moulder@monash.edu>
- Date: Wed, 12 Oct 2011 23:51:35 +1100
- To: www-style@w3.org
On Fri, Oct 07, 2011 at 03:58:29PM +1100, I wrote:
> one ... issue to consider [is] whitespace handling.
A related issue is the question of what language the resulting
string(...) content should be assumed to have. This affects things like
glyph choice, the behaviour of text-transform (e.g. what the uppercase of
"i" is), and (of lesser importance, for gcpm purposes) pronunciation if
spoken. The corresponding string-set text might include multiple
languages. Thus, it seems clear that the best behaviour (ignoring
implementation cost) would be if language information were considered a
part of the "unstyled text".
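
To make the text-transform point concrete, here is a minimal sketch in Python (not CSS, and not taken from any draft) that models a named string as a list of language-tagged runs. The names Run and uppercase are hypothetical, and the only special case handled is the Turkish/Azeri "i"; a real implementation would follow the Unicode special-casing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """A piece of 'unstyled text' that still remembers its language."""
    text: str
    lang: str  # language tag such as "en" or "tr" (assumed to come from xml:lang)


def uppercase(run: Run) -> str:
    """Uppercase one run, using its own language rather than a global one."""
    primary = run.lang.split("-")[0].lower()
    if primary in ("tr", "az"):
        # Turkish/Azeri: dotted i becomes U+0130, dotless i (U+0131) becomes I.
        return run.text.replace("i", "\u0130").replace("\u0131", "I").upper()
    return run.text.upper()


def transform(runs: list[Run]) -> str:
    """Apply the uppercase transform run by run and join the results."""
    return "".join(uppercase(r) for r in runs)


# A heading that mixes English with a Turkish place name.
heading = [Run("Visiting ", "en"), Run("istanbul", "tr")]
```

If the language travels with each run, transform(heading) can give the Turkish word a dotted capital; if the whole string is assumed to be English, that distinction is lost.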
If we were to take that approach for language, then an obvious proposal
for the white-space issue would be for whitespace significance
information to be similarly considered part of the "unstyled text" that
gets copied.
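
As a rough illustration of that proposal, the following Python sketch attaches a whitespace-significance flag to each run of copied text, in the spirit of xml:space. The names SpaceRun and copy_string are hypothetical, and collapsing is done per run only (no collapsing across run boundaries), which is a simplification rather than a statement of what CSS white-space processing does.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceRun:
    """A piece of 'unstyled text' plus whether its whitespace is significant."""
    text: str
    preserve: bool  # True roughly corresponds to xml:space="preserve"


def copy_string(runs: list[SpaceRun]) -> str:
    """Build the copied string, collapsing whitespace only where it is not significant."""
    parts = []
    for run in runs:
        if run.preserve:
            parts.append(run.text)
        else:
            # Collapse each sequence of whitespace to a single space.
            parts.append(re.sub(r"\s+", " ", run.text))
    return "".join(parts)


heading = [SpaceRun("Chapter  1:\n", False), SpaceRun("a  b", True)]
```

Here copy_string(heading) collapses the whitespace in the first run but keeps the double space in the second, because that information was copied along with the text.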
[I describe this approach as an "obvious" proposal only by considering
XML's native mechanisms for representing language and whitespace
significance information, which are very similar to each other (the
xml:lang and xml:space attributes). When considering HTML, it's not
quite so natural or appealing a proposal.]
As for how important it is for language-related behaviour to be preserved
(to "behave correctly"), for purposes of trading off against implementation
costs and the likelihood/promptness of this and other features getting
implemented: it's quite natural for headings to include snippets of
foreign-language text, and I imagine (without having had a browse in a
library to check) that it's fairly common to want to apply text-transform
in margin box text. With the usual disclaimer that I'm not very familiar
with authoring practice & needs, I'd estimate the importance as greater
than the importance of doing the "correct" thing for white-space; though
I'd nevertheless expect that it affects only a very small proportion of
documents, and (as with the white-space issue) there's a workaround in
the form of using running elements instead of named strings.
pjrm.
Received on Wednesday, 12 October 2011 12:52:06 UTC