
[Bug 14360] Count Unicode "combining marks" together with "inter-element whitespace"

From: <bugzilla@jessica.w3.org>
Date: Mon, 03 Oct 2011 21:38:08 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1RAqDA-0004gt-2R@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=14360

--- Comment #7 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-10-03 21:38:07 UTC ---
(In reply to comment #6)
> (In reply to comment #0)

> > 1)  White space collapsing means that the combining character doesn't really
> >      combine with the space character
> 
> Why would this be a problem?

Because the assumption is that the author wants to represent the combining mark
as a letter in itself, as if it were a spacing character. The assumption is
also that he/she wants this character to behave the same regardless of where it
is placed, rather than becoming extra difficult to control each time it
(or space+combiningMark) is the first character(s) of the line.

> > 2)  Combining marks that combine with nothing or space are hard to select with
> > the mouse
> 
> Why would they be any harder than combining with a letter?

Because it is typically difficult to select a combining character. One
typically selects the base character plus the combining character as a whole.
And then, if there is no base character, it is rather understandable that it
becomes hard to select it.
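
To illustrate why a mark without a base is awkward to select, here is a minimal
Python sketch of how an editor might group characters into selectable units. It
is a deliberate simplification, not the Unicode segmentation algorithm, and the
function name visual_clusters is hypothetical.

```python
import unicodedata


def visual_clusters(text):
    """Group each character with the combining marks that follow it.

    Simplified model: any character whose general category starts with
    "M" (a mark) is attached to the preceding cluster, if there is one.
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            clusters[-1] += ch      # attach the mark to its base
        else:
            clusters.append(ch)     # new base, or a mark with no base
    return clusters


# Base letter + combining acute: selected as one unit.
with_base = visual_clusters("e\u0301x")
# Combining acute with no base: a cluster holding only a zero-width mark.
no_base = visual_clusters("\u0301x")
# No-break space + combining acute: the NBSP acts as the base.
nbsp_base = visual_clusters("\u00a0\u0301")
```

In this model the mark without a base forms a cluster of its own, but since it
has no advance width there is little for the mouse to hit, whereas the
no-break-space variant has a base with real width.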

Section '5.11 Editing and Selection' of Unicode 6.0 has some details about
these matters. For example, it says: "In some cases, editing a user-perceived
“character” or visual cluster element by element may be the preferred way." (On
Mac OS X, it is typically impossible, e.g. in TextEdit.app, to select the
individual parts of a visual cluster. It is perhaps no accident that Mellel.app,
which is made in Israel and thus caters well for Hebrew text, allows better
control of combining parts.)

You can also try this:

1. Go to the saved Live DOM viewer page given in comment #0:
    http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1167
2. Double-click/select the red accent (and nothing else) on line 1
    (not sure you will be able to select the one in line 1 at all).
3. Try to copy it
4. Try to paste it somewhere in the editing field of Live DOM viewer

    RESULT: Nothing gets copied and nothing gets pasted.
                 (Hm, well, it did work in Opera and IE8,
                  but not in WebKit or Firefox. I used a Mac, apart from IE.)

Now, repeat steps 1-4 for the red accent on the next line, the second line.

   RESULT: It is much simpler to select the nobreakspace+accent 
                 combination. The combo gets copied and it gets pasted.

> > 3)  Visually, such marks may look as if they combine with something outside the
> > element
> 
> They might well combine with something outside the element's border box. Why is
> this a problem?

The situation I described was one where it *looks* as if it combines with
something (that is: with something invisible) outside the element. That is: a
situation where there is nothing to combine with. (For all I know, it combines
with the box, rather than with a character, outside the element.)

If the combining character is inside an element with display:inline-block, and
combines with another character in a MathML element, then that is another
matter - and not a problem. 

> > 4)  When the first letter is a combining mark, then the CSS *:first-letter{}
> > selector may seem, to authors, to not work
> 
> Why not? It would do exactly what CSS says it should, no?

I said "may seem to not". I did not say "does not". (In addition, there are
bugs.)

Take a look at this test case:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1174

(1) The first two lines represent the character as a spacing character and as a
combining mark combined with a no-break space. Those two lines look almost
identical. The last two lines represent the character as a combining character
alone and as a combining character combined with a space. As far as I can tell,
the last two lines are identical.
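
One possible reading of why the space variant ends up like the bare mark is
white space collapsing at the start of a line. The Python sketch below is a
rough model of that single step only, assuming white-space: normal; it is not
how any browser implements layout, and the names COLLAPSIBLE and
strip_line_start are hypothetical.

```python
# Characters treated as collapsible white space in this simplified model.
# U+00A0 NO-BREAK SPACE is intentionally absent: it does not collapse.
COLLAPSIBLE = " \t\n\r\f"


def strip_line_start(text):
    """Drop collapsible white space at the start of a line."""
    return text.lstrip(COLLAPSIBLE)


mark = "\u0301"  # COMBINING ACUTE ACCENT

space_plus_mark = strip_line_start(" " + mark)
mark_alone = strip_line_start(mark)
nbsp_plus_mark = strip_line_start("\u00a0" + mark)

# After the step, space+mark and mark alone are the same string,
# while the no-break space survives as a base for the mark.
same_after_collapse = space_plus_mark == mark_alone
```

Under this model the ordinary space is gone before the mark could combine with
it, which would leave the mark without a base, while the no-break space stays.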

(2) The lines are centered. In every browser, the accent in the first two
lines is equally or nearly equally centered, while in the last two lines the
character is to the left of center.

(3) CSS :first-letter background color. In none of the browsers is there any
background color for the character on the last two lines, while the first two
lines have a background color (at least in Opera and WebKit). This is no wonder
when we consider that the character in the test is a **non-spacing** acute
[U+0301].
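
The "non-spacing" property can be checked against the Unicode Character
Database, for instance with Python's unicodedata module. The comparison with
U+00B4 ACUTE ACCENT is an assumption made for illustration; this message does
not say which spacing character the test case uses.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode properties of a single character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "combining_class": unicodedata.combining(ch),
    }


# The combining acute from the test case: general category Mn
# (Mark, nonspacing), so it has no advance width of its own.
combining_acute = describe("\u0301")

# A spacing acute accent for comparison (assumed, see above):
# general category Sk (Symbol, modifier), combining class 0.
spacing_acute = describe("\u00b4")
```

A nonspacing mark with no base has no box of its own to paint, which fits the
observation that no background color shows up on the last two lines.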

Received on Monday, 3 October 2011 21:38:09 UTC
