[Bug 23260] Make the dir attribute use isolation instead of embedding

From: <bugzilla@jessica.w3.org>
Date: Tue, 24 Sep 2013 22:09:04 +0000
To: public-i18n-bidi@w3.org
Message-ID: <bug-23260-3860-PSqkCnq6GY@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=23260

--- Comment #9 from Ian 'Hixie' Hickson <ian@hixie.ch> ---
> When we filed bug 10807, we were afraid to make backward-incompatible changes. 

Why are we not afraid any more? What's changed?

In bug 10807 comment 19, you wrote:

| Also, it is possible that adding isolation by default would break existing 
| documents. This is also the argument against doing isolation by default any 
| time the dir attribute is set.

In bug 10807 comment 14, you wrote:

| In fact, if <a> and <q> were being invented today, we would want isolation
| for them by default - but we dare not do that now because it would most
| certainly break some existing documents.

I agree! This change would "most certainly break some existing documents". This
usually makes it a non-starter. Do we have data now suggesting that we are
wrong to expect breakage?


> However, we did not suggest adding <bdi>. What we asked to do was to add a
> new boolean attribute that would control isolation. Thus, <span dir="rtl"
> ubi> and <span ubi> would be isolating; <span dir="rtl"> (or <span dir="rtl"
> ubi="off">) would not.
> 
> That proposal was not accepted.

That proposal wasn't a problem, it was a solution. When the problems were
described, it turned out that an attribute didn't make sense to solve them. (I
note that with this bug, again, initially just a solution was described, with
no description of the problem.)

I'm still not sure I understand what has changed since then. Why are the use
cases that led to <bdi> no longer satisfied by <bdi>?


> You proposed doing isolation via an element instead of an attribute in bug 
> 10807. I objected several times in follow-up comments

Objecting without providing a reason ("I still prefer attribute as the more
easily used and less disruptive solution", bug 10807 comment 26) is not a
useful objection. Bug 10807 comment 14 is the only place where you attempted to
provide an actual reason to prefer a global attribute, but I provided
counterarguments in bug 10807 comment 17 and bug 10807 comment 27.


Indeed in bug 10807 comment 22 I specifically proposed exactly what this bug is
now asking for:

| How could it break an existing document? Might it not fix as many if not
| more documents than it breaks?
|
| Indeed, doing this automatically any time dir="" is explicitly set might
| not be a bad idea either... do we have any data on how many pages would
| change rendering if we did this? Might this not actually make more sense
| overall?
|
| I'm very much of the opinion that we should make this work as automatically
| as possible, because there's no way most authors are going to learn or
| understand this stuff.

...and it was only because of your detailed arguments that I abandoned that
line of research — see the last 17 paragraphs of bug 10807 comment 26 (also
quoted at the end of this comment).


(In reply to Aharon Lanin from comment #4)
> 
> <bdi> can not continue to be the only or even the primary way to achieve
> isolation in markup, since it relegates isolation to being a little-known
> power tool instead of the default for bidi content

I don't see why <bdi dir=""> need be a "little-known power tool". Plenty of
pages use <a href="">, which is arguably more complicated.


> and since using a
> special element for this purpose is impractical in some scenarios.

What scenarios? The scenarios for which <bdi> was invented are those listed in
bug 10807 comment 16, primarily the first of those two. Why would <bdi> not
work for those? Or are there new use cases that haven't been listed yet?


> - As long as isolates are more difficult to set up than embeddings,
> embeddings will be the default, and isolates the exception; the use of
> isolates will not replace the use of embeddings.

I don't see why

   <span dir="ltr">...</span>

...is any easier than:

   <bdi dir="ltr">...</bdi>


> - A single attribute has historically been and should continue to be
> sufficient to do all the bidi in HTML.

I disagree with both premises in this statement. It's never been sufficient,
and even if it was, that isn't a reason to continue that way. (Or not continue.
It's just not a relevant factor.)


> Why should the preferred way to embed
> opposite-direction content inline now require the use of both a
> special-purpose element (<bdi>) and a special attribute (dir)?

For the same reason that every semantic in HTML requires a special-purpose
element, basically, with the attribute to provide fine-grained control. That's
how HTML works.

Global attributes make sense when they apply globally, but isolation only makes
sense at the phrasing level (since all non-phrasing elements are always
isolated, due to earlier changes). So if we're not to just change dir=""'s
semantics, something which three years ago you convincingly argued we should
never do, then an element makes sense, as you agreed in bug 10807 comment 28.


> - HTML document authors must be instructed that when a “block” element like
> <p> gets opposite-direction content, they should indicate it by putting a
> dir attribute on that element. For “inline” elements, however, it depends.

You say this like it's complicated, but I don't think it is. All you have to
say is "Set your text directionality on your container element, such as <body>
or <p>, using the dir="" attribute. When you embed text from a different
directionality inside other text, use a <bdi> element".

(Note that HTML has neither "block" nor "inline" anymore. There are "flow"
elements, like <p>, <li>, <bdi>, or <span>, and some of those are also
"phrasing" elements, like <bdi> and <span>.)


> An element like <textarea> or <input> or <option> whose content is
> inherently “out-of-flow” and thus directionally isolated can also get the
> dir attribute directly on it. However, when an “ordinary” “inline” element
> like <cite> gets opposite-direction content, they should not just put the
> dir attribute directly on it, but on a special <bdi> element especially
> inserted for that purpose either within the <cite> or around it. (Which, by
> the way?) The distinctions are impossible to justify or explain!

As I said in bug 10807 comment 17: "By that argument, we shouldn't have <bdo>,
or indeed <a> (many links are given on elements that are already in the markup)
or indeed many of the phrasing elements... I don't think that argument holds
water". Authors have no trouble figuring out that they can do <a><cite>, why
would they have trouble figuring out they can do <bdi><cite>?
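
To sketch the parallel (the URL and titles here are placeholders, and uppercase
stands in for right-to-left text):

   <a href="https://example.com/book"><cite>Some Title</cite></a>
   <bdi dir="rtl"><cite>SOME TITLE</cite></bdi>

In both cases the existing <cite> is simply wrapped in the element that adds
the extra semantic.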


> - When an HTML or XHTML document tags a data item with microformatting or
> some other form of data export, it makes good sense to also indicate the
> data item’s direction using an attribute on the tagged element, so that
> consumers of the data will know how to display it properly. It makes little
> sense to put it on a surrounding element, where consumers of the data will
> ignore it (unless they bother to ask for the tagged element’s computed
> direction style) or on an element especially introduced within the tagged
> element for the purpose of carrying the attribute, suddenly turning what had
> been a nice plain-text data item into HTML. If the attribute goes on the
> tagged element, and it happens to be inline, we want it to be isolated, so
> now the tagged element suddenly has to be <bdi>. Do we need to update the
> RFCs on microformatting to require the use of <bdi> for all microformatting
> (except where a “block” element is used)?

I'm not sure what you mean here.

If you're referring to microdata, then it completely ignores directionality, so
it doesn't matter where you put the dir="" attribute (microdata only operates
on text strings). If you mean microformats, then Tantek informs me that it
honours HTML's semantics transparently, so it doesn't matter if the dir="" is
on the element with the class attribute, its parent, or its child.


> In brief, we must make it possible to set up bidi isolates by using the dir
> attribute alone.

Please respond to bug 10807 comment 26 (reproduced below) explaining why
those comments are no longer true.


> > Why are we changing this stuff _again_?
> 
> Actually, we aren't changing *this* stuff again.

By "this stuff" I mean anything touching bidi. It seems to me we've changed
bidi stuff in the HTML spec at least once a year for the past four or five
years.


The end of bug 10807 comment 26 follows:
===============================================================================
> Indeed, doing this automatically any time dir="" is explicitly set might not 
> be a bad idea either... do we have any data on how many pages would change
> rendering if we did this?

Authors do not always know what they are doing, especially when it comes to
bidi. Consider the following:

i spoke to JOHN. <span dir=ltr>susan</span>, MIKE and ollie spoke to him too.

Of course, the dir=ltr on susan is unnecessary, while dir=rtl would have been a
good idea on JOHN and MIKE, but like I said, people often get really confused
when it comes to bidi. Currently, despite all the nonsense, it is rendered as
intended:

i spoke to NHOJ. susan, EKIM and ollie spoke to him too.

With isolation snuck in by default, though, one would get:

i spoke to EKIM ,susan .NHOJ and ollie spoke to him too.

Not convinced? Let's try this one, as might be output by a web app that is
trying to visualize some sort of relationship between FOO and BAR, which are
names from its database:

Summary: FOO <span dir=ltr>==&gt;</span> BAR

This gets rendered as

Summary: OOF ==> RAB

The dir=ltr on the ==> was put in to prevent it from being displayed as

Summary: RAB <== OOF

which might not be to the app UI designer's liking for some good reason. Of
course, another way to fix this would have been with an &lrm; somewhere between
FOO and BAR, but nearly no one knows how to use &lrm;. Also, dir=rtl on both
FOO and BAR would have been a good idea, but that would not have fixed the UI
designer's original problem, and it may be that they had not yet run into the
issue of the names themselves getting garbled, so they did not do it. This
scenario is very, very realistic. Unfortunately, the introduction of
isolate-by-default onto the dir=ltr will break their fix and make their
application suddenly regress.
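
As a sketch of the &lrm; alternative mentioned above, assuming a left-to-right
paragraph and placing the mark directly after FOO (the text only says
"somewhere between FOO and BAR", so this position is one choice among several):

   Summary: FOO&lrm; ==&gt; BAR

This should again be rendered as

   Summary: OOF ==> RAB

because the arrow now sits between a left-to-right mark and BAR, and so takes
the paragraph's direction instead of being absorbed into one right-to-left run.
Since no dir attribute is involved, isolate-by-default would not affect it.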

> Might it not fix as many if not more
> documents than it breaks?

The breakage that one gets due to lack of isolation, when it happens, is quite
obvious. If the page gets any QA, it will be found and fixed, somehow - if
anyone cares enough about it. If the page doesn't get QA, it likely has a dozen
other bidi problems that we won't fix automatically. Besides, one bug added due
to lack of backward compatibility is worth about a hundred that got fixed "for
free" - but which apparently no one cared about enough to fix themselves.

I am therefore extremely against doing isolation automatically any time dir is
specified. If we were inventing the dir attribute today, I would be all for it,
but not as things stand today.

Similarly, I would not do it automatically when lang is specified. This has the
additional handicap of being unimportant due to the low incidence of lang
attribute use, especially inline.
===============================================================================

Received on Tuesday, 24 September 2013 22:09:06 UTC
