- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Tue, 01 Nov 2011 20:21:29 -0400
- To: Charles Pritchard <chuck@jumis.com>
- CC: www-style@w3.org
On 11/1/11 6:22 PM, Charles Pritchard wrote:
> Worst case -- and I'm sure you'll correct me if I'm wrong -- the browser
> can just create <span> elements inside the shadow dom and use all of the
> existing optimizations. This still saves authors from having those DOM
> elements cluttering the public DOM.

You may well be wrong.

In practice, authors tend to not want to style every 3rd letter or whatever in the document. They actually want to style every third letter in some small section of it. Doing that from script is actually easier, because you don't have to, for every dynamic change to the DOM, check whether it happens to be in the region you're interested in. Especially if you, the author, happen to know that region is completely static. That's something the browser _never_ knows.

Let's be realistic here. We have current browsers that don't even implement the '+' combinator correctly because they think it's too slow to do that.... I'd really like implementors to get that sort of thing fixed before adding more purposefully-broken support for features.

>>> nth-letter is specified in the same manner as ecmascript substr.
>>
>> That's a completely bogus definition once you're out of the BMP.
>>
>> Can we please stop defining these some-western-language-only kind of
>> things?
>>
> These are based on byte ranges, not on western-language.

But in practice, all Western languages live in the BMP, so people who only focus on those tend to not care about non-BMP issues (like using the ES definition of "letter", say!).

> I understand
> UTF8 as well as the ambiguous nature of the word "word" and "letter"
> when applying them universally. There are agglutinative languages where
> a single word may comprise an entire sentence. There are scripts where
> "letter" is not so easily defined.

That's actually a completely separate issue from BMP vs non-BMP, and a quite valid one, but it's been brought up already and has nothing to do with the substr() definition of "letter".
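To make the substr() objection concrete, here is a minimal sketch of what happens when a "letter" is defined the way ES substr() counts, namely in UTF-16 code units. The sample string and the helper name toCodePoints are assumptions chosen for illustration; nothing here comes from a proposed spec. Note also that a code point is still not necessarily a "letter" in every script, which is the separate issue mentioned above.

```typescript
// Hypothetical helper: split a string into code points instead of
// UTF-16 code units, keeping surrogate pairs together.
function toCodePoints(text: string): string[] {
  const result: string[] = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const isHighSurrogate = unit >= 0xd800 && unit <= 0xdbff;
    if (isHighSurrogate && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        // A high surrogate followed by a low surrogate is one character.
        result.push(text.charAt(i) + text.charAt(i + 1));
        i++; // skip the low surrogate
        continue;
      }
    }
    result.push(text.charAt(i));
  }
  return result;
}

// U+1D400 (a non-BMP character) followed by "bc".
const sample = "\uD835\uDC00bc";

// substr() counts code units, so the "first letter" is half a character:
// a lone high surrogate.
const bySubstr = sample.substr(0, 1);

// Counting by code point returns the whole character.
const byCodePoint = toCodePoints(sample)[0];
```

For BMP-only text the two approaches agree, which is why the problem is easy to miss when testing only with Western-language content.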

> I'm very happy to explore those issues, and I'm sure authors using those
> languages and scripts are aware of the issues from the moment they start
> programming with ECMAScript and styling with CSS.

The problems come when the author of the ES or CSS is not the author of the content and knows nothing about the issues.

> We can't "stop" defining these from a western-language bias. "en" is
> the standard fallback in most specs; 7-bit ascii, roman script, in a
> 1-byte is the most compatible way of working with content. That's just
> historical cruft.

My point is we should be trying to move away from that insofar as it affects the content, not adding more barriers to writing in whatever language people want to write in.

> If you'd prefer express things in UTF-8, that's fine: WebIDL uses
> DOMString. I get that. UTF8 has excellent support nowadays.

None of which has to do with my issue with substr().

> An author working with a script where nth-letter is not
> functional/relevant is simply not going to use that selector.

I was specifically talking about scripts where the concept of "letter" makes sense (so it's _relevant_), but that don't live in the BMP. I see no reason, if we do this at all, why we'd by-design make it not _functional_ for them.

-Boris
Received on Wednesday, 2 November 2011 00:22:10 UTC