Re: [csswg-drafts] [css-text-3] Segment Break Transformation Rules for East Asian Width property of A (#337)

The CSS Working Group just discussed `Segment Break Transformation Rules for East Asian Width property of A`.

<details><summary>The full IRC log of that discussion</summary>
&lt;dael> Topic: Segment Break Transformation Rules for East Asian Width property of A<br>
&lt;dael> github: https://github.com/w3c/csswg-drafts/issues/337#issuecomment-446842105<br>
&lt;dael> Rossen: Brought back from week before if I recall<br>
&lt;dael> Rossen: Additional comments from koji. Wanted to have koji comment.<br>
&lt;dael> Rossen: Do we have koji or enough from his feedback that we can discuss?<br>
&lt;dael> myles: I think I understand koji's feedback<br>
&lt;dael> Rossen: So we can make progress and see if can resolve<br>
&lt;dael> florian: Context is suppressing segment breaks in source code. If you have word space word space the segment break is converted to a space, but in lang without spaces we're having a part of the spec deal with suppressing. Non-controversial part has been shipped. Characters on both sides of break are unambig CJK<br>
&lt;dael> florian: What do we do when one side is ambig like ". Initial proposal was when seg break is lang tag as CJ and one side is ambig and the other unambig we suppress.<br>
&lt;dael> florian: emoji, though, was inconsistent. Some wide, some narrow, some ambig. We proposed in spec to treat all emoji as ambig so if you had unambig Asian on the other side you suppress the break<br>
&lt;dael> florian: koji pushed back and myles agrees with pushback<br>
&lt;dael> myles: I think we can all agree on goal. If you have Chinese text line break in the middle shouldn't turn into a space.<br>
&lt;dael> myles: When I was reading spec the whole section on how to determine if suppress space proposal looks at EA Width and then emoji and then elsewhere and it seemed this wouldn't work in a lot of cases we haven't thought of. The more we try and fix this section the more complex it gets and the more we'll miss<br>
&lt;dael> myles: I think that's similar to koji where if you add a case for emoji you'll have to add a case to reduce set of emoji b/c unicode says more is emoji than people think. Instead of spec behavior only one browser impl we should let browsers experiment and try and come up with a better way, perhaps involving unicode consortium.<br>
&lt;dael> florian: Agree with part, but not all. Languages are complicated so if we want to cover all cases rules will be complicated. If we are not careful here and we add too many things we later want to remove that would be problematic.<br>
&lt;dael> florian: Being cautious about what to add, I would agree.<br>
&lt;dael> florian: On the other hand letting UA experiment that's not reliable for authors so they can't do anything. If both sides are clearly Asian there's no worry and we should do it. Ambig on one side and break is Asian and other side is Asian we're safe.<br>
&lt;dael> florian: Emoji we went through everything and found that we thought adding all of it was safe. I'd be okay with you double checking.<br>
&lt;dael> florian: I suspect there will be more areas of inconsistency. We will at some point say this is rare enough and we're not handling it. I think we should solve enough that East Asian can have linebreaks.<br>
&lt;fantasai> proposal wrt emoji is in https://github.com/w3c/csswg-drafts/issues/337#issuecomment-444316214<br>
&lt;fantasai> you can see the entire list of affected characters<br>
&lt;dael> florian: I would say let's be careful with what we add. I think we have been with emoji. There is a slope here, but we can decide how far down we go<br>
&lt;dael> myles: Rather than going half way down and saying no more, we should investigate another approach<br>
&lt;dael> florian: Do you have a suggestion on another type of approach? I feel this will be about subsets of unicode things. How to do it may have strategies.<br>
&lt;dael> myles: I don't have a specific prop and that's why I think more room to experiment. I don't think we're at a point where we can say some should and some should not react this way. I think we're at an early phase.<br>
&lt;dael> fantasai: I don't think that's the case. Spec has used EA prop and no one has said we shouldn't use that. The details of how we're using it we're finding in some cases it needs to be tailored to do it in the same way. Smiley face is neutral and grinning is wide, but author won't expect that.<br>
&lt;dael> fantasai: I don't think it makes sense for us to have an env where they can't know that their space will get eaten by changing smiley. Rule florian has is there are subsets of emoji where we don't know why they're wide.<br>
&lt;dael> florian: They're mostly classified due to what legacy encoding they came from<br>
&lt;dael> myles: I agree EA Width doesn't work well. Possible solution is don't use EA Width and I'd like to pursue that<br>
&lt;dael> fantasai: Alternative is script + script extensions property. Other than that it's creating a custom list which we won't do.<br>
&lt;dael> florian: That's because maintenance.<br>
&lt;dael> florian: EA Width spec says it should be tailored<br>
&lt;dael> koji: I agree with myles. EA Width is designed to be compat with encoding. Not designed for this purpose. We'll see lots of inconsistencies. Options are live with inconsistencies. If we don't want that, don't use EA Width<br>
&lt;dael> florian: My feeling is in terms of web compat if we add more cases to suppress it's safe. Removing is bad. If we find a more efficient approach later that characterizes more characters we can move to that. We should be careful to not suppress spaces that really should be there. Even if way we reach char set is more complicated than you wish, that's not a long term problem. If we find a better way in the future we can do that as long as we didn't include so many<br>
&lt;dael> florian: I think marking some of it at risk we can do that. But it's not going to do wrong behavior in a way that we can't walk back.<br>
&lt;dael> florian: So I propose mark as at risk, but leave it, and welcome experimentation<br>
&lt;dael> koji: If we find myles point logic on EA Width wasn't great it's not backwards compat<br>
&lt;dael> florian: Suggesting there are currently characters classified as wide that shouldn't suppress spaces? Because if there isn't any we're safe<br>
&lt;dael> myles: One happy medium is to say there are some sets of triggers that will or won't cause suppression. Other than that it's up to browser. Kinda like line breaking with some restrictions<br>
&lt;dael> fantasai: Where you break the line isn't a big deal. It doesn't look really wrong with slight differences. But if there is a space in one impl and not another that's a real problem for the typesetting. If there isn't interop the user can't check their text, looks fine, load in another browser and there's lots of space. We need interop and this isn't a good place for everyone decides.<br>
&lt;dael> myles: We've got through to today<br>
&lt;dael> florian: No author uses line breaks in East Asian. We're trying to make it better<br>
&lt;dael> myles: Why not solve now because they're not using it<br>
&lt;dbaron> I think there are real web compatibility problems as a result of line breaking differences.<br>
&lt;dael> florian: Any solution will be a superset of what's speced today so I don't see why can't spec today. I'm willing to put part that you think is overkill at-risk<br>
&lt;dael> myles: As spec right now there is an algo where every string produces yes or no suppress.<br>
&lt;dael> florian: What I mean is if we say yes suppress to more things it won't cause web compat. If we say yes to fewer things it will. That's why I'm talking about supersets. If we add more things authors will be able to add more line breaks. So we can expand. Reducing is bad. So if there's a different solution later with a same size or larger set that's okay.<br>
&lt;dael> myles: I'd like to expand that set w/o going through WG<br>
&lt;dael> florian: I don't see how that works. Regardless of how we spec if browsers aren't interop it's not usable<br>
&lt;dael> myles: Already not<br>
&lt;dael> florian: Trying to make it usable<br>
&lt;dael> myles: So wait<br>
&lt;dael> florian: Wait until what? You say don't standardize and I say do.<br>
&lt;dael> Rossen: We're getting too argumentative and I'm not sure we're ready to resolve. Discussion is valuable and brings us closer to something where we can resolve. Doesn't feel we're there yet.<br>
&lt;dael> Rossen: Perhaps we can continue to work on this as part of the text inline focus group that will be proceeding F2F unless you feel strongly we can resolve<br>
&lt;dael> florian: I don't feel we can. Taking offline for now and next time we meet we keep talking sounds...not as good as resolving, but we can't resolve<br>
&lt;dael> Rossen: But this conversation was great and gives room for people to continue<br>
&lt;dael> fantasai: I want to say I insist on 2 things. 1: we have defined rules all UAs must follow. 2: We're using unicode prop of some kind and not having CSS spec create a custom list<br>
&lt;tantek> +1 to fantasai's two rules<br>
&lt;dael> Rossen: We can rec. to people what they can do, we can't require it<br>
&lt;dael> fantasai: Then you'll be non-compat with spec<br>
&lt;dael> myles: I'd like to hear what unicode consortium has to say<br>
&lt;dael> fantasai: Their feedback is EA Width is not something they're putting effort into maintaining<br>
&lt;dael> florian: That convo goes off topic, it's hard to share it all.<br>
&lt;dael> fantasai: And we're explaining what we're doing and they ask if we're using UAX-14 properties and that's not helpful<br>
&lt;dael> Rossen: Your point is valid. This won't bring us closer to resolution.<br>
&lt;dael> Rossen: Let's table this and work more to get to something better for interop and for the web.<br>
</details>
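
For readers following the log, the rule florian outlines can be sketched roughly as below. This is only an illustration of the behavior described in the discussion, not the spec text or any browser's implementation. Assumptions: "unambiguous CJK" is read as East Asian Width `F`, `W`, or `H`; the names `suppress_segment_break`, `join_lines`, `lang_is_cj`, and `is_emoji` are hypothetical; and the emoji predicate is supplied by the caller because Python's standard library has no emoji property. Other details of the spec are not modeled.

```python
import unicodedata

# Assumption: "unambiguous CJK" means East Asian Width F (fullwidth),
# W (wide), or H (halfwidth). "A" is the ambiguous class.
UNAMBIGUOUS = {"F", "W", "H"}


def width_class(ch, is_emoji):
    """Return the East Asian Width class, treating emoji as ambiguous.

    Treating every emoji as "A" mirrors the proposal debated in the log;
    is_emoji is a caller-supplied, hypothetical predicate.
    """
    if is_emoji(ch):
        return "A"
    return unicodedata.east_asian_width(ch)


def suppress_segment_break(before, after, lang_is_cj=False,
                           is_emoji=lambda ch: False):
    """Decide whether a segment break between two characters is dropped.

    before / after are the characters on either side of the break.
    lang_is_cj stands in for "the content language is tagged as Chinese
    or Japanese" (hypothetical flag).
    """
    a = width_class(before, is_emoji)
    b = width_class(after, is_emoji)
    # Shipped, non-controversial part: both sides unambiguous.
    if a in UNAMBIGUOUS and b in UNAMBIGUOUS:
        return True
    # Proposed part: one side ambiguous, the other unambiguous,
    # and the language tag says Chinese or Japanese.
    if lang_is_cj and (
        (a == "A" and b in UNAMBIGUOUS) or (a in UNAMBIGUOUS and b == "A")
    ):
        return True
    return False


def join_lines(first, second, **kwargs):
    """Join two source lines, turning the break into a space or nothing."""
    if not first or not second:
        return first + second
    drop = suppress_segment_break(first[-1], second[0], **kwargs)
    return first + ("" if drop else " ") + second
```

With this sketch, `join_lines("word", "word")` keeps a space, while two lines of Chinese text are joined with none. The superset argument in the log corresponds to only ever adding cases in which `suppress_segment_break` returns `True`.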


-- 
GitHub Notification of comment by css-meeting-bot
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/337#issuecomment-448688033 using your GitHub account

Received on Wednesday, 19 December 2018 17:58:44 UTC