Re: strange behavior? in wpt / css-text line-breaking test issue

Hi Atsushi,

I'm not quite familiar with fonts in wpt, but I see that Ahem is recommended here:

  https://web-platform-tests.org/writing-tests/general-guidelines.html#be-cross-platform

And looking at the test in https://github.com/web-platform-tests/wpt/blob/f294f587fdba42782cf64cbb6f42108fc661387a/infrastructure/assumptions/ahem.html#L291-L311 , it seems to support at least some CJK characters.
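
To illustrate why the glyph width matters for the case quoted below, here is a minimal sketch of the arithmetic. The hyphen width (0.3em) and the box width (2em) are assumptions chosen for illustration, not values read from the wpt test; the 0.4em and 0.8em figures are the rough 'tofu' widths estimated in the quoted message.

```python
def fits_on_one_line(glyph_em, n_glyphs=2, hyphen_em=0.3, box_em=2.0):
    """Return True if n_glyphs ideographs plus one hyphen fit in the box.

    All widths are in em. hyphen_em and box_em are assumed values for
    illustration only, not numbers taken from the wpt test.
    """
    return n_glyphs * glyph_em + hyphen_em <= box_em


# Estimated 'tofu' advances from the screenshots, plus a true 1em glyph.
cases = [("Chrome tofu", 0.4), ("Firefox tofu", 0.8), ("1em glyph", 1.0)]
for label, width in cases:
    if fits_on_one_line(width):
        verdict = "fits on one line, so no break is needed"
    else:
        verdict = "overflows, so the line-break rule is exercised"
    print(f"{label} ({width}em): {verdict}")
```

Under these assumptions, only a glyph that is really 1em wide forces the break that the test is meant to check, which is why a font with predictable 1em glyphs would help here.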

~xfq

> On Dec 7, 2021, at 11:24, Atsushi Shimono (W3C Team) <atsushi@w3.org> wrote:
> 
>  Hi all (sorry for writing in English on public-i18n-japanese),
> 
>  I'd like to ask for help or advice from anyone who knows about (or has run into something
> similar to) a possible issue with wpt's periodic test runs.
> 
>  TL;DR: a possibly widespread font issue in tests ('tofu' glyphs).
> 
> 
> 1) The issues that prompted this survey are:
> https://github.com/w3c/jlreq/issues/274 (for jpan-gap, line-break not working for some browsers)
> https://github.com/web-platform-tests/wpt/issues/31021 (results of line-break-loose-hyphens-001 seem not valid)
> 
>  For the first one, some of the results in wpt are not valid, and there are also some
> cases that browsers have not implemented. We need to distinguish which wpt failures come
> from unimplemented features and which are invalid test outcomes (of course!).
> 
> 
>  For line-break-loose-hyphens-001, the most recent results are:
> https://wpt.fyi/results/css/css-text/line-break/line-break-loose-hyphens-001.html?label=experimental&label=master&aligned
>  live test files are:
> http://wpt.live/css/css-text/line-break/line-break-loose-hyphens-001.html
> http://wpt.live/css/css-text/line-break/reference/line-break-loose-hyphens-001-ref.html
>  screenshots for results in wpt are:
> chrome: https://wpt.fyi/analyzer?screenshot=sha1%3A62184ca0e5591687ca98b3702fb02e5078b3a727&screenshot=sha1%3Af948379acf4a4ff38703bceb6d1c737f7638a648
>  glyphs are rendered as 'tofu', which is not 1em wide (~0.4em?), so the test can never work;
>  the real issue is also confirmed with local Chrome (Windows)
> edge: https://wpt.fyi/analyzer?screenshot=sha1%3Aac50bd0a260a1bfd3a589a9bfb5be8804fa15212&screenshot=sha1%3Aeb0a637f8f1263bb5d4d3a3efd4367edceee2246
>  fonts are shown correctly, real issue confirmed with local Edge (Windows)
> firefox: https://wpt.fyi/analyzer?screenshot=sha1%3A8dd71c721b7d50aca4e18043d075ed5ba3d6254b&screenshot=sha1%3A496d84adc1707136077d4424f82ee51be2c31ebe
>  glyphs are rendered as 'tofu', which is not 1em wide (~0.8em?), so the first test can never work (2 x 0.8em + hyphen <= 2em)
> safari: https://wpt.fyi/analyzer?screenshot=sha1%3A2e90a49c9fbddfe7666a066e0166266a479d43b6&screenshot=sha1%3A565fe0d1702106ebd2470ce57ce30e75051fa21f
>  fonts are shown correctly, real issue confirmed with local Safari (macOS)
> 
> 
> 2) So there are two root causes here. One is a real implementation issue (Chrome, Edge, Safari),
> and the other is an incorrectly picked font (Chrome, Firefox).
>  This test checks hyphens after UAX#14 ID characters, and the definition in css-text-3 is:
>> The following breaks are allowed for loose line breaking if the preceding character belongs to the Unicode line breaking class ID [UAX14] (including when the preceding character is treated as ID due to word-break: break-all), and are otherwise forbidden: 
> which has no condition related to the 'lang' attribute specified on the target element.
> So, even though this test has lang="en" (as it does now), the test case is valid and should be
> handled correctly by implementations. The character used is U+6587:
> https://util.unicode.org/UnicodeJsps/character.jsp?a=6587
> 
> 2a) The second problem does not happen in Safari or Edge (both seem to pick a zh-hant font?),
> and I thought we could just change lang on the html element to something else, like zh-hant/hans
> or ja, which should give 1em glyphs by default (as a first step).
>  But looking into other tests in wpt, there seem to be several similar cases of 'tofu',
> with some confusing outcomes...
> 
> 
> 3) I've also checked several other tests to find a solution and/or possible similar cases.
> 
>  In css-ruby, there is ruby-intrinsic-isize-001, whose test case has html lang="ja":
> test: http://wpt.live/css/css-ruby/ruby-intrinsic-isize-001.html
> results: https://wpt.fyi/results/css/css-ruby/ruby-intrinsic-isize-001.html?label=experimental&label=master&aligned
> chrome: https://wpt.fyi/analyzer?screenshot=sha1%3A39ea5f5efcf9845ff20643f8e5e9756cff284d6f&screenshot=sha1%3A16cce8d6657ed23bba754e755fed55aa07468d29
>  Firefox passes this test (no screenshot provided). The Chrome screenshot shows 'tofu'
> glyphs.
> 
>  In css-content, there are several multi-language tests on quotes. quotes-016 tests Japanese
> quotes, and all browsers pass.
> test: http://wpt.live/css/css-content/quotes-016.html
> result: https://wpt.fyi/results/css/css-content/quotes-016.html?label=experimental&label=master&aligned
>  This test consists of two lines, one using the 'q' element and one using &#xXXXX; instead.
>  Given this format, even if the glyphs are replaced by 'tofu', I believe the test should pass
> if the browser implements it correctly (the glyphs on the two lines will have identical
> widths).
> 
>  In the same suite, a test for fallback across multiple regions (like ja to ja-JA) was added as
> quotes-034, which fails in Chrome and Firefox, with screenshots.
> test: http://wpt.live/css/css-content/quotes-034.html
> result: https://wpt.fyi/results/css/css-content/quotes-034.html?label=experimental&label=master&aligned
> chrome screenshot: https://wpt.fyi/analyzer?screenshot=sha1%3A16cb9a1153f028415b7bef0896a945df2a1e2b31&screenshot=sha1%3A730970202b6b9a65dddc5aacaac044e33a102426
> firefox screenshot: https://wpt.fyi/analyzer?screenshot=sha1%3A1288b3218b0ab384f993fd89c8be35ffbb479248&screenshot=sha1%3A8b062b8e99ae2fa30dc4a996e8fa14168be301a2
>  This test has html lang="en", and lang is specified per line (each wrapped in a p). The
> reference is written using the &#xXXXX; form.
>  Both screenshots have 'tofu' characters for the Japanese lines.
> 
> 
> 4) I think some tests contributed to wpt by the i18n WG have lines that load web fonts, for
> Arabic, N'Ko, or Mongolian, but I thought these were added so that complex shaping tests
> would not suffer side effects from the glyphs of whichever font gets picked.
>  Although I haven't encountered any test contribution using CJK, I thought we would not need
> similar lines to load web font files when a test has no such complexity...
> 
> 
>  Does anyone have any advice or knowledge about this? For example, if we rely on non-Latin
> characters (or on something like, say, the DejaVu character sets, widely used in the old
> days, though...), do we need to include lines that load a web font?
>  I must be missing some manual on this area; sorry if the wpt documentation clearly
> states this somewhere.
> 
> 

Received on Tuesday, 7 December 2021 05:45:57 UTC