Re: strange behavior? in wpt / css-text line-breaking test issue

   hi Fuqiao,

On 2021/12/07 14:45, Fuqiao Xue wrote:
> Hi Atsushi,
> 
> I'm not quite familiar with fonts in wpt, but I see that Ahem is recommended here:
> 
>    https://web-platform-tests.org/writing-tests/general-guidelines.html#be-cross-platform


   Yes, we might need to read
> Fonts cannot be relied on to be either installed or to have specific metrics. As such, 
> in most cases when a known font is needed,
as saying that fonts cannot be relied on to be installed for anything other than US-ASCII
(or Western?), or something like that...

> And looking at the test in https://github.com/web-platform-tests/wpt/blob/f294f587fdba42782cf64cbb6f42108fc661387a/infrastructure/assumptions/ahem.html#L291-L311 , it seems to support at least some CJK characters.

   Ah, yes. Ahem has 278 glyphs in total, 8 of which are not mapped, and all available characters
are listed on the page above (I've just checked against a dump of the ttf), e.g. U+0020 to U+007E
except for U+0027, and U+00A0 to U+00FF.
   Almost all glyphs are simply a 1em black square box, even for US-ASCII, as described at:
https://web-platform-tests.org/writing-tests/ahem.html
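
   A minimal sketch of checking test text against the ranges quoted above. It encodes only
the ranges mentioned in this mail (the real font maps more code points, including some CJK
ones, per the ahem.html page), and the function names are hypothetical:

```python
# Only the code point ranges quoted in this mail; NOT the full Ahem coverage.
QUOTED_AHEM_RANGES = [(0x0020, 0x007E), (0x00A0, 0x00FF)]
QUOTED_AHEM_EXCLUDED = {0x0027}  # U+0027 is stated to be missing


def in_quoted_ahem_ranges(ch: str) -> bool:
    """Return True if ch falls inside the ranges quoted in the mail."""
    cp = ord(ch)
    if cp in QUOTED_AHEM_EXCLUDED:
        return False
    return any(lo <= cp <= hi for lo, hi in QUOTED_AHEM_RANGES)


def outside_quoted_ranges(text: str) -> list[str]:
    """List the characters of text (as U+XXXX) that need a closer look."""
    return [f"U+{ord(c):04X}" for c in text if not in_quoted_ahem_ranges(c)]
```

   A character reported by outside_quoted_ranges() is not necessarily missing from Ahem; it
only means it has to be looked up in the full list.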


   Considering our targets:

1. line-break property

   The 84 files in css/css-text/line-break break down by font line as follows:
- Ahem.css: 39 (line-break-anywhere, line-break-anywhere-and-white-space, line-break-anywhere-overrides-uax-behavior)
- mplus-1p-regular.woff: 32 (line-break-loose, line-break-normal, line-break-strict)
- NotoNaskhArabic-regular.woff2: 1 (line-break-shaping-001)
- no font specified: 12
     line-break-anywhere 001 - 003
     line-break-loose-hyphens 001 - 003
     line-break-normal-hyphens 001 - 003
     line-break-strict-hyphens 001 - 003
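
   A tally like the one above could be reproduced roughly as follows. This is a sketch under
assumptions: the marker strings are taken from the list above and matched case-insensitively,
each file is counted under the first marker it contains, only *.html files directly in the
given directory are scanned, and tally_fonts is a hypothetical helper name:

```python
from collections import Counter
from pathlib import Path

# Substrings taken from the tally above, compared in lower case.
FONT_MARKERS = ["ahem", "mplus-1p-regular.woff", "notonaskharabic-regular.woff2"]


def tally_fonts(test_dir: str) -> Counter:
    """Count test files per font marker; the first matching marker wins."""
    counts = Counter()
    for path in sorted(Path(test_dir).glob("*.html")):
        source = path.read_text(encoding="utf-8", errors="replace").lower()
        marker = next((m for m in FONT_MARKERS if m in source), "no font specified")
        counts[marker] += 1
    return counts
```

   For example, tally_fonts("css/css-text/line-break") run from a wpt checkout would return
a Counter keyed by marker; whether it matches the numbers above depends on the assumptions
listed.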

   For line-break-anywhere 001 to 003 (3 files), the failure of 002 in Firefox seems to be a
false positive (= FAIL detected), and we could update the tests to use Ahem by editing some
characters and the CSS (box size).
   For the hyphens tests (9 files), I believe we can write valid tests using Ahem, but this might
need serious consideration (not just replacing the CJK Han character with one available in Ahem),
such as changing the width of the box and the references.
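
   Why the glyph width matters here can be sketched with the arithmetic from the Firefox
observation quoted further down (2x0.8 + hyphens <= 2em). The 2em box comes from that
remark; the hyphen width of 0.33em is purely an assumption for illustration, and the
function name is hypothetical:

```python
def fits_on_one_line(glyph_em: float, n_glyphs: int,
                     hyphen_em: float, box_em: float) -> bool:
    """True if n_glyphs glyphs plus one hyphen fit into the box without a break."""
    return n_glyphs * glyph_em + hyphen_em <= box_em


HYPHEN_EM = 0.33  # assumed hyphen advance, not measured
BOX_EM = 2.0      # box width implied by the quoted arithmetic

# With real 1em glyphs the content overflows, so a line break is exercised.
full_width_fits = fits_on_one_line(1.0, 2, HYPHEN_EM, BOX_EM)
# With ~0.8em 'tofu' glyphs everything fits, so no break is ever needed.
tofu_fits = fits_on_one_line(0.8, 2, HYPHEN_EM, BOX_EM)
```

   Under these assumptions full_width_fits is False and tofu_fits is True, which is why a
font with a guaranteed 1em advance (such as Ahem) makes the box size predictable.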


2. other tests mentioned in previous email

   Since these tests target specific characters, such as punctuation marks, it may be better to
change them to the mplus-1p woff?


3. for letter-spacing tests, under review as i18n test

   Most of them use the mplus-1p woff as their font where punctuation marks are tested, and should be OK.


   Still, I'm not sure whether the analysis above is correct or not, though...


> ~xfq
> 
>> On Dec 7, 2021, at 11:24, Atsushi Shimono (W3C Team) <atsushi@w3.org> wrote:
>>
>>   hi all, (sorry in English for public-i18n-japanese)
>>
>>   I'd like to ask for help or advice from anyone who might know about (or has encountered
>> something similar to) a possible issue with the periodic process run by wpt.
>>
>>   TL;DR (in short): possible broad issue with fonts in tests ('tofu' error)
>>
>>
>> 1) The root issues which initiated this survey are:
>> https://github.com/w3c/jlreq/issues/274 (for jpan-gap, line-break not working for some browsers)
>> https://github.com/web-platform-tests/wpt/issues/31021 (results of line-break-loose-hyphens-001 seem not valid)
>>
>>   For the first one, even though there are several invalid results among the wpt results,
>> there are also some unimplemented cases in browsers. But again, we need to distinguish which
>> failures (in wpt) are due to missing implementation and which are invalid test outcomes (of course!).
>>
>>
>>   For line-break-loose-hyphens-001, the most recent results are:
>> https://wpt.fyi/results/css/css-text/line-break/line-break-loose-hyphens-001.html?label=experimental&label=master&aligned

>>   live test files are:
>> http://wpt.live/css/css-text/line-break/line-break-loose-hyphens-001.html

>> http://wpt.live/css/css-text/line-break/reference/line-break-loose-hyphens-001-ref.html

>>   screenshots for results in wpt are:
>> chrome: https://wpt.fyi/analyzer?screenshot=sha1%3A62184ca0e5591687ca98b3702fb02e5078b3a727&screenshot=sha1%3Af948379acf4a4ff38703bceb6d1c737f7638a648

>>   glyphs are shown as 'tofu', which is not 1em wide (~0.4em?), so the test can never work;
>>   the real issue was also confirmed with local Chrome (Windows)
>> edge: https://wpt.fyi/analyzer?screenshot=sha1%3Aac50bd0a260a1bfd3a589a9bfb5be8804fa15212&screenshot=sha1%3Aeb0a637f8f1263bb5d4d3a3efd4367edceee2246

>>   fonts are shown correctly, real issue confirmed with local Edge (Windows)
>> firefox: https://wpt.fyi/analyzer?screenshot=sha1%3A8dd71c721b7d50aca4e18043d075ed5ba3d6254b&screenshot=sha1%3A496d84adc1707136077d4424f82ee51be2c31ebe

>>   glyphs are shown as 'tofu', which is not 1em wide (~0.8em?), so the first test can never work (2 x 0.8em + hyphen <= 2em)
>> safari: https://wpt.fyi/analyzer?screenshot=sha1%3A2e90a49c9fbddfe7666a066e0166266a479d43b6&screenshot=sha1%3A565fe0d1702106ebd2470ce57ce30e75051fa21f

>>   fonts are shown correctly, real issue confirmed with local Safari (macOS)
>>
>>
>> 2) So, there are two root causes here. One is a real implementation issue (Chrome, Edge, Safari),
>> and the other is an incorrectly picked font (Chrome, Firefox).
>>   This test covers hyphens with UAX#14 ID characters, and the definition in css-text-3 is:
>>> The following breaks are allowed for loose line breaking if the preceding character belongs to the Unicode line breaking class ID [UAX14] (including when the preceding character is treated as ID due to word-break: break-all), and are otherwise forbidden:
>> which does not have any condition related to the 'lang' attribute specified on the target element.
>> So, even though we have lang="en" on this test (as now), this test case is valid and should be
>> handled correctly by implementations. The character used is U+6587:
>> https://util.unicode.org/UnicodeJsps/character.jsp?a=6587

>>
>> 2a) As for the second point, it does not happen in Safari or Edge (both seem to pick a zh-hant font?),
>> and at first I thought we could just change lang on the html element to something else, like
>> zh-hant/hans or ja, which should give a 1em glyph by default.
>>   But looking into other tests in wpt, there seem to be several similar cases of 'tofu',
>> with some confusing outcomes...
>>
>>
>> 3) I've also checked several other tests to find a solution and/or possible similar cases.
>>
>>   In css-ruby, there is ruby-intrinsic-isize-001, whose test case has html lang="ja"
>> test: http://wpt.live/css/css-ruby/ruby-intrinsic-isize-001.html

>> results: https://wpt.fyi/results/css/css-ruby/ruby-intrinsic-isize-001.html?label=experimental&label=master&aligned

>> chrome: https://wpt.fyi/analyzer?screenshot=sha1%3A39ea5f5efcf9845ff20643f8e5e9756cff284d6f&screenshot=sha1%3A16cce8d6657ed23bba754e755fed55aa07468d29

>>   Firefox passes this test (no screenshot provided). The Chrome screenshot shows 'tofu'
>> glyphs.
>>
>>   In css-content, there are several multi-language tests on quotes. quotes-016 tests Japanese
>> quotes, and all browsers pass.
>> test: http://wpt.live/css/css-content/quotes-016.html

>> result: https://wpt.fyi/results/css/css-content/quotes-016.html?label=experimental&label=master&aligned

>>   This test consists of two lines, one with a 'q' element and one replacing it with &#xXXXX;.
>>   Considering this format with possible 'tofu' replacement, I believe the test should pass if
>> the browser implements it correctly, even with 'tofu' glyphs. (The two lines will have glyphs
>> of identical width.)
>>
>>   In the same suite, a test for fallback across multiple regions (like ja to ja-JA) was added as
>> quotes-034, which fails in Chrome and Firefox with screenshots.
>> test: http://wpt.live/css/css-content/quotes-034.html

>> result: https://wpt.fyi/results/css/css-content/quotes-034.html?label=experimental&label=master&aligned

>> chrome screenshot: https://wpt.fyi/analyzer?screenshot=sha1%3A16cb9a1153f028415b7bef0896a945df2a1e2b31&screenshot=sha1%3A730970202b6b9a65dddc5aacaac044e33a102426

>> firefox screenshot: https://wpt.fyi/analyzer?screenshot=sha1%3A1288b3218b0ab384f993fd89c8be35ffbb479248&screenshot=sha1%3A8b062b8e99ae2fa30dc4a996e8fa14168be301a2

>>   This test has html lang="en", and lang is specified per line (wrapped in p). The reference is
>> written with the &#xXXXX; representation.
>>   Both screenshots have 'tofu' characters for the Japanese lines.
>>
>>
>> 4) I think some tests contributed to wpt by the i18n WG have lines for web fonts, e.g. for Arabic,
>> N'Ko, or Mongolian, but I thought these were added so that complex shaping tests are not
>> affected by side effects from the glyphs in a font.
>>   Although I haven't encountered any test contribution using CJK, I thought we would not need
>> similar lines to load web font files if a test does not have such complexity...
>>
>>
>>   Does anyone have some advice or knowledge? For example, that if we rely on non-Latin characters
>> (or something like, say, the set covered by DejaVu, which was widely used in the old days though...),
>> we need to include lines for a web font...
>>   I may have missed some manual in this area, and sorry if there is something clearly stating
>> this in the wpt manuals.
>>
>>
> 

Received on Thursday, 9 December 2021 09:05:24 UTC