- From: r12a <ishida@w3.org>
- Date: Thu, 8 Mar 2018 17:05:03 +0000
- To: www International <www-international@w3.org>
https://www.w3.org/2018/03/08-i18n-minutes.html

text extract follows:

– DRAFT –
Internationalization Working Group Teleconference
08 March 2018

[2]Agenda [3]IRC log

[2] https://lists.w3.org/Archives/Member/member-i18n-core/2018Mar/0000.html
[3] https://www.w3.org/2018/03/08-i18n-irc

Attendees

Present
addison, Bert, Fuqiao, JcK, Katy, Nigel, pal, stpeter

Regrets

Chair
Addison Phillips

Scribe
stpeter

Contents

* [4]Meeting Minutes
  1. [5]Agenda
  2. [6]IMSC visiting us!
  3. [7]What Time is This Meeting At?
* [8]Summary of Action Items

Meeting Minutes

Agenda

<JcK> No

IMSC visiting us!

<nigel> [9]IMSC Issue 236

[9] https://github.com/w3c/imsc/issues/236

r12a: background
... IMSC uses Unicode characters, glyphs come out of fonts, rendering algos/engines are needed for complex scripts at times before glyphs are assigned; important in this discussion to be clear on terminology of character/codepoint vs. glyphs

JcK: are you talking about single code points or multiple that might result in a single grapheme?

r12a: single code points for this discussion

<r12a> [10]https://www.w3.org/TR/ttml-imsc1.0.1/#recommended-unicode-code-points-per-language

[10] https://www.w3.org/TR/ttml-imsc1.0.1/#recommended-unicode-code-points-per-language

pal: purpose is to provide guidance regarding subtitles; enhance chance that if author chooses text it will be supported by the user agent and properly rendered

pal: the intent is not to disallow certain code points or to require a rendering engine to not render certain code points

addison: I think this is an extremely tricky thing to specify

addison: first, implementers might see this as a required set, the only thing they have to support, etc.
addison: for example, you wouldn't necessarily have enough code points to properly render Arabic

pal: actually we have the common code points

addison: doesn't deal with the need for more glyphs in your font

pal: that's why worded in terms of code points, not glyphs

addison: naive implementation would have glyph per code point

pal: should we add a note about that?

addison: most people build a system there's an instance of it for Arabic users or whatever script is in play

addison: second point, CLDR has sets of characters like this by language (exemplar sets)

addison: it might be helpful to reference CLDR instead of defining your own

pal: we do reference CLDR - recommended set is a union of CLDR and ???

r12a: I'm worried about implementers too, but this section is about authors

r12a: my worry is that implementers won't see this as clearly

r12a: make it clear that this is a guide for a minimum set and for real support you should go further

r12a: also make it clear that implementers need to enable the display of the following sets of characters, not selecting those sets of characters

pal: output document should only contain those characters

addison: output document is displayed somewhere and needs to be displayed faithfully

addison: depends on how system that receives it is implemented

addison: shaping engine etc.

pal: annex is intended to be used by validator implementation

pal: validator that sees a character that's not in the recommended character set can flag a warning

addison: is this really a good idea?

pal: what's a bad idea is showing unsupported characters

pal: realistically no implementation is going to support all Unicode code points

addison: some implementations support everything but rather obscure code points (plane 2 Chinese, ancient scripts, etc.)
addison: what I see happen is trying to legislate fairly narrow character sets, whereas many rendering systems are more capable

pal: this is targeting not just browsers but embedded systems like TVs

pal: also, this has already proved useful

addison: implementers do have font and space limitations, but it's a slippery slope when recommending subsets of characters

r12a: I understand the intent, my concern is in how we describe that to people

r12a: e.g., if we said "these are the safe characters to use" makes more sense to me

r12a: this comes across as "these are the Hebrew (etc.) characters you should support" but these sets tend to grow to support new code points

pal: this is why we reference CLDR

r12a: unfortunately CLDR is not a panacea - it's missing things

pal: so let's fix CLDR

pal: not displaying a character is way worse

r12a: the crux is specifying a safe set of characters for authors without implying that implementers should limit the sets of characters they support

pal: what about starting the annex with that text?

r12a: that's the kind of thing I was looking for

<Zakim> nigel, you wanted to ask what action we can take to address the remaining concerns.

nigel: the struggle here is understanding exactly what the concern is and coming up with a proposal to address the concern

nigel: this discussion is helping

nigel: any other concerns we can surface here?
JcK: I'm concerned about where this might be leading; displaying the wrong character is much worse than displaying parts of a string and not other parts (for instance)

JcK: part of the concern is that there are many edge cases which can't be handled by this kind of approach

JcK: e.g., if you get text in Hebrew script but another language then you might not have the right code points to display things properly

JcK: there are traps here about writing this particular language with this particular script, but not other languages

pal: I captured another concern earlier about cautioning implementers that one code point != one glyph

r12a: if you're dealing with a complex script like Myanmar, there are more difficulties

addison: when people go font shopping, they can be satisfied with an inferior font and the rendering engine doesn't have the glyph that's necessary

pal: that's true regardless

r12a: that's part of my concern - we shouldn't let implementers off the hook and stymie forward progress (yes, these are embedded systems that aren't updated often)

pal: hard to phrase this in a technical document

addison: these things tend to ossify into a lowest common denominator or institutionalize some particular set of characters

pal: I think we're safe in the sense that systems support all of Unicode - we're not trying to create a chokepoint for code points

addison: not at document level but at the validator and authoring tool levels

pal: that's why we don't reference a particular version of CLDR for instance

JcK: the fact that CLDR exists does not imply that CLDR is correct

Katy: even defining a list of safe characters can vary quite wildly

Katy: to clarify, managing author expectations is difficult here

Katy: not just glyph display but processing and the like

nigel: maybe clarify for authors that you can't just get a glyph but there is more complexity - there might be fallback fonts and such (not just safe characters)

nigel: is there a document we can reference?
nigel: an informative document about rendering different characters correctly?

addison: a different place to look might be the various font standards, which have introduced language codes that are supported

<nigel> I heard r12a and katy express support for adding a note to explain that correct rendering of scripts goes beyond mapping code points to glyphs in a font

addison: there might be standardization there to look at - a different way of accomplishing the goal here

r12a: two questions: (1) the safe list here is presumably based on lowest common denominator for various devices?

pal: tables were built using a study of TV and motion picture content

pal: collecting all code points that were used in that context

r12a: (2) why are we not just referencing CLDR?

pal: there is a longstanding issue against CLDR to add a flag for text commonly appearing in subtitles

r12a: I think what would help is to add some text cautioning against ossification

pal: [summarizes feedback received so far]

pal: we can try to formulate text along those lines and come back for further feedback

stpeter: why not attack the problem at the CLDR level if they aren't properly supporting text needed in subtitles?

pal: everyone's goal is to move this to CLDR

addison: we'd be happy to support that as well

addison: we do have a liaison agreement

pal: subtitles and captions are becoming a global requirement and there are unique needs here; great example is musical note character

<Zakim> nigel, you wanted to note that ossification is not a feature of the list of characters but a wider issue

nigel: this point about ossification is a tricky one; e.g., if you deploy player code to a device, updates might not be available

nigel: e.g., a downloadable font could be possible, but more work is needed to support the right characters

nigel: how do we phrase this?
addison: good question

<r12a> [11]https://github.com/w3c/imsc/issues/236#issuecomment-367713408

[11] https://github.com/w3c/imsc/issues/236#issuecomment-367713408

r12a: that link has some suggested text but it might not be exactly what we need here - encourage folks to re-read

pal: I'll try to craft text based on the terms we used in this call today

addison: would you like us to say something to the CLDR folks?

pal: +1

nigel: +1

pal: I plan to propose text soon for review by folks here

addison: any concerns about supporting the CLDR trac?

JcK: I'm nervous because it would be great to get down to one standard instead of two; at the same time, CLDR has been criticized for being opaque to folks with actual language expertise and not just character coding expertise

addison: I'll take an action to focus it on the issue at hand

Action: addison: write to cldr on WG behalf about Trac 8915 including wording about getting exemplars right

<trackbot> Created ACTION-699 - Write to cldr on wg behalf about trac 8915 including wording about getting exemplars right [on Addison Phillips - due 2018-03-15].

pal: I will let you know when the proposed text is ready

Action: addison: make pal's new draft part of homework

<trackbot> Created ACTION-700 - Make pal's new draft part of homework [on Addison Phillips - due 2018-03-15].

addison: anything else on this topic?

What Time is This Meeting At?

<Katy> +1

r12a: typically don't change time until UK changes to Summer Time

addison: in favor

<Bert> (So no change for me then? That's good :-) )

<r12a> s/<JcK> No//

<r12a> s/<addison> trackbot, prepare teleconference//

Summary of Action Items

1. [12]addison: write to cldr on WG behalf about Trac 8915 including wording about getting exemplars right
2. [13]addison: make pal's new draft part of homework
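
Illustrative sketch (not part of the minutes): the validator behaviour pal describes, where a character outside the recommended set triggers a warning rather than an error, could look roughly like the following. The function name and the contents of RECOMMENDED_CODE_POINTS are hypothetical placeholders; they are not taken from the IMSC annex or from CLDR.

```python
import unicodedata

# Hypothetical stand-in for a per-language recommended set. These values are
# illustrative only and are NOT taken from the IMSC annex or from CLDR.
RECOMMENDED_CODE_POINTS = set(range(0x20, 0x7F)) | {0x266A}  # printable ASCII + EIGHTH NOTE


def find_unrecommended(text, recommended=RECOMMENDED_CODE_POINTS):
    """Return (index, 'U+XXXX', name) for each character outside the set.

    The result is advisory: callers should warn, not reject, since a renderer
    may well support characters beyond the recommended set.
    """
    warnings = []
    for index, ch in enumerate(text):
        if ch in "\r\n\t":
            continue  # ignore line breaks and tabs
        if ord(ch) not in recommended:
            name = unicodedata.name(ch, "<unnamed>")
            warnings.append((index, f"U+{ord(ch):04X}", name))
    return warnings


if __name__ == "__main__":
    for index, code_point, name in find_unrecommended("caf\u00e9 \u266a"):
        print(f"warning: {code_point} {name} at offset {index} is outside the recommended set")
```

A check like this only looks at individual code points. As raised in the discussion, it says nothing about whether a font and shaping engine can actually render the resulting text, since one code point does not imply one glyph.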
Received on Thursday, 8 March 2018 17:05:11 UTC