- From: Glenn Maynard <glenn@zewt.org>
- Date: Thu, 1 Dec 2011 23:21:11 -0500
- To: public-texttracks@w3.org
- Cc: "whatwg@whatwg.org" <whatwg@whatwg.org>
- Message-ID: <CABirCh-_wo9ESSAAdL_DkvjYSz=rmM0JpU-7aPz_bV6Run3K=Q@mail.gmail.com>
On Thu, Dec 1, 2011 at 7:34 PM, Ian Hickson <ian@hixie.ch> wrote:

> > But it doesn't have to, since HTML does this with @lang.
>
> HTML doesn't do any font selection or word wrapping.
>
> Per the HTML and CSS specs, lang="" has no effect on rendering.
>

Huh? I'm confused:

http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes

"User agents may use the element's language to determine proper processing
or rendering (e.g. in the selection of appropriate fonts or pronunciations,
or for dictionary selection)."

> > I've never seen a Japanese font that didn't look terrible for English
> > text. Also, I don't want my font selection to be severely limited due
> > to the need to use a single font for both languages, instead of using
> > the right font for the right text.
>
> Instead of working around poor fonts in all our various languages, we
> should just fix the fonts.
>

Even if the English glyphs in a Japanese font aren't ugly, it's still not
what I want to see for English text; I want to choose that font from the
mountains of Western fonts available. Browsers already allow the user to
set fonts on a per-language basis, and we already have a language tagging
mechanism to take advantage of it.

CJK glyph selection is a more serious problem than font prettiness, though.
The renderer needs to know the language to do this correctly (without
guesswork).

> If we are to add language information to the language, there's four ways
> to do it: inline, cue-level, block-level (a section of the file, e.g.
> setting a default at different points in the file), and file-level.
>
> Inline would look like this:
>
>    WEBVTT
>
>    cue id
>    00:00:00.000 --> 00:00:01.000
>    <lang en>cue text says <lang fr>bonjour</lang></lang>
>
> File-level would look like this:
>
>    WEBVTT
>    language: fr
>
>    cue id
>    00:00:00.000 --> 00:00:01.000
>    bonjour
>
> I suppose we'd need both. I wouldn't propose cue-level or block-level.
>
> How important is this for v1?
>

I think it's fine for inline tagging to wait for v2, as long as it's on the
roadmap eventually. It would be nice to have a file-level language tag
earlier. That way, less content and tools written before v2 will lack the
language tag.

Of course, people will still omit language tags, especially in
hand-authored files, because the content will probably look right to *them*
using the UA's default language. I don't have a solution to this (at least,
none that vendors will actually implement; "always be wrong by default"
would do it), but this is no different than @lang.

> In English, at least, it seems pretty common. Almost all cues seem to get
> manually line-broken so as to maintain line length balance, for instance.
>

That's exactly what should be discouraged, just as authors should be
letting the HTML renderer handle line breaking, not inserting <br>s
manually. If wrapping to similar line lengths is what's wanted, then the
VTT renderer (and so CSS) should support that.

> Actually the size isn't such a big deal since the font size is just based
> on the video size.
>

But that's not fixed; as soon as users set a minimum font size in their UA
(due to eyesight, or for readability on a phone), the text will be rendered
at a larger size than it was authored in, which breaks manual word
wrapping. Authors should be encouraged to let the renderer handle wrapping,
just as with HTML.

> What's the use case, though? If it's notes to a translator, or notes about
> uncertain captioning, presumably you would want to strip those out before
> publishing the captions.
>

Commercial translations would probably strip comments, but we just left
them in--if people want to poke at the comments to see why we translated
something one way or another, that's cool. I'm fine with leaving this to
styling (removing them if necessary for publishing); this isn't important
enough to do more than that for.

(Note that this may lead to content in the wild that breaks if CSS is
disabled, especially if tools use this method; people won't always strip
comments. I'm not worried about that--I don't personally consider it
reasonable to expect all content to render sanely with stylesheets
disabled.)

-- 
Glenn Maynard
Received on Friday, 2 December 2011 04:21:40 UTC