- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Fri, 8 Jun 2012 11:19:53 +1000
- To: Glenn Maynard <glenn@zewt.org>
- Cc: David Singer <singer@apple.com>, public-texttracks@w3.org
On Fri, Jun 8, 2012 at 10:58 AM, Glenn Maynard <glenn@zewt.org> wrote:
> On Thu, Jun 7, 2012 at 7:22 PM, David Singer <singer@apple.com> wrote:
>>
>> While we can't do away with script detection, we can strongly encourage
>> people to tag things properly. Especially if we don't have time to specify
>> uniform sniffing.
>
> That's much easier with WebVTT than HTML. WebVTT will have a primary
> language specified (from <track>) much more regularly than HTML, where
> people omit @lang more often than not and there's no good way to encourage
> them to set it.
>
> But, I'd suggest that it's possible to do away with script detection for
> WebVTT (not for HTML). Specify an ordered list of scripts. If a character
> isn't in the current language (or if no language was given), choose the
> first script applicable to the character. This would give a simple
> fallback, and users would be expected to always specify a language,
> including when mixing languages. A script detection heuristic would be
> better (as long as it's simple, reasonably accurate, and interoperable), but
> if that's too hard, this would be much better than defaulting to the
> locale-sensitive nightmare HTML is stuck in. (I don't know how
> implementable this is with the font rendering engines in browsers.)
>
> (I don't have much hope of browsers switching away from HTML language and
> charset detection that depends on the user's system language towards
> something interoperable, but if somebody is more optimistic than I am I'd
> love to see it tried. Keeping that problem from leaking into WebVTT seems
> like a more immediate problem, though, before we're stuck with it here too.)

I've somewhat lost track of what we are arguing about on this thread (it
started off with many topics). I thought script detection was suggested as a
way to switch automatically between identified languages, i.e. to avoid
introducing a <lang> element for VTT (which would be converted into
<span lang=en> in HTML).

In that vein: while I don't mind having automated script detection for VTT
files as an advanced feature, I think a video player should not be expected
to implement complex script detection just so we can support mixed-language
cues.

There are other use cases besides rendering scripts correctly that make a
<lang> element useful, such as getting the pronunciation right in speech
synthesizers or picking the right dictionary for spell checkers. See also
bug https://www.w3.org/Bugs/Public/show_bug.cgi?id=15922 .

In short, <lang> is required now so that we can get all the associated use
cases sorted out and enable developers to create well-authored files.
Automated script detection is then an advanced feature that is useful in the
absence of well-authored files.

Cheers,
Silvia.
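
To make the <lang>-to-<span lang=...> conversion mentioned above concrete, here is a minimal sketch in Python. It assumes a cue syntax of the form <lang xx>...</lang>, which is only an illustration of the proposal and was not settled in this thread; the function name lang_to_span is likewise hypothetical. It handles nothing but language spans and passes all other cue text through unchanged.

```python
import re

# Assumed (hypothetical) cue syntax: <lang xx> ... </lang>, where xx is a
# language tag such as "en" or "ja". Matches either an opener or a closer.
_LANG_TOKEN = re.compile(r"<lang\s+([A-Za-z0-9-]+)\s*>|</lang>")


def lang_to_span(cue_text: str) -> str:
    """Rewrite <lang xx>...</lang> spans in cue text as <span lang="xx">...</span>.

    Other cue markup and character escaping are out of scope for this sketch
    and are copied through untouched.
    """
    out = []
    pos = 0
    depth = 0  # number of currently open language spans
    for match in _LANG_TOKEN.finditer(cue_text):
        out.append(cue_text[pos:match.start()])
        tag = match.group(1)
        if tag is not None:
            # Opening tag: carry the language over to the HTML lang attribute.
            out.append(f'<span lang="{tag}">')
            depth += 1
        elif depth > 0:
            # Closing tag that matches an open span.
            out.append("</span>")
            depth -= 1
        # A stray </lang> with no matching opener is dropped.
        pos = match.end()
    out.append(cue_text[pos:])
    # Close any spans the author left open at the end of the cue.
    out.append("</span>" * depth)
    return "".join(out)


if __name__ == "__main__":
    print(lang_to_span("She said <lang fr>bonjour</lang> and left."))
```

Once the language is explicit in the markup like this, a renderer, speech synthesizer, or spell checker can read it directly instead of guessing from the characters.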
Received on Friday, 8 June 2012 01:20:42 UTC