- From: Glenn Maynard <glenn@zewt.org>
- Date: Tue, 10 Apr 2012 17:11:18 -0500
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Cc: Frank Olivier <Frank.Olivier@microsoft.com>, "public-texttracks@w3.org" <public-texttracks@w3.org>
- Message-ID: <CABirCh9F2PxE=GG_8-gWPG4NE3wfYt-K=h5dW4hXu-nhNo-eXw@mail.gmail.com>
On Mon, Apr 9, 2012 at 11:03 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> > I'm not sure, exactly. Users probably have different preferences, so I'd
> > suggest leaving this up to browsers. (Since you can't precisely control
> > font rendering, sites can't depend on captions coming out a precise size on
> > all browsers anyway, so I don't think this reduces interop.)
>
> They're roughly the same, which is, I believe, sufficient for interop.

They're not. If I configure my browser to use a minimum font size of 14pt, and your captions are authored for 10pt, then it's going to be significantly different. Letting people pick their own font size is critical (anyone with less than 20/20 vision who has tried to read a webpage authored by someone with 20/10 understands this all too well).

Also, font sizes for videos displayed on a phone may be in a different proportion to the video than font sizes on a TV, depending on the size and resolution of the display.

> OK, but there is a large number of existing content that uses those
> hand-crafted newlines. I think they should continue to be supported.

I'm not suggesting that it not be supported, of course, as I said.

> If a user instead prefers to have the browser do the line breaks, they
> can always remove the newlines when they are converting from existing
> content to WebVTT and specify a "size" on the cues to determine at
> which width the line break should occur.

This isn't something that users should have to enable. That's just creating extra modes, which means more testing for every author and/or a mode that will never work. Users should never have to care about this. If it takes a non-default mode to get sane wrapping, that just means 99% of users will never have sane wrapping.

> The thing is: right now we are supporting both (automated line breaks
> and hard line breaks) in a simple manner.
> If we required <br>s for
> line breaks, that would bring extra overhead for no apparent advantage
> (at least none that I could directly point out).

Most critically, requiring <br> encourages authors, especially those coming from SRT and HTML, not to hand-wrap data. If you have to say <br> to get a line break, that's going to go a long way towards getting people to realize that they're not supposed to be wrapping every line by hand. With the current model, I think it's a very strong guarantee that a majority of authors will always hand-wrap content. Not because they're following any particular style guide--just because they think it's the only way to do it.

(It also allows authors to wrap long lines in their editor without causing line breaks in captions, just as in HTML.)

> > Note that SSA/ASS captions (the most common formats for fansubbing) usually
> > do use automatic word-wrapping.
>
> That's likely because their cues are specified on one line [1]. In
> order to force a new line, you have to insert {\N}, making the cue
> even less readable. I assume people would rather author another cue
> instead of doing this.

The point is that the SSA formats handle wrapping correctly.

> I think there are good arguments for both positions: explicitly
> calling out newlines makes it clear to people where their cue text may
> be broken, but makes it harder to read.

When you're using manual line breaks "correctly"--that is, for the occasional times when you really do legitimately need a line break--it doesn't make it hard to read. If you're pessimistically converting from SRT, you'd need to insert <br> on each line, since you can't tell for sure in SRT whether a newline was actually an important line break or not, but I don't think that's a problem. (It's not that hard to read, and who's sitting around, reading the source code of automatically converted caption files?)
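
As a rough illustration of the pessimistic SRT conversion, here is a minimal Python sketch. It assumes the <br> markup being proposed in this thread, which is not established WebVTT syntax, and `srt_payload_to_cue_text` is a hypothetical helper name. It handles only the cue text, not timestamps or cue numbering.

```python
def srt_payload_to_cue_text(srt_lines):
    """Hypothetical helper: keep every SRT newline as an explicit <br>.

    <br> is the markup proposed in this thread, not established WebVTT
    syntax. SRT gives no way to tell an important break from a casual
    one, so every newline is preserved.
    """
    # Drop empty lines, trim stray whitespace, and put the cue on one line.
    return "<br>".join(line.strip() for line in srt_lines if line.strip())


print(srt_payload_to_cue_text(["Mark pushed his black", "truck."]))
```
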

> I guess it depends on whether
> we can find a good enough "line balancing" algorithm that will provide
> for the quality of captions that people have come to expect [2].
>
> For example, the caption key clearly states that this is an
> inappropriate caption rendering:
> Mark pushed his black
> truck.
>
> While in contrast this is appropriate:
> Mark pushed
> his black truck.

This part is easy; the algorithm I suggested before handles it. Basically, take the regular word-wrapping algorithm, which results in the first version. Note the number of line breaks it results in: 1. Then, insert that number of line breaks evenly along the line, to result in the least deviation in each line's length. (The last part would need to be more explicit, of course, but I don't think it's difficult.)

> Here are some of the rules it states:
> * Do not break a modifier from the word it modifies.
> * Do not break a prepositional phrase.
> * Do not break a person’s name nor a title from the name with which it
> is associated.
> * Do not break a line after a conjunction.
> * Do not break an auxiliary verb from the word it modifies.

These are exactly the sorts of things &nbsp; is for. If you want to carefully edit subtitles to follow these rules, then that just means using it appropriately. That's much saner than baking word wrapping into the file.

(I've suggested supporting &nbsp; before--a literal U+00A0 NO-BREAK SPACE in a document is essentially impossible to edit without a specialized editor, so I'll just reiterate that with the addition of the above use cases.)

> * Never end a sentence and begin a new sentence on the same line
> unless they are short, related sentences containing one or two words.

<br> is fine here.

> That - in my mind - is,
> however, a different issue to whether we introduce explicit markup for
> line breaks or not. I don't think we need the extra markup. I do think
> though that we need the extra line balancing algorithm.
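
The balancing step described above (run the regular word wrap, count the breaks, then redistribute the same number of breaks as evenly as possible) could be sketched roughly as follows. This is only an illustration under some assumptions: line length is measured in characters as a stand-in for rendered width, "least deviation" is taken to mean the smallest gap between the longest and shortest line, and the search is brute force, which is only reasonable because cues are short. `greedy_wrap` and `balanced_wrap` are hypothetical names.

```python
from itertools import combinations


def greedy_wrap(words, width):
    """Regular word wrap: fill each line as far as it will go."""
    lines, current = [], []
    for word in words:
        candidate = " ".join(current + [word])
        if current and len(candidate) > width:
            lines.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(" ".join(current))
    return lines


def balanced_wrap(text, width):
    """Hypothetical helper: rebalance a greedy wrap without adding lines."""
    # Split on ordinary spaces only, so U+00A0 keeps two words together.
    words = [w for w in text.split(" ") if w]
    greedy = greedy_wrap(words, width)
    if len(greedy) < 2:
        return greedy  # nothing to balance
    best, best_spread = greedy, None
    # Try every placement of the same number of breaks between words.
    for breaks in combinations(range(1, len(words)), len(greedy) - 1):
        bounds = (0, *breaks, len(words))
        lines = [" ".join(words[a:b]) for a, b in zip(bounds, bounds[1:])]
        lengths = [len(line) for line in lines]
        if max(lengths) > width:
            continue  # this placement would overflow the available width
        spread = max(lengths) - min(lengths)
        if best_spread is None or spread < best_spread:
            best, best_spread = lines, spread
    return best


print(balanced_wrap("Mark pushed his black truck.", 22))
print(balanced_wrap("Mark pushed his\u00a0black truck.", 22))
```

At a width of 22 characters, the plain sentence comes out as "Mark pushed his" / "black truck.", which is more even than the greedy result but still separates "his" from "black". With a no-break space between those two words, the same sketch yields "Mark pushed" / "his black truck.".
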

A good balancing word-wrapping mode is a prerequisite for telling people that they shouldn't break lines by hand. I do agree that it seems important, even on its own.

--
Glenn Maynard
Received on Tuesday, 10 April 2012 22:11:47 UTC