- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Tue, 14 Sep 2010 18:27:28 +1000
On Tue, Sep 14, 2010 at 6:11 PM, Philip Jägenstedt <philipj at opera.com> wrote:

> On Mon, 13 Sep 2010 15:50:09 +0200, Silvia Pfeiffer <silviapfeiffer1 at gmail.com> wrote:
>
>> On Mon, Sep 13, 2010 at 5:55 PM, Philip Jägenstedt <philipj at opera.com> wrote:
>>
>>> On Sat, 11 Sep 2010 01:27:48 +0200, Silvia Pfeiffer <silviapfeiffer1 at gmail.com> wrote:
>>>
>>>> On Fri, Sep 10, 2010 at 11:00 PM, Philip Jägenstedt <philipj at opera.com> wrote:
>>>>
>>>>> On Thu, 09 Sep 2010 15:08:43 +0200, Silvia Pfeiffer <silviapfeiffer1 at gmail.com> wrote:
>>>>>
>>>>>> On Wed, Sep 8, 2010 at 9:19 AM, Ian Hickson <ian at hixie.ch> wrote:
>>>>>>
>>>>>>> On Fri, 23 Jul 2010, Philip Jägenstedt wrote:
>>>>>>>
>>>>>>>> If we must have both kind=subtitles and kind=captions, then I'd suggest
>>>>>>>> making the default subtitles, as that is without a doubt the most common
>>>>>>>> kind of timed text. Making captions the default only means that most
>>>>>>>> timed text will be mislabeled as being appropriate for the HoH when it
>>>>>>>> is not.
>>>>>>>
>>>>>>> Ok, I've changed the default. However, I'm not fighting this battle if it
>>>>>>> comes up again, and will just change it back if people don't defend having
>>>>>>> this as the default. (And then change it back again if the browsers pick
>>>>>>> "subtitles" in their implementations after all, of course.)
>>>>>>>
>>>>>>> Note that captions aren't just for users that are hard-of-hearing. Most of
>>>>>>> the time when I use timed tracks, I want captions, because the reason I
>>>>>>> have them enabled is that I have the sound muted.
>>>>>>
>>>>>> Hmm, you both have good points. Maybe we should choose something as the
>>>>>> default that is not visible on screen, such as "descriptions"?
>>>>>> That would avoid the issue and make it explicit for people who provide
>>>>>> captions or subtitles that they have to make a choice.
>>>>>
>>>>> If we want people to make an explicit choice, we should make kind a
>>>>> required attribute and make browsers ignore <track>s without it. (I think
>>>>> subtitles is a good default though.)
>>>>
>>>> I think you misunderstood - my explanation probably wasn't very good. I'm
>>>> looking at it from the authoring POV.
>>>>
>>>> What I meant was: if I author a text track that is supposed to be visible on
>>>> screen as the video plays back and if we choose either @kind=subtitle or
>>>> @kind=caption as the default, then I don't have to really think through
>>>> what I authored as it will be displayed on screen. This invites people
>>>> to not distinguish between whether they authored subtitles or captions,
>>>> which is a bad thing, because a deaf user may then get tracks with the wrong
>>>> label and expectations. If, however, we choose as a default something that
>>>> is not visible on screen, e.g. @kind=description or @kind=metadata, then the
>>>> author who wants their text track to be visible on screen has to give it a
>>>> label, i.e. make an explicit choice between @kind=subtitle and
>>>> @kind=caption. I believe this will lead to more correctly labeled content. I
>>>> am therefore strongly against default labeling with either subtitle or
>>>> caption. We could make @kind a required attribute instead as you are saying.
>>>
>>> OK, I think we mostly agree. Any default will sometimes be wrong, so to not
>>> have to choose between subtitles and captions, I'd still really prefer if
>>> specific HoH-tags like <sound> can be shown or hidden depending on user
>>> preference.
>>> I think that would lead to more content actually being written
>>> for HoH users, as it doesn't require maintaining 2 different files.
>>
>> Ah, you are talking about some kind of CSS marker for the audio events that
>> are marked up in a caption file and that could simply be "display: none"
>> if they are viewed as a subtitle. Interesting idea... not sure that
>> matches with the current spec though.
>
> The spec already has <sound>, what's missing is making the default styling
> of it depend on user preference and making this the recommended way of
> delivering HoH content.
>
>>>>>> many new files will not play in the software created for the old spec.
>>>>>
>>>>> As long as we don't add a header, the files will play in most existing
>>>>> software. Apart from parsers that assume that SRT is plain text (and thus
>>>>> would be unsuitable for much existing SRT content), what kind of breakage
>>>>> have you found with WebSRT-specific syntax in existing software?
>>>>
>>>> I think we need to add a header - and possibly other things in the future.
>>>> Will we forever have the SRT restrictions hold back the introduction of new
>>>> features into WebSRT?
>>>
>>> Yes, if we extend SRT we can't break compatibility. However, it seems that
>>> all the extensibility needed already exists, as arbitrary tag names are
>>> handled by the parser.
>>
>> Your analysis of what format for headers we can introduce without breaking
>> old SRT files speaks against that. Whatever extensions we introduce beyond
>> what we currently have will break compatibility with some and increasingly
>> more old SRT parsing software. Not to speak of format compatibility, which
>> is already a non-given.
>
> You're right, adding a header breaks SRT compat.
>
>>>>> Allowing anything as part of the syntax is a bit
>>>>> dangerous though, as most unrecognized stuff between cues are likely
>>>>> broken cues.
>>>>> Validators should warn about it, not treat it as a comment.
>>>>
>>>> I wasn't aware of the effect of the standardised parsing algorithm for
>>>> WebSRT allowing "broken cues" to be dealt with. This will effectively mean
>>>> that a parser will be required to parse all files that it is given from
>>>> beginning to end and discard all non-conformant lines - even if that file
>>>> may be a 100GB large movie file. In this case, I would really recommend that
>>>> we put a magic identifier at the beginning of Web SRT files so we can be
>>>> sure that the intention of the file was to be a WebSRT file. Let's have the
>>>> string "WebSRT" at the beginning of the files.
>>>
>>> That's a good point. I don't suppose it's a huge problem in practice that
>>> errors can't be detected until EOF, but it's certainly not a desirable
>>> feature. To maintain some sanity, we probably ought to either require the
>>> correct MIME type or require the correct magic bytes. From the <video> MIME
>>> type debacle, I think I slightly prefer magic bytes to be checked by the
>>> parser.
>>>
>>> I've also argued for the inclusion of metadata, so I'm beginning to warm up
>>> to the idea of adding a header beginning with "WebSRT" or some such. If we
>>> do this, no existing SRT content can be reused, but we can still try to make
>>> it possible for WebSRT files to be reusable in desktop applications, by
>>> keeping the syntax highly compatible so that the same parser can be used for
>>> both without a mode switch.
>>
>> Sounds good to me. I'm sure browsers would find a way to have old SRT files
>> slip through the cracks, but that's not what we should be specifying for. SRT
>> could IMHO be a second format to support in <track> elements, but WebSRT
>> should be the baseline.
>
> The point of a header is that browsers can identify WebSRT files and not
> keep parsing through a 100GB movie file, so if we do add a header then no
> existing SRT files will work. I certainly don't want to support SRT and
> WebSRT as *different* formats.
>
>> So, thinking about that header: from your analysis of the existing files:
>> did you have many starting with @.. ?
>
> 22/10000 files have lines starting with @, but since this is only in the
> header, I don't think it matters.
>
>> I'd be happy for the name-value pairs spec that Ian mentioned, which could
>> then lead to something like the following as header:
>>
>> WebSRT
>> @language --> en-US
>> @kind --> subtitle
>> @cueformat --> plain/minimal/metadata
>> @author --> Frank, Charlie, Anna
>> @date --> 20th September 2010
>> @copyright --> WGBH, 2010
>> @license --> CC-BY-SA, http://creativecommons.org/licenses/by-sa/3.0/
>
> I'd say that the simplest approach is probably requiring the first line to
> be "WebSRT", and then all lines up to the first blank line are defined as
> the header. I'm not sure what the point of using @ is, and using --> here
> seems weird as it's used for a range in the timing line, something quite
> different.

I thought the argument for this was that it makes for backwards
compatibility with existing SRT parsers. I would be very happy to drop these
and just use "name": "value", or "name": "value1", "value2" and possibly
even "name": { "value1", "value2" } where the latter works for multi-line
metadata.

> I think the following would be simpler:
>
> WebSRT
> language: en-US
> author: Frank
> date: 2010-09-20
>
> (allowing free form dates makes it non-machine-readable, so why bother?)

Yup, sure. I was more concerned to figure out which fields might be
important.

>> Further, with your analysis, it seemed like the following could be
>> acceptable for comments:
>>
>> // Lines starting with // are comments
>
> Yes, but do we need comments in the cues at all?
> Since SRT has no comments, this would make the cue format incompatible
> too, in which case we can just stop pretending that there's any
> relationship to SRT.

Would comments be in cues? I would think they would only be allowed in
between cues, thus making them a broken cue for existing SRT parsers.

Silvia.
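
To make the header idea discussed in this message concrete, here is a minimal Python sketch of the simpler variant: the first line must be exactly "WebSRT", and every line up to the first blank line is read as a "name: value" pair. This only illustrates the proposal as worded in the thread and is not a specified format. The names `parse_websrt_header` and `NotWebSRT` are hypothetical, and skipping malformed header lines silently is an assumption of the sketch.

```python
from typing import Dict, Iterable


class NotWebSRT(ValueError):
    """Raised when the input does not start with the proposed magic line."""


def parse_websrt_header(lines: Iterable[str]) -> Dict[str, str]:
    """Parse the header proposed in this thread (hypothetical, not a spec).

    Assumptions: the first line is exactly "WebSRT", and every following
    line up to the first blank line is a "name: value" pair.
    """
    it = iter(lines)
    first = next(it, "").rstrip("\r\n")
    if first != "WebSRT":
        # Reject on the first line instead of scanning a huge unrelated file.
        raise NotWebSRT("missing 'WebSRT' magic line")

    header: Dict[str, str] = {}
    for raw in it:
        line = raw.rstrip("\r\n")
        if not line.strip():
            break  # the first blank line ends the header
        name, sep, value = line.partition(":")
        if not sep:
            continue  # assumption: skip malformed lines; a validator might warn
        header[name.strip()] = value.strip()
    return header


sample = [
    "WebSRT\n",
    "language: en-US\n",
    "author: Frank\n",
    "date: 2010-09-20\n",
    "\n",
    "1\n",
    "00:00:01,000 --> 00:00:04,000\n",
    "Hello\n",
]
print(parse_websrt_header(sample))
```

Because the check happens on the first line, a file that is not WebSRT (for example a plain SRT file, or a large movie file served by mistake) is rejected without reading further, which is the motivation for the magic line given in the thread.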
Received on Tuesday, 14 September 2010 01:27:28 UTC