Re: Response to call for public review of WebVTT FPWD from David Singer on 2015-02-24 (public-texttracks@w3.org from February 2015)

From: David Singer <singer@apple.com>
Date: Tue, 24 Feb 2015 09:51:11 -0800
To: Andreas Tai <tai@irt.de>
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, "public-texttracks@w3.org" <public-texttracks@w3.org>
Message-id: <106DC302-B272-4EBF-B24D-8FD43D59AD63@apple.com>
> On Feb 24, 2015, at 9:48 , Andreas Tai <tai@irt.de> wrote:
> 
> Hi Silvia,
> 
> Thanks for all the detailed feedback on my comments. This makes it really encouraging to comment on this spec!!

Thank YOU for the comments.  I hope this encourages others to do the same!

> 
> I need a bit of time to go through your feedback and reply to it, but I will certainly do so!
> 
> Best regards,
> 
> Andreas
> 
> On 22.02.2015 at 11:46, Silvia Pfeiffer wrote:
>> Hi Andreas,
>> 
>> Thanks for all your feedback and sorry for the late reply - I'll give you some responses inline.
>> 
>> 
>> On Wed, Feb 18, 2015 at 3:42 AM, Andreas Tai <tai@irt.de> wrote:
>> Sorry for cross posting, but I think that this is relevant for both groups.
>> 
>> Best regards,
>> 
>> Andreas
>> 
>> 
>> -------- Forwarded Message --------
>> Subject: Response to call for public review of WebVTT FPWD
>> Resent-Date: Tue, 17 Feb 2015 16:40:38 +0000
>> Resent-From: public-timed-text@w3.org
>> Date: Tue, 17 Feb 2015 17:39:07 +0100
>> From: Andreas Tai <tai@irt.de>
>> To: public-timed-text@w3.org
>> Cc: David Singer <singer@apple.com>
>> 
>> Dear all,
>> 
>> I have been following the WebVTT spec for quite some time and wanted to 
>> respond to the general call for public review. My comments are 
>> observations and I hope they can be helpful for the WebVTT editor and the 
>> specification group. They are not intended as change requests. It is in 
>> the hands of the editor and/or the specification group to decide if any 
>> changes are needed or possible.
>> 
>> I post this on the TTML mailing list and on the text track community 
>> group list as subscribers may not have been merged yet.
>> 
>> ----------------------------------------------------------------------------
>> 
>> One concept of the WebVTT spec is to cleanly separate the following areas:
>> 
>> - data model
>> - syntax
>> - parsing
>> - rendering
>> - API
>> 
>> GENERAL OBSERVATIONS
>> 
>> It is an interesting approach to provide different sections to different 
>> target groups (e.g. WebVTT authors and WebVTT parser implementers) so 
>> they do not have to read the complete spec. My experience is (after 
>> reading different versions of WebVTT) that even for a specific task it 
>> is difficult to get the necessary information without reading through 
>> the complete spec.
>> 
>> If you are an author of WebVTT who wants to get the normative (!) text 
>> on how to write a timed subtitle in WebVTT that should appear at a 
>> specific position at the bottom of the screen, where the text should have 
>> a specific font size in relation to the video height, the text color of 
>> the first line should be white and the text color of the second line 
>> should be yellow, then it is not sufficient to just read through the data 
>> model and syntax sections. You have to read the rendering section, which 
>> also refers back to concepts of the parsing section.
>> 
>> 
>> You shouldn't need to read the rendering section, but you are right. You will need to read the CSS extensions section, though only for the color changes. Would it help to make the syntax of the CSS extensions a separate section? 
>> 
>> If, for the above task, you want to get the information about a specific 
>> presentation feature like positioning or writing direction, you have to 
>> extract the different pieces of information from every section. Often a 
>> part of a section depends on other parts. You have to know the general 
>> concepts that are outlined in a section (e.g. the concept of WebVTT 
>> nodes in the parsing section). Also, presentation features depend on 
>> each other (e.g. writing direction and positioning).
>> 
>> 
>> Positioning and writing direction should be sufficiently specified in the syntax. Of course, the syntax section is not a complete authoring guide - we have https://docs.webplatform.org/wiki/concepts/VTT_Captioning and other articles or tutorials on the Web for that.
>> 
>> Also, why would you need to understand the concept of WebVTT nodes in the syntax section? I don't follow. Can you explain?
>> 
>> To reassure yourself that you have authored a WebVTT file that will be 
>> processed exactly as you want (based on normative text), and also if you 
>> want to write a WebVTT compliant parser, you most probably have to read 
>> the complete spec.
>> 
>>  
>> There's a validator at https://quuz.org/webvtt/ that will help write valid WebVTT files.
>> If you want to write a parser, yes, you will need to read more than the syntax - also the parser section.
>> 
>> In that case it would be a big help if the information about a "feature" 
>> like positioning were all in one place (how it is represented in the 
>> data model, what the syntax looks like, how it is parsed and how it is 
>> rendered).
>> 
>> That doesn't work, because features are interdependent. What you are after is an authoring guide like https://docs.webplatform.org/wiki/concepts/VTT_Captioning . This spec follows the modern approach of writing W3C specs so that UI implementers are able to implement interoperably.
>> 
>> It is clear that, given how the spec is already written, a 
>> re-organisation of the spec text is difficult (e.g. the parsing section 
>> is one continuous algorithm). But an additional normative section that 
>> brings it all together may be a useful guide.
>> 
>> 
>> We cannot specify something twice normatively - that causes contradictions.
>> 
>> In general, the informative text that was added in the later stages of 
>> the editing process helps a lot. Sometimes I think this is actually 
>> necessary normative text (e.g. the notes on positioning in 
>> section 4.5).
>> 
>> 
>> We can help by adding more such descriptive text. It can, however, only be informative. If you have any concrete suggestions on what is under-described, please add a bug at https://www.w3.org/Bugs/Public/enter_bug.cgi?product=TextTracks%20CG .
>> 
>> Graphical representations would help a lot to understand the abstract 
>> concepts. One example case is positioning. The terms line position and 
>> text position have been difficult for me to relate to the concepts they 
>> represent. Pictures that visualize cue boxes, the writing directions and 
>> the positioning concepts would be great.
>> 
>> 
>> I agree - we should add some more visual examples. Do register some bugs so we won't forget about it.
>> 
>> Although the syntax seems consistent to me, it would be a great help if, 
>> in addition to the prose, there were a formal representation, e.g. in 
>> Extended Backus–Naur Form (EBNF). I remember that this was already 
>> proposed in the community group.
>> 
>> 
>> Can you propose an EBNF that covers the feature set? I doubt it's possible without making too many simplifications.
>> 
>> In the following I list some observations on specific sections, which 
>> sometimes also highlight more general issues.
>> 
>> -----------------------
>> Data Model
>> -----------------------
>> - Some concepts from the HTML spec are so vital to WebVTT that a short 
>> summary would be of help (most importantly text track and text track cue)
>> 
>> 
>> This is a problem that several W3C specs share. I'm waiting for the ReSpec authoring tools to make it possible to pull in text from another spec without having to re-type the text (because that would cause inconsistencies). 
>> 
>> - The title of 3.1 should possibly be "WebVTT cues" instead of "Text 
>> track cues". The first sentence in 3.1 indicates that WebVTT cues are 
>> instances of text track cues and all the following applies to WebVTT cues.
>> 
>> 
>> Bug registered: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28070
>> 
>> - In the section the term "text track cue" is used which actually should 
>> be WebVTT cue?
>> 
>> 
>> Also: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28070
>> 
>> - writing direction
>>     * Maybe it would help if it is made more concrete how "vertical" 
>> and "horizontal" relate to the rendering pane of the video?
>> 
>> 
>> How can you misunderstand horizontal and vertical? It's pretty well defined within a Web page and more generally, too. What is a "rendering pane"? I don't know better words to describe the dimensions.
>> 
>>     * Statements are made that apply to concepts that have not been 
>> explained until this part of the spec. So it is explained how the 
>> percentage of line position depends on the writing direction, but the 
>> concept of line position is explained further down. This is also true 
>> for other parts of the text.
>> 
>> 
>> OK, the statements about line position could be moved down to line position.
>> Bug registered: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28071
>>  
>>     On the one hand, linear reading is necessary because all normative 
>> statements that have been made apply to everything that follows. On the 
>> other hand, you have to read non-linearly and jump between parts of the 
>> spec. You may argue that this is the new form of reading, but the mixture 
>> of the concepts makes it difficult to get the complete picture.
>> 
>> 
>> Yes, we're trying to avoid using concepts that are defined later where possible.
>>  
>>  In this case 
>> the statements on line position would fit better in the paragraph about 
>> "line position".
>> 
>> Agreed. 
>> 
>> - snap-to-lines flag
>>     * The snap-to-lines flag is of type Boolean that can be "set" to 
>> "true" or "false". Instead of referring to these values sometimes the 
>> setting of the values is described with the verbs "set" (for setting to 
>> "true") and "unset" (for setting to "false"). It would be more 
>> consistent and help the reader if the operation is always described as 
>> "set to true" and "set to false".
>> 
>> 
>> OK. Bug registered: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28072
>>  
>> - line position
>>     * The line position actually refers more to the concept of a 
>> cue box than to the concept of a line. The first sentence states "The 
>> line position defines the position of the cue box.". It would be good 
>> to have a term that describes this "feature" as a property of the cue 
>> box instead of lines. As the syntax may be hard to change, it could help 
>> if a synonymous word could be found.
>> 
>> I've actually thought about this a lot and haven't come up with a better word. Maybe "cue position"?
>> It's "line position" for now for historic reasons, because it was started that way.
>> Bug registered: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28073
>>  
>>  Furthermore, a relationship to text 
>> position could be used in addition. While text position "defines the 
>> positioning of the cue box in the direction defined by the writing 
>> direction", the line position defines the position of the cue box 
>> orthogonal to the direction defined by the writing direction (?).
>> 
>> 
>> Same bug. https://www.w3.org/Bugs/Public/show_bug.cgi?id=28073
>> 
>> - text position
>>   * There should be a link to the region part when the region is mentioned.
>> 
>> 
>> Bug registered: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28074
>>  
>>   * The text position depends on the text alignment (which is explained 
>> further down). The next model element after text position is text 
>> position alignment. If you read linearly through the document, you 
>> easily confuse the two and attribute the dependency to text position 
>> alignment instead of text alignment.
>> 
>> Yeah, text position should also relate to cue box. So, should we talk about horizontal and vertical cue box position rather than line and text position? Problem with that is that we would need to rename the cue settings, which people have now become used to. Really not sure how to resolve this. If you have a good suggestion, please register a bug.
>> 
>>   * It would be clearer if it is stated explicitly that steps 2 to 4 
>> apply when text position is not set explicitly.
>> 
>> 
>> That's what the text in 1. already says: "Otherwise, the text track cue text position is the special value auto." . 
>> 
>> - text alignment
>>     * Paragraph direction is used as a term but the concept is not 
>> explained. start side of the cue box
>> 
>> 
>> There is a reference to [BIDI] . That's where it is defined.
>> 
>> Regions
>>   * A visual representation would help to get the difference between 
>> region anchor point and region viewport anchor point.
>> 
>> 
>> OK.  Bug registered: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28075
>> 
>> Syntax
>>   * In 4.1 a cue setting list, cue setting name and cue setting value 
>> seem not to be constrained. But they are further constrained to values 
>> by "4.5 WebVTT cue setting". A reference may be helpful.
>> 
>> 
>> They are deliberately not constrained in 4.1, because the text there just introduces the concepts. This allows us to introduce more cue settings at a later stage. 
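
As a rough illustration of the distinction discussed above (4.1 leaves cue setting names and values open, while 4.5 constrains specific settings), here is a minimal Python sketch. The function name `split_cue_settings` and the `KNOWN_SETTINGS` set are hypothetical. The six names are assumed to be the settings defined in section 4.5 of the draft and are not taken from this thread. This is not the spec's parsing algorithm, only a way to picture "generic name:value pairs first, specific settings second".

```python
from typing import Dict, Tuple

# Assumption: the setting names constrained in section 4.5 of the draft.
KNOWN_SETTINGS = {"vertical", "line", "position", "size", "align", "region"}


def split_cue_settings(settings: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a cue setting list into (known, unknown) name -> value maps.

    Items are separated by whitespace; each item is split at its first colon.
    Items without a colon, or with an empty name or value, are skipped.
    """
    known: Dict[str, str] = {}
    unknown: Dict[str, str] = {}
    for item in settings.split():
        name, sep, value = item.partition(":")
        if not sep or not name or not value:
            continue  # not a well-formed name:value pair
        target = known if name in KNOWN_SETTINGS else unknown
        target[name] = value
    return known, unknown


known, unknown = split_cue_settings("line:90% align:center foo:bar junk")
print(known)
print(unknown)
```

Keeping unrecognised pairs in a separate map, rather than rejecting them, mirrors the stated intent that further cue settings can be introduced at a later stage.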
>> 
>> Thanks for all the feedback. I'd like to encourage you to register more bugs where you would like to see improvements to the specification text. It's the best way to keep track.
>> 
>> Best Regards,
>> Silvia.
>> 
>> 
>> Best regards,
>> 
>> Andreas
>> 
>> -- 
>> ------------------------------------------------
>> Andreas Tai
>> Production Systems Television IRT - Institut fuer Rundfunktechnik GmbH
>> R&D Institute of ARD, ZDF, DRadio, ORF and SRG/SSR
>> Floriansmuehlstrasse 60, D-80939 Munich, Germany
>> 
>> Phone: +49 89 32399-389 | Fax: +49 89 32399-200
>> http://www.irt.de | Email: tai@irt.de
>> 
>> ------------------------------------------------
>> 
>> registration court&  managing director:
>> Munich Commercial, RegNo. B 5191
>> Dr. Klaus Illgner-Fehns
>> ------------------------------------------------
>> 
>> 
>> 
> 
> 

David Singer
Manager, Software Standards, Apple Inc.
Received on Tuesday, 24 February 2015 17:51:41 UTC