- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Wed, 18 Oct 2017 05:47:40 +1000
- To: "Michael A. Peters" <mpeters@domblogger.net>
- Cc: WHAT Working Group <whatwg@lists.whatwg.org>
We could specify that WebVTT cues of type metadata should contain valid
JSON - that would make sense to me.

Cues of type captions or subtitles should get parsed by the addCue()
function of the TextTrack API - but not all browsers implement this yet.
It would be worth registering bugs on browsers.

Cheers,
Silvia.

On 18 Oct. 2017 2:51 am, "Michael A. Peters" <mpeters@domblogger.net> wrote:

> On 10/16/2017 10:08 AM, Roger Hågensen wrote:
>
>> On 2017-10-14 10:13, Michael A. Peters wrote:
>>
>>> I use TextTrack API but its documentation does not specify that it
>>> closes open tags within a cue, in fact I'm fairly certain it doesn't
>>> because some people use it for JSON and other non-tag-related
>>> content.
>>>
>> Looking at https://www.html5rocks.com/en/tutorials/track/basics/
>> it seems JSON can be used, no idea if content type is different or not
>> for that.
>>
>>> Some errors using the tracks in XML were solved by the innerHTML trick
>>> where I create a separate html document, append the cue, and then grab
>>> the innerHTML but that doesn't always work to close tags when html
>>> entities are part of the cue string.
>>>
>>
>> Mixing XML and HTML is not a good idea. Would it not be easier to have
>> the server send out proper XML instead of HTML? Valid XML is also valid
>> HTML (the reverse is not always true).
>>
>
> I agree, but what I was using an html document for - when using JS
> innerHTML it has closing tags so the only issue would be tags that html
> itself does not close (e.g. br) but those are not applicable with a WebVTT
> cue - which is only supposed to support a very small number of tags, all
> of which have closing tags.
>
> The problem is WebVTT does not require tags be closed in a cue, e.g.
>
> 04:05.000 --> 04:07.250
> <c.foo>This is a cue.
>
> That's allowed in WebVTT
>
> I convert c.foo into
>
> <span class="foo">This is a cue.
>
> and when I add that to the html document and use innerHTML it then has the
> closing </span> on it.
>
> While it seems to work with some html entities, it breaks with others like
> &nbsp;
>
> So for now I have to just make sure all my WebVTT are closed and not use
> the hack that adds closing tags - but since WebVTT cues do not have to have
> closing tags, but the cues need to work in XML documents, a built-in parser
> in JS that can add missing closing tags I think would be a good thing.
>
>
>> And if XML and HTML is giving you issues then use JSON instead.
>> I did not see JSON mentioned in the W3C spec though.
>>
>
> I think the JSON in WebVTT cues is not spec but some are using it.
>
> Basically the TextTrack API seems to allow almost any string, it really has
> to as WebVTT is not static and the spec changes. I wouldn't mind JSON being
> added to WebVTT as it would be a handy way to encode metadata about the
> media but that's another topic.
>
> A built-in JS HTML parser may also be of benefit in preventing code
> injection, e.g. stripping out tags from a WebVTT cue that a website does
> not allow.
>
> The TextTrack API doesn't filter out things like script or other tags that
> aren't part of WebVTT which means any site that allows users to upload
> WebVTT files is creating a potential code injection vulnerability.
>
> Server-side code should filter it on upload, but it would be nice to
> *someday* be able to pass a string through a native JS filter much the same
> way we can with htmltidy server-side and remove all but white-listed tags
> and attributes and get back a cleaned string with all tags closed.
>
> It looks like Google has a library that does that but it isn't intended
> for client-side JS and may not be fast enough for things like phones to
> process time-sensitive cues (I don't know).
>
> I might be wrong but it looked like the Google library I found was
> intended for server-side Node.js use.
>
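
For illustration only, here is a rough sketch of the kind of filter Michael
describes above: it drops tags that are not on a whitelist and appends any
closing tags that are missing. The function name tidyCue and the constants
are made up for this example, not part of any browser API, and the whitelist
is assumed to be the WebVTT cue tag names (c, i, b, u, ruby, rt, v, lang).
It works on the raw string and never touches the DOM, so entities such as
&nbsp; are left exactly as written. It is not a vetted sanitizer.

```typescript
// Tag names assumed to be the ones WebVTT cue text allows.
const ALLOWED: string[] = ["c", "i", "b", "u", "ruby", "rt", "v", "lang"];

// Anything between "<" and the next ">" is treated as one tag-like token.
const TAG = /<([^<>]*)>/g;
// WebVTT timestamp tags such as <00:05.000> or <01:00:05.000> pass through.
const TIMESTAMP = /^(\d{2,}:)?\d{2}:\d{2}\.\d{3}$/;
// Optional "/", then the tag name; ".class" or " annotation" may follow.
const NAME = /^(\/?)([A-Za-z0-9]+)/;

function tidyCue(cue: string): string {
  const open: string[] = []; // stack of tag names that are currently open

  let result = cue.replace(TAG, (whole: string, inner: string): string => {
    if (TIMESTAMP.test(inner)) return whole;

    const m = NAME.exec(inner);
    // Unknown or malformed tag: drop the tag itself, keep surrounding text.
    if (!m || ALLOWED.indexOf(m[2]) === -1) return "";

    const name = m[2];
    if (m[1] === "") {
      // Opening tag: remember it and pass it through unchanged.
      open.push(name);
      return whole;
    }

    // Closing tag: ignore it if nothing with that name is open.
    const idx = open.lastIndexOf(name);
    if (idx === -1) return "";

    // Close everything opened after the matching tag, then the tag itself.
    let closers = "";
    while (open.length > idx) closers += `</${open.pop()}>`;
    return closers;
  });

  // Close whatever is still open at the end of the cue.
  while (open.length > 0) result += `</${open.pop()}>`;
  return result;
}
```

With the cue from the example above, tidyCue("<c.foo>This is a cue.")
gives back "<c.foo>This is a cue.</c>". Note that the text inside a dropped
tag (e.g. the body of a script element) is kept as plain text, and a
literal "<" that is not written as &lt; can confuse the tokenizer, so
server-side filtering on upload is still needed.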
Received on Tuesday, 17 October 2017 19:48:10 UTC