
Re: [whatwg] JavaScript function for closing tags

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 18 Oct 2017 05:47:40 +1000
Message-ID: <CAHp8n2=q_K9ayKW3sb7Ot+7qPsFNLRTQQvwdxrAAX_jS4+NJNQ@mail.gmail.com>
To: "Michael A. Peters" <mpeters@domblogger.net>
Cc: WHAT Working Group <whatwg@lists.whatwg.org>
We could specify that WebVTT cues of type metadata should contain valid
JSON - that would make sense to me.
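
As a rough illustration of that idea, a script reading a metadata cue could treat the cue text as JSON and fall back gracefully when it is not. This is a minimal sketch only: the helper name parseMetadataCueText and its null-on-failure convention are assumptions for illustration and are not part of WebVTT or the TextTrack API.

```typescript
// Hypothetical helper (not part of any spec): try to read a metadata cue's
// text as JSON. Returns the parsed value, or null when the text is not
// valid JSON.
function parseMetadataCueText(cueText: string): unknown {
  try {
    return JSON.parse(cueText);
  } catch (_err) {
    return null; // the payload was not valid JSON
  }
}

// Example payloads; the "chapter" field is made up for illustration.
const parsed = parseMetadataCueText('{"chapter": 3}');
const rejected = parseMetadataCueText("<c.foo>This is a cue.");
```

In a browser the input would typically be the text attribute of a VTTCue taken from a track whose kind is "metadata".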

Cues of type captions or subtitles should get parsed by the addCue()
function of the TextTrack API - but not all browsers implement this yet.
It would be worth filing bugs against those browsers.

Cheers,
Silvia.


On 18 Oct. 2017 2:51 am, "Michael A. Peters" <mpeters@domblogger.net> wrote:

> On 10/16/2017 10:08 AM, Roger Hågensen wrote:
>
>> On 2017-10-14 10:13, Michael A. Peters wrote:
>>
>>> I use the TextTrack API but its documentation does not specify that it
>>> closes open tags within a cue; in fact I'm fairly certain it doesn't,
>>> because some people use it for JSON and other non-tag-related
>>> content.
>>>
>> Looking at https://www.html5rocks.com/en/tutorials/track/basics/
>> it seems JSON can be used, no idea if content type is different or not
>> for that.
>>
>>> Some errors using the tracks in XML were solved by the innerHTML trick
>>> where I create a separate html document, append the cue, and then grab
>>> the innerHTML but that doesn't always work to close tags when html
>>> entities are part of the cue string.
>>>
>>
>> Mixing XML and HTML is not a good idea. Would it not be easier to have
>> the server send out proper XML instead of HTML? Valid XML is also valid
>> HTML (the reverse is not always true).
>>
>
> I agree, but here is what I was using an HTML document for: when read back
> through JS innerHTML, the markup has closing tags, so the only issue would
> be tags that HTML itself does not close (e.g. br). Those are not applicable
> to a WebVTT cue, which is only supposed to support a very small number of
> tags, all of which have closing tags.
>
> The problem is WebVTT does not require tags be closed in a cue, e.g.
>
> 04:05.000 --> 04:07.250
> <c.foo>This is a cue.
>
> That's allowed in WebVTT
>
> I convert c.foo into
>
> <span class="foo">This is a cue.
>
> and when I add that to the html document and use innerHTML it then has the
> closing </span> on it.
>
> While it seems to work with some html entities, it breaks with others like
> &#160;
>
> So for now I just have to make sure all tags in my WebVTT cues are closed
> and not use the hack that adds closing tags. But since WebVTT cues do not
> have to have closing tags, yet the cues need to work in XML documents, I
> think a built-in JS parser that can add missing closing tags would be a
> good thing.
>
>
>> And if XML and HTML are giving you issues then use JSON instead.
>> I did not see JSON mentioned in the W3C spec though.
>>
>
> I think JSON in WebVTT cues is not in the spec, but some are using it.
>
> Basically the TextTrack API seems to allow almost any string, it really has
> to as WebVTT is not static and the spec changes. I wouldn't mind JSON being
> added to WebVTT as it would be a handy way to encode metadata about the
> media but that's another topic.
>
> A built-in JS HTML parser may also be of benefit in preventing code
> injection, e.g. stripping out tags from a WebVTT cue that a website does
> not allow.
>
> The TextTrack API doesn't filter out things like script or other tags that
> aren't part of WebVTT which means any site that allows users to upload
> WebVTT files is creating a potential code injection vulnerability.
>
> Server-side code should filter it on upload, but it would be nice to
> *someday* be able to pass a string through a native JS filter much the same
> way we can with htmltidy server-side and remove all but white-listed tags
> and attributes and get back a cleaned string with all tags closed.
>
> It looks like Google has a library that does that but it isn't intended
> for client-side JS and may not be fast enough for things like phones to
> process time-sensitive cues (I don't know).
>
> I might be wrong but it looked like the Google library I found was
> intended for server-side Node.js use.
>
>
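
For illustration, the two wishes in the quoted message (adding missing closing tags, and stripping tags a site does not allow) can be approximated today with plain string handling. The following is a minimal sketch, not a vetted sanitizer and not an existing browser API: the names closeCueTags, WEBVTT_TAGS and TIMESTAMP are hypothetical, and the allowed list is assumed to be the WebVTT cue tag names (c, i, b, u, ruby, rt, v, lang). Because it never decodes the text, entities such as &#160; pass through untouched. It emits WebVTT cue text, so a conversion such as c.foo to a span element would be a separate step.

```typescript
// Assumption: tag names permitted in WebVTT cue text.
const WEBVTT_TAGS: ReadonlySet<string> = new Set([
  "c", "i", "b", "u", "ruby", "rt", "v", "lang",
]);

// Body of a timestamp tag such as "04:05.000" or "01:04:05.000".
const TIMESTAMP = /^(\d{2,}:)?\d{2}:\d{2}\.\d{3}$/;

// Hypothetical helper: drops tags that are not in `allowed`, keeps their
// text content, and appends closing tags for anything left open.
function closeCueTags(
  cueText: string,
  allowed: ReadonlySet<string> = WEBVTT_TAGS,
): string {
  const tagPattern = /<([^<>]*)>/g;
  const open: string[] = []; // stack of currently open tag names
  let out = "";
  let last = 0;
  // Escape any "<" that is not part of a complete tag.
  const text = (s: string): string => s.replace(/</g, "&lt;");

  let m: RegExpExecArray | null;
  while ((m = tagPattern.exec(cueText)) !== null) {
    out += text(cueText.slice(last, m.index));
    last = m.index + m[0].length;
    const body = m[1];
    if (TIMESTAMP.test(body)) {
      out += m[0]; // timestamp tags have no closing tag
      continue;
    }
    const isEnd = body.startsWith("/");
    // The tag name ends at the first "." (class) or whitespace (annotation).
    const name = (isEnd ? body.slice(1) : body).split(/[\s.]/)[0];
    if (!allowed.has(name)) continue; // drop disallowed tags such as script
    if (!isEnd) {
      open.push(name);
      out += m[0];
      continue;
    }
    const idx = open.lastIndexOf(name);
    if (idx === -1) continue; // stray end tag with no matching start tag
    // Close everything opened since the matching start tag, innermost first.
    while (open.length > idx) out += `</${open.pop()}>`;
  }
  out += text(cueText.slice(last));
  while (open.length > 0) out += `</${open.pop()}>`;
  return out;
}
```

Called on the cue from the quoted example, closeCueTags("<c.foo>This is a cue.") returns "<c.foo>This is a cue.</c>". Server-side filtering on upload would still be needed; this only sketches the client-side step.
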
Received on Tuesday, 17 October 2017 19:48:10 UTC
