[webtv] minutes - 23 July 2014

Minutes are available at:
http://www.w3.org/2014/07/23-webtv-minutes.html

also as text below. Thanks to Kaz for the minute-tidying.

Daniel

---
 

                               Web&TV IG

23 Jul 2014

   [2]Agenda

      [2] http://lists.w3.org/Archives/Public/public-web-and-tv/2014Jul/0003.html

   See also: [3]IRC log

      [3] http://www.w3.org/2014/07/23-webtv-irc

Attendees

   Present
          Paul_Higgs, Kazuyuki, Bin_Hu, yosuke, ddavis, CyrilRa,
          kawada

   Regrets
   Chair
          Yosuke

   Scribe
          ddavis

Contents

     * [4]Topics
         1. [5]UC2-1
         2. [6]UC2-2
         3. [7]UC2-3
         4. [8]UC2-4
         5. [9]UC2-5
         6. [10]UC2-6
         7. [11]Joint meeting with the Accessibility TF
     * [12]Summary of Action Items
     __________________________________________________________

   <yosuke> [13]https://www.w3.org/2011/webtv/wiki/New_Ideas

     [13] https://www.w3.org/2011/webtv/wiki/New_Ideas

   yosuke: Let's look through the use cases.
   ... The reviewer of the first use case is me.
   ... This is a simple use case.

   [14]https://www.w3.org/2011/webtv/wiki/New_Ideas

     [14] https://www.w3.org/2011/webtv/wiki/New_Ideas

UC2-1

   <kaz>
   [15]https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-1_Audio_Fi
   ngerprinting

     [15] https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-1_Audio_Fingerprinting

   yosuke: I think there are three entities web developers need to
   specify.
   ... The first is the audio source - a mic, etc.
   ... The second is the fingerprint generation algorithm.
   ... The third is the fingerprint database, e.g. on the web.
   ... These three things are enough to declare a fingerprinting
   service.
   ... In addition, if we have a timeout or duration we can have
   better control.
   ... So, this interface should be an asynchronous interface,
   e.g. JavaScript promises.
   ... Because it will take time to resolve the fingerprint from
   an online service.
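
   (A minimal sketch of the kind of asynchronous, promise-based
   interface described above. The recognizeAudio() function, its
   option names and the service URL are hypothetical illustrations,
   not an existing API.)

      // Hypothetical promise-based fingerprinting call covering the
      // three entities discussed: audio source, algorithm, database.
      interface FingerprintOptions {
        source: MediaStream;   // e.g. from getUserMedia({ audio: true })
        algorithm?: string;    // often fixed per service, so optional
        database: string;      // URL of the fingerprint lookup service
        timeoutMs?: number;    // give up if no match within this time
      }

      interface FingerprintMatch {
        contentId: string;     // identifier of the recognised audio
        confidence: number;
      }

      // Resolves when the (assumed) backend returns a match; rejects
      // on timeout or if nothing is recognised.
      declare function recognizeAudio(
        opts: FingerprintOptions
      ): Promise<FingerprintMatch>;

      async function identifyFromMic(): Promise<void> {
        const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
        const match = await recognizeAudio({
          source: mic,
          database: "https://fingerprint.example.com/lookup", // placeholder
          timeoutMs: 10000,
        });
        console.log("Recognised content:", match.contentId);
      }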

   PaulHiggs: What do you mean by generation?

   yosuke: You need to generate a fingerprint from the audio
   source.

   PaulHiggs: What you're trying to recognise is in the audio
   source.
   ... and then process it. I don't think you're creating anything;
   rather, you're returning an identifier for the audio source.

   yosuke: In many cases, fingerprinting services use only one
   algorithm for their services.
   ... In that case, we don't need to specify which algorithm we
   need.

   PaulHiggs: Are we confusing watermarking and fingerprinting?
   ... Watermarking would take an extra identifier encoded as an
   inaudible tone.

   CyrilRa: On the backend you have to have a hash.
   ... Would you have the hashing done on the server-side or send
   a sample?

   PaulHiggs: I thought fingerprinting was sending an audio
   sample.

   CyrilRa: So then you'd send it to a recognition service.

   PaulHiggs: You could do a local hash but that's not generation,
   that's hashing.
   ... If the second item said hashing that would be fine.

   yosuke: So the front end gets the audio, the back-end service
   generates a fingerprint.
   ... In other services, the front-end generates a hash and sends
   that to the backend.
   ... I'll do some research about existing fingerprinting
   services.
   ... If it's just sending audio clips then we don't need to
   clarify the generation in the use case.
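
   (A rough illustration of the "front-end hashes, back-end matches"
   flow just mentioned. The /match endpoint is an assumed placeholder,
   and a real service would use a perceptual audio hash rather than a
   cryptographic digest; only the data flow is shown.)

      // Hash a recorded audio sample locally, then send only the hash
      // to an assumed matching service for lookup.
      async function hashAndMatch(sample: ArrayBuffer): Promise<string> {
        // SHA-256 stands in here for a real perceptual fingerprint hash.
        const digest = await crypto.subtle.digest("SHA-256", sample);
        const hex = Array.from(new Uint8Array(digest))
          .map((b) => b.toString(16).padStart(2, "0"))
          .join("");

        const response = await fetch("https://fingerprint.example.com/match", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ hash: hex }),
        });
        const result = await response.json();
        return result.contentId; // identifier returned by the assumed backend
      }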

   Bin_Hu: It seems we have two functions - one is a database
   service and one is audio clip matching.
   ... The algorithm to match the audio is in the implementation
   so I think it's a good starting point to do more research about
   what existing services provide.

   kaz: I was wondering if we should think of a model like EME for
   this.
   ... The EME spec has a model describing the mechanism.
   ... Maybe we could use that as a starting point for the
   fingerprinting discussion.

   <kaz> [16]EME

     [16] https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html

   yosuke: You mean we should create a diagram to understand the
   architecture?

   kaz: Right

   yosuke: OK, I'll create a diagram based on my understanding.
   ... I'll create that and Daniel can check it.

   kaz: That's great.

UC2-2

   <kaz>
   [17]https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-2_Audio_Wa
   termarking

     [17] https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-2_Audio_Watermarking

   yosuke: Next use case is audio watermarking
   ... I think watermarking is much simpler than fingerprinting
   because we don't need a backend service to generate the
   watermark.

   ddavis: Do you still need a backend?

   PaulHiggs: No, the data is within the audio stream, as long as
   you know the algorithm.
   ... If someone wanted to, they could encode a link to another
   service.

   CyrilRa: Fingerprinting helps you identify what audio was
   played, watermarking helps you take action.
   ... With watermarking, you have to have audio triggers that can
   be recognised by your client.

   PaulHiggs: You can think of it like old Teletext scanlines that
   used to be in the signal.
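
   (A hedged sketch of the "data is within the audio stream" idea,
   reduced to the simplest possible case: checking for energy at an
   assumed 19 kHz carrier with the Web Audio AnalyserNode. Real
   watermarking algorithms are far more sophisticated and typically
   proprietary; the frequency and threshold below are illustrative.)

      // Watch a media element's audio for energy at an assumed
      // inaudible carrier frequency and fire a callback when present.
      function watchForTone(media: HTMLMediaElement, onDetected: () => void): void {
        const ctx = new AudioContext();
        const source = ctx.createMediaElementSource(media);
        const analyser = ctx.createAnalyser();
        analyser.fftSize = 2048;
        source.connect(analyser);
        analyser.connect(ctx.destination); // keep the audio audible

        const bins = new Float32Array(analyser.frequencyBinCount);
        const targetHz = 19000; // assumed carrier, illustrative only
        const bin = Math.round(targetHz / (ctx.sampleRate / analyser.fftSize));

        const poll = () => {
          analyser.getFloatFrequencyData(bins);
          if (bins[bin] > -50) onDetected(); // -50 dB threshold, arbitrary
          requestAnimationFrame(poll);
        };
        poll();
      }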

   Bin_Hu: For watermarking, the service provider has to encode
   something within the stream.
   ... Are standards bodies such as MPEG planning to create a
   standard for embedding this?

   CyrilRa: There's no standard I can think of.

   Bin_Hu: From a W3C perspective, are we planning to support such
   a format or accept that it's out of scope?
   ... Lots of new codecs are coming out, e.g. within MPEG, so are
   we going to look into the method of multiplexing?
   ... Or will it be left to the implementation so we won't be
   directly involved?

   CyrilRa: That's one of the main issues with watermarking - you
   need to know what you're looking for first.
   ... There's some work being done on the fingerprinting side
   where you have a backend service, and the frontend (player) is
   capturing samples constantly.
   ... You then match between these two at exactly the right time.
   ... That's a way of overcoming the burden of having something
   inaudible embedded, and also of having to know what to look
   for.

UC2-3

   [18]https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-3_Identica
   l_Media_Stream_Synchronization

     [18] https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-3_Identical_Media_Stream_Synchronization

   kaz: Originally this had the HTML task force and Web sockets.
   ... I also added SMIL by the Timed Text WG.
   ... Also SCXML is a new version of SMIL and these can be used
   for synchronization.

   yosuke: As a next step, we need to clarify what the
   requirements are.

   kaz: My understanding is delivering a single stream to multiple
   destinations at the same time.

   yosuke: In many cases, the bandwidth or transport system is
   different so they have different buffers or time lag.
   ... The exact timing could be different, so we need to think
   about how to adjust the synchronisation between different
   devices.

   kaz: You mean how to keep the multiple streams (with identical
   content) synchronised.

   yosuke: Yes. Maybe a player on a "better" device would have to
   wait to achieve synchronization with other slower devices.

   kaz: So maybe we should add that point.
   ... What if the system is using DASH?
   ... It's even more complicated but we should think about that
   as well.

   yosuke: DASH can help with this use case
   ... If DASH is used, the client will have better presentation
   timing
   ... Probably we need a more generic API to synchronize.
   ... The use case is simple but the technology could be
   complicated.
   ... For example, I have a video element and my girlfriend has
   the same content separately. We'd like to match the timing to
   achieve synchronisation.

   kaz: Maybe we could use WebRTC.

   PaulHiggs: I don't know if we need WebRTC. This is not sharing
   streams.
   ... I'm watching something and a friend is watching the same
   thing from the same source, not re-streaming it.

   kaz: So probably without WebRTC.

   CyrilRa: What you'd need is a sync service.

   kaz: yes, what we need is a very generic timeline mechanism.

   yosuke: Could you make a note please on the wiki?

   kaz: Will do.
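
   (A rough sketch of such a generic sync mechanism, assuming a
   hypothetical WebSocket sync service; the service URL and message
   shapes are placeholders. Each client reports its position, and a
   client that is ahead slows down or waits, as discussed above.)

      // Keep a local video element in sync with peers via an assumed
      // WebSocket service that echoes back the slowest peer's position.
      function keepInSync(video: HTMLVideoElement, room: string): void {
        const ws = new WebSocket(`wss://sync.example.com/rooms/${room}`);

        // Report our own playback position twice a second.
        setInterval(() => {
          ws.send(JSON.stringify({ position: video.currentTime }));
        }, 500);

        ws.onmessage = (event) => {
          const { slowestPosition } = JSON.parse(event.data);
          const drift = video.currentTime - slowestPosition;
          if (drift > 2) {
            video.pause();              // far ahead: wait for the others
          } else if (drift > 0.2) {
            video.playbackRate = 0.95;  // slightly ahead: ease off
          } else {
            video.playbackRate = 1;
            if (video.paused) video.play();
          }
        };
      }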

   yosuke: Next use case.

UC2-4

   [19]https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-4_Related_
   Media_Stream_Synchronization

     [19] https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-4_Related_Media_Stream_Synchronization

   kaz: This is similar

   yosuke: We can talk about this next time.

UC2-5

   <kaz>
   [20]https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-5_Triggere
   d_Interactive_Overlay

     [20] https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-5_Triggered_Interactive_Overlay

   ddavis: What key events were you thinking of?

   Bin_Hu: E.g. during the World Cup, if there's a goal, that would
   trigger an event.

   yosuke: The basic way to deliver such metadata is using a text
   track.
   ... What you're talking about is additional information. If we
   implement that, we could use an HTML5 text track.
   ... Is that correct?

   Bin_Hu: A text track may be a fundamental way, but a live event
   is not predictable - it may not be possible to add that in a
   text track.
   ... Maybe additional information could be pulled in
   out-of-band.
   ... Text track could be possible if not a live event.
   ... Or advertising is another situation.

   kaz: Maybe the event can be sent to another channel. The
   destination channel is what the viewer is looking at.
   ... E.g. if we're watching Harry Potter the info could be in
   the text track for some event.
   ... There is a YouTube-like service in Japan called NicoNico.
   ... You can add lots of annotations to a video using timings.
   ... Those kind of annotations could be a trigger to these
   events.

   Bin_Hu: Exactly.
   ... This would be encoded in-band.
   ... The platform implementation would be able to decode this
   and dispatch the events.

   kaz: So the point of the use case is to send such events and
   show an overlay.

   Bin_Hu: Events like start overlay, dismiss overlay should be
   supported.
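
   (A small sketch of the text-track approach discussed above, using
   the standard HTML5 TextTrack/VTTCue interfaces. The JSON payload
   carried in each cue - a "show"/"dismiss" action plus overlay text -
   is just an assumed convention for this example.)

      // Drive an interactive overlay from metadata cues on a text track.
      function wireOverlay(video: HTMLVideoElement, overlay: HTMLElement): void {
        const track = video.addTextTrack("metadata", "events");
        track.mode = "hidden"; // receive cue events without rendering them

        // For a live event these cues would instead be added (in-band or
        // out-of-band) as things happen, e.g. when a goal is scored.
        track.addCue(new VTTCue(120, 150,
          JSON.stringify({ action: "show", text: "GOAL! 1-0" })));
        track.addCue(new VTTCue(150, 151,
          JSON.stringify({ action: "dismiss" })));

        track.oncuechange = () => {
          const cues = track.activeCues;
          if (!cues || cues.length === 0) return;
          const data = JSON.parse((cues[0] as VTTCue).text);
          if (data.action === "show") {
            overlay.textContent = data.text;
            overlay.style.display = "block";
          } else if (data.action === "dismiss") {
            overlay.style.display = "none";
          }
        };
      }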

   yosuke: Next use case: Clean Audio

UC2-6

   [21]https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-6_Clean_Au
   dio

     [21] https://www.w3.org/2011/webtv/wiki/New_Ideas#UC2-6_Clean_Audio

   yosuke: I added a section called Initial Gap Analysis
   ... If clean audio tracks are provided through an HTML5 audio
   element, you can select them through existing interfaces.
   ... If they're provided in-band, you can use the in-progress
   in-band resource tracks specification.
   ... There's another feature where a therapist can adjust the
   acoustic features of audio tracks to assist a disabled user.
   ... You can achieve this using the Web Audio API.
   ... There are examples of audio equalizers already.
   ... So only one point remains - if you use Encrypted Media
   Extensions for your media tracks, it's extremely unlikely the
   audio could be modified.
   ... So I think we should ask the accessibility task force about
   this point. EME can decrease media accessibility.
   ... I thought I should check dependencies with existing web
   standards and we can basically achieve this use case with
   existing standards.
   ... From a practical viewpoint, clean audio is helpful for
   disabled people.
   ... An API to achieve this use case is not so helpful.
   ... Promoting the use case itself or encouraging media service
   providers is a key point to improve accessibility.
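
   (A brief sketch of the existing-standards route described above:
   enabling an in-band clean audio track where the audioTracks
   interface is implemented - support is still limited - and shaping
   the sound with Web Audio nodes. The filter frequencies and gains
   are illustrative only, not clinical recommendations, and the label
   matching is an assumed convention.)

      // Prefer a "clean" in-band audio track if exposed, then reduce
      // background rumble and boost the speech band with Web Audio.
      function enableCleanAudio(video: HTMLVideoElement): void {
        const tracks = (video as any).audioTracks; // not yet universal
        if (tracks) {
          for (let i = 0; i < tracks.length; i++) {
            tracks[i].enabled = tracks[i].label.toLowerCase().includes("clean");
          }
        }

        const ctx = new AudioContext();
        const source = ctx.createMediaElementSource(video);

        const highpass = ctx.createBiquadFilter();
        highpass.type = "highpass";
        highpass.frequency.value = 120;     // cut low-frequency rumble

        const speechBoost = ctx.createBiquadFilter();
        speechBoost.type = "peaking";
        speechBoost.frequency.value = 2500; // roughly the speech-clarity band
        speechBoost.gain.value = 6;         // dB

        const compressor = ctx.createDynamicsCompressor(); // even out loudness

        source.connect(highpass);
        highpass.connect(speechBoost);
        speechBoost.connect(compressor);
        compressor.connect(ctx.destination);
      }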

   ddavis: So it's more about awareness

   yosuke: The EME part is important, but apart from that we can
   achieve this use case with existing standards.
   ... We could make a note about how to do this which can help
   service providers.
   ... We can also ask the EME guys and accessibility task force
   about the potential drawback of using EME.

   ddavis: Sounds like a good idea.

   <kaz> [22]Media Accessibility User Requirements

     [22] http://www.w3.org/WAI/PF/media-accessibility-reqs/

   kaz: The current draft of the Media Accessibility User
   Requirements doesn't include encryption.
   ... We can talk about it with the media accessibility task
   force and HTML media task force.

   yosuke: Kaz or Daniel, could you pass on this feedback?

   kaz: Yes, next Monday is the next media accessibility call.

   yosuke: I'll create a note about how to implement clean audio
   with existing web standards.
   ... After that I'd like to ask the accessibility task force to
   review it.

   ddavis: I'm sure they'd be happy to do that.

Joint meeting with the Accessibility TF

   yosuke: Any further questions or comments?

   kaz: During the previous call I had a task to speak with the
   media accessibility task force about meeting during TPAC in
   October.
   ... They're also interested in a joint session.

   yosuke: We could have a joint session during the TV IG meeting
   or we can join their meeting. Do you have any ideas?

   kaz: My suggestion is to join their meeting.
   ... We already have the TV Control API CG joining our TV IG
   meeting.

   yosuke: What's the next step?

   kaz: If it's OK, let's ask them if we can join their meeting. I
   can suggest this at our next joint call.

   yosuke: If we have an accessibility session, it would not be a
   long session.
   ... They can deliver it to more people if they come to our
   meeting.
   ... We could give them a 10-20 minute session and people could
   learn from them.
   ... Then, if IG people are interested in accessibility, they
   could join their meeting.

   kaz: We could have our meeting with them joining on Monday, and
   then we join them on Tuesday.

   <kaz> [23]TPAC schedule

     [23] http://www.w3.org/2014/11/TPAC/

   yosuke: Any other business?
   ... Thank you - meeting is adjourned.

   <yosuke> Thank you very much for scribing the meeting, Daniel.

   You're welcome.

   Thanks Kaz

Summary of Action Items

   [End of minutes]

Received on Wednesday, 23 July 2014 14:38:48 UTC