Re: Media synchronization - wiki page

Thanks, Scott. This is especially useful since I have had little time to continue my research. I was able to download the complete PDF copy of the second article you linked to by using my university library account. I will take some time to review it and summarize pertinent information.

Be on shortly...

--Steve




Steve Noble
Instructional Designer, Accessibility
Psychometrics & Testing Services

Pearson

502 969 3088
steve.noble@pearson.com



________________________________
From: Scott Hollier <scott@hollier.info>
Sent: Wednesday, September 30, 2020 2:40 AM
To: public-rqtf@w3.org <public-rqtf@w3.org>
Subject: RE: Media synchronization - wiki page




To the RQTF



Following on from our discussion last week, I thought this was particularly interesting and might help to build on Steve's discussion.



Source: https://cdn.ttgtmedia.com/searchUnifiedCommunications/downloads/VideoConf_CH07.pdf



*** BEGIN QUOTE

Understanding Lip Sync Skew

Lip sync is the general term for audio/video synchronization, and literally refers to the fact that visual lip movements of a speaker must match the sound of the spoken words. If the video and audio displayed at the receiving endpoint are not in sync, the misalignment between audio and video is referred to as skew. Without a mechanism to ensure lip sync, audio often plays ahead of video, because the latencies involved in processing and sending video frames are greater than the latencies for audio.

Human Perceptions

User-perceived objection to unsynchronized media streams varies with the amount of skew; for instance, a misalignment of audio and video of less than 20 milliseconds (ms) is considered imperceptible. As the skew approaches 50 ms, some viewers will begin to notice the audio/video mismatch but will be unable to determine whether video is leading or lagging audio. As the skew increases, viewers detect that video and audio are out of sync and can also determine whether video is leading or lagging audio. At this point, the video/audio offset distracts users from the video conference. When the skew approaches one second, the video signal provides no benefit: viewers will ignore the video and focus on the audio.

Human sensitivity to skew differs greatly from person to person. For the same audio/video skew, one person might be able to detect that one stream is clearly leading another stream, whereas another person might not be able to detect any skew at all. A research paper published by the IEEE reveals that most viewers are more sensitive to audio/video misalignment when audio plays before the corresponding video, because hearing the spoken word before seeing the lips move is more "unnatural" to a viewer (Blakowski and Steinmetz 1996). Sensitivity to skew is also determined by the frame rate and resolution: Viewers are more sensitive to skew when watching higher video resolution or higher frame rate.

Report IS-191 issued by the Advanced Television Systems Committee (ATSC) recommends guidelines for maximum skew tolerances for broadcast systems to achieve acceptable quality. The guidelines model the end-to-end path by assuming that a single encoder at the distribution center receives both audio and video streams, digitizes the streams, assigns time stamps, encodes the streams, and then sends the encoded data over a network to a receiver. The guidelines specify that on the sending side, at the input to the encoder, the audio should not lead the video by more than 15 ms and should not lag the video by more than 45 ms. This possible lead or lag might arise from uncertainty in the latencies through the digitizing/capture hardware and occurs before the encoder assigns time stamps to the digitized media streams.

At the receiving side, the receiver plays the audio and video streams according to time stamps assigned by the encoder. But again, there is an uncertainty in the latency of each stream through the playout hardware. The guidelines stipulate that for each stream, this uncertainty should not exceed ±15 ms; this tolerance is an absolute tolerance that applies to each stream. Based on these guidelines, two requirements emerge for acceptable lip sync tolerance:

• Criterion for leading audio: In the worst-case-permitted scenario, audio leads video at the input to the encoder by 15 ms. The receiver plays the audio stream too far ahead by 15 ms while playing the video stream too far behind by 15 ms.

*** END QUOTE
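
As a rough check of the arithmetic in the quoted ATSC guidance, here is a minimal Python sketch. The constants come from the quote; the function name and the idea of simply summing the tolerances are my own assumptions for illustration, not something taken from IS-191 itself. The quote breaks off before the lagging-audio case, so the second figure is only an extrapolation using the same reasoning.

```python
# Tolerances as stated in the quoted summary of ATSC IS-191 (milliseconds).
ENCODER_MAX_AUDIO_LEAD_MS = 15  # audio may lead video by at most this at the encoder input
ENCODER_MAX_AUDIO_LAG_MS = 45   # audio may lag video by at most this at the encoder input
PLAYOUT_UNCERTAINTY_MS = 15     # +/- uncertainty for EACH stream at the receiver


def worst_case_skew_ms(encoder_offset_ms: int,
                       playout_uncertainty_ms: int = PLAYOUT_UNCERTAINTY_MS) -> int:
    """Hypothetical helper: worst-case end-to-end skew for a given encoder-side offset.

    At playout, the two streams can err in opposite directions, so the
    receiver can add up to twice the per-stream uncertainty to whatever
    offset already existed at the encoder input.
    """
    return encoder_offset_ms + 2 * playout_uncertainty_ms


# Leading-audio case described in the quote: 15 + 15 + 15.
worst_audio_lead = worst_case_skew_ms(ENCODER_MAX_AUDIO_LEAD_MS)

# Lagging-audio case, extrapolated by the same logic (not spelled out in the quote).
worst_audio_lag = worst_case_skew_ms(ENCODER_MAX_AUDIO_LAG_MS)

print(f"worst-case audio lead: {worst_audio_lead} ms")
print(f"worst-case audio lag:  {worst_audio_lag} ms")
```

If the tolerances really do stack this way, the leading-audio total sits close to the 50 ms mark at which the quote says some viewers begin to notice a mismatch.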



Also, here's an article that builds on Janina's comments a few weeks ago about language interpretation. It's a PDF and I'm having trouble accessing all its contents, but the abstract looks interesting.

https://www.researchgate.net/publication/257436740_Assessing_the_importance_of_audiovideo_synchronization_for_simultaneous_translation_of_video_sequences



Thanks everyone,



Scott.





Dr Scott Hollier

Digital Access Specialist

Mobile: +61 (0)430 351 909

Web: www.hollier.info



Technology for everyone



Keep up with digital access news by following @scotthollier on Twitter <https://twitter.com/scotthollier> and subscribing to Scott's newsletter <mailto:newsletter@hollier.info?subject=subscribe>.



From: White, Jason J <jjwhite@ets.org>
Sent: Wednesday, 30 September 2020 4:14 AM
To: public-rqtf@w3.org
Subject: Media synchronization - wiki page



Dear colleagues,



I have updated the wiki page to correct markup issues that I unintentionally introduced earlier. Also, as I recall, there are further references to add that were discussed at the meeting last week.
https://www.w3.org/WAI/APA/task-forces/research-questions/wiki/Media_Synchronization_Requirements



We should probably document our observations on that page as well.






Received on Wednesday, 30 September 2020 12:36:09 UTC