- From: <janina@rednote.net>
- Date: Wed, 18 Sep 2019 07:37:38 -0400
- To: "Phillips, Addison" <addison@lab126.com>;, wq@rednote.net
- Cc: "ishida@w3.org" <ishida@w3.org>, "atsushi@w3.org" <atsushi@w3.org>, "xfq@w3.org" <xfq@w3.org>, W3C WAI Accessible Platform Architectures <public-apa@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "Fourney, David" <david.fourney@usask.ca>, Christian Galinski <christian.galinski@chello.at>, "'klaus.miesenberger'" <klaus.miesenberger@jku.at>, "hoeckner@hilfsgemeinschaft.at" <hoeckner@hilfsgemeinschaft.at>, "shadi@w3.org" <shadi@w3.org>, "alejandro.moledo@edf-feph.org" <alejandro.moledo@edf-feph.org>, "lisa.seeman@zoho.com" <lisa.seeman@zoho.com>, "'Kasinskaite, Irmgarda'" <I.Kasinskaite@unesco.org>, "drude@xs4all.nl" <drude@xs4all.nl>, "stevelee@w3.org" <stevelee@w3.org>, "'FERRES Mercè'" <FERRES@iso.org>, Charles LaPierre <charlesl@benetech.org>, "p13n@rednote.net" <p13n@rednote.net>
Colleagues:

Those of you joining our meeting tomorrow on sign language and AAC
designations by telephone should follow the remote participation
teleconference directions at:

https://www.w3.org/WAI/APA/wiki/Meetings/TPAC_2019

Resource: Webex & Teleconference Logistics
Webex Best Practices: https://www.w3.org/2006/tools/wiki/WebExBestPractices

W3C uses IRC to capture minutes and otherwise manage our discussions.

IRC Logistics
IRC: server: irc.w3.org, channel: #APA
IMPORTANT: Upon joining IRC, please identify yourself:
present+ [your_name]
Ex: present+ Janina_Sajka
To raise your hand to speak, enter
q+

Janina Sajka writes:
> Hi, Addison:
>
> Let's then do it Thursday at 5PM.
>
> Does that work for others on this thread?
>
> I'll adjust our APA planning accordingly, and I'll ask our staff contact
> Michael Cooper to set up a one-time Webex we can share with our non-W3C
> colleagues who have raised some of the questions in this thread.
>
> I may not have mentioned it previously, but part of APA's direct
> interest is in identifying AAC languages appropriately. Our
> Personalization TF will have a demo in hand during TPAC of web content
> auto transformed for Bliss symbol users. The technology they're
> prototyping should allow the AAC user to specify their preferred AAC
> lang and get similar results.
>
> Our Personalization TF Co-Facilitators will be in Japan and will want to
> participate in our conversation.
>
> Janina
>
> Phillips, Addison writes:
> > Hi Janina,
> >
> > Thanks for the note.
> >
> > I personally can't do Friday at 5 PM, since my flight to Tokyo is at
> > 4:00 PM. I could do Thursday. I'm also happy to do some other evening
> > or to host a call as part of the I18N teleconference outside of TPAC.
> > Others in the I18N WG might be able to accommodate different days or
> > times.
> >
> > How do you want to resolve this?
> > > > Addison > > > > > -----Original Message----- > > > From: janina@rednote.net [mailto:janina@rednote.net] > > > Sent: Thursday, September 05, 2019 12:39 PM > > > To: Phillips, Addison <addison@lab126.com> > > > Cc: ishida@w3.org; atsushi@w3.org; xfq@w3.org; W3C WAI Accessible > > > Platform Architectures <public-apa@w3.org>; public-i18n-core@w3.org; > > > Fourney, David <david.fourney@usask.ca>; Christian Galinski > > > <christian.galinski@chello.at>; 'klaus.miesenberger' > > > <klaus.miesenberger@jku.at>; hoeckner@hilfsgemeinschaft.at; shadi@w3.org; > > > alejandro.moledo@edf-feph.org; lisa.seeman@zoho.com; 'Kasinskaite, > > > Irmgarda' <I.Kasinskaite@unesco.org>; drude@xs4all.nl; stevelee@w3.org; > > > 'FERRES Mercè' <FERRES@iso.org>; Charles LaPierre <charlesl@benetech.org>; > > > p13n@rednote.net > > > Subject: Re: W3C I18N & Accessibility; ISO 639 language codes > > > > > > Thank you, Addison, for the very prompt and positive response. And thank you > > > for offering to make room on your Monday-Tuesday agenda. However, I will be > > > wearing a different badge representing a different contracted consulting > > > interest Monday-Tuesday, and I hesitate to step away on those days for APA > > > agenda. > > > > > > I believe many of the people cc'd who have raised these questions with us are in > > > Europe. So, if we're to offer them a reasonable opportunity to dial in, the very > > > end of the day is likely the most congenial opportunity, though admittedly > > > horrible for North Americans. > > > > > > What if we took some time at the very end of the week? Say starting at 5PM > > > Friday? I believe that would be 9AM for our friends in Europe. > > > > > > Would that work for I18N? For whoever is still at TPAC? > > > > > > > > > Janina > > > > > > > > > Phillips, Addison writes: > > > > <chair hat on> > > > > I would be happy to meet with our A11Y colleagues during a portion of the > > > I18N meeting Monday/Tuesday. 
I would also be glad to meet with A11Y folks on > > > Thursday or part of Friday (speaking personally) and I'm sure others in our group > > > who are present would also attend. > > > > > > > > <chair hat off> > > > > This thread seems confused? BCP 47 includes support for ISO 639, parts 1, 2, > > > and 3, including a large number of sign languages. Alpha2 subtags are used for > > > languages that have alpha2 codes assigned by ISO 639-1. Languages that have > > > no 639-1 code but which are assigned codes by 639-2/3 use the alpha3 subtag to > > > form language tags. These subtags are widely and thoroughly supported in > > > HTML, CSS and other Web standards. Some other standards (in the structured > > > data space and notably related to DC) have not fully embraced BCP47, which is a > > > source of woe for them. Some of the other considerations, such as length, are > > > dealt with already by BCP47 and in actual fact the use and adoption of Unicode > > > Locale Identifiers have placed truly huge language tags into production. > > > > > > > > I'd be glad to discuss the details here. A more thorough reading and/or in > > > depth response is probably warranted on my part. Please let me know how best > > > to meet. > > > > > > > > Addison > > > > > > > > Addison Phillips > > > > Sr. Principal SDE – I18N (Amazon) > > > > Chair (W3C I18N WG) > > > > Editor (IETF BCP 47) > > > > > > > > Internationalization is not a feature. > > > > It is an architecture. 
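The subtag rule Addison describes above (a two-letter subtag where ISO 639-1 assigns one, otherwise a three-letter subtag from ISO 639-2/3) can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not a BCP 47 parser: `SIMPLE_TAG` and `split_simple_tag` are hypothetical names, only tags shaped like language[-Script][-REGION] are handled, and nothing is checked against the IANA subtag registry.

```python
import re

# Simplified shape: language[-Script][-REGION].
# Real BCP 47 tags may also carry extlang, variant, extension and
# private-use subtags, which this sketch deliberately ignores.
SIMPLE_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_simple_tag(tag):
    """Split a simple language tag and report whether its primary
    language subtag is two letters (alpha2) or three (alpha3)."""
    match = SIMPLE_TAG.match(tag)
    if match is None:
        raise ValueError(f"not a simple language tag: {tag!r}")
    language = match.group("language").lower()
    script = match.group("script")
    region = match.group("region")
    return {
        "language": language,
        "code_kind": "alpha2" if len(language) == 2 else "alpha3",
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
    }


# "ase" and "zbl" are the three-letter codes cited in this thread.
for tag in ("en", "ase", "zbl", "ru-Latn", "en-CA"):
    print(tag, split_simple_tag(tag))
```

The sketch only checks the shape of a tag; whether a given subtag is actually registered is a separate lookup.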
> > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: janina@rednote.net [mailto:janina@rednote.net] > > > > > Sent: Thursday, September 05, 2019 11:13 AM > > > > > To: Phillips, Addison <addison@lab126.com>; ishida@w3.org; > > > > > atsushi@w3.org; xfq@w3.org > > > > > Cc: W3C WAI Accessible Platform Architectures <public-apa@w3.org>; > > > > > public- i18n-core@w3.org; Fourney, David <david.fourney@usask.ca>; > > > > > Christian Galinski <christian.galinski@chello.at>; 'klaus.miesenberger' > > > > > <klaus.miesenberger@jku.at>; hoeckner@hilfsgemeinschaft.at; > > > > > shadi@w3.org; alejandro.moledo@edf-feph.org; lisa.seeman@zoho.com; > > > > > 'Kasinskaite, Irmgarda' <I.Kasinskaite@unesco.org>; drude@xs4all.nl; > > > > > stevelee@w3.org; 'FERRES Mercè' <FERRES@iso.org>; Charles LaPierre > > > > > <charlesl@benetech.org>; p13n@rednote.net > > > > > Subject: W3C I18N & Accessibility; ISO 639 language codes > > > > > > > > > > Dear W3C I18N Colleagues: > > > > > > > > > > With a growing list of cc's accumulated from email exchanged in the > > > > > past few days ... > > > > > > > > > > APA would like an opportunity to explore what actions W3C can and > > > > > should take toward more useful language specification in web content. > > > > > > > > > > Unfortunately, we meet on different days at TPAC. Also, our TPAC > > > > > calendar has become a little crowded. However, we still have some > > > > > remaining open slots where we might have a preliminary conversation, > > > > > should any I18N people still be in Fukuoka and available later in > > > > > the week. APA will have dialin capability, should a conversation during TPAC > > > prove possible: > > > > > > > > > > https://www.w3.org/WAI/APA/wiki/Meetings/TPAC_2019 > > > > > > > > > > Or, it may be simpler to say we should take this topic up post TPAC, > > > > > as a number of the principals with specific knowledge of the > > > > > accessibility issues we want to discuss will NOT be in Japan. 
> > > > >
> > > > > I will defer to your judgement whether a brief introductory
> > > > > conversation in Fukuoka makes sense given limited availability.
> > > > >
> > > > > However we calendar the conversation, I would request, on behalf of
> > > > > APA and particularly our Personalization Task Force, that we look for
> > > > > an opportunity to address the issues detailed in the email thread
> > > > > forwarded here. Our TF is moving forward with technology that should
> > > > > significantly improve the web experience of many people living with
> > > > > various cognitive and learning disabilities. APA also continues to
> > > > > have an interest in uptake of the work we began during the
> > > > > development of HTML 5.0 on media accessibility, which brings in our
> > > > > interest in correctly identifying sign language videos.
> > > > >
> > > > > The above is the simplest agenda description I can come up with at
> > > > > the moment. Below are some interesting details that should help
> > > > > better explain the concern and hope for improved content markup.
> > > > >
> > > > > Looking forward to greeting many of you in person in Fukuoka,
> > > > >
> > > > > Janina
> > > > >
> > > > > Fourney, David writes:
> > > > > > Hi Janina,
> > > > > >
> > > > > > With respect to standardizing lang codes for AAC (i.e.,
> > > > > > augmentative and alternative communication), Christian is better
> > > > > > able to update you on status and timelines.
> > > > > >
> > > > > > I am responding to your question because I wanted to point out
> > > > > > that this proposal (or at least answering the question of whether
> > > > > > 3-letter support is sufficiently in place) solves several issues
> > > > > > relating to AAC.
> > > > > >
> > > > > > For example, the ability to use the ISO 639-3 language code for
> > > > > > Blissymbols (lang="zbl") would be possible / better supported on
> > > > > > the web if we can be certain that both HTML and user agents
> > > > > > support such 3-letter encoding. (There remains, of course, the
> > > > > > issue of getting Blissymbolic script into the ISO script code
> > > > > > and/or Unicode so they are properly displayed.)
> > > > > >
> > > > > > On the issue of scripts, as I said earlier, it would be useful for
> > > > > > users to be able to specify (either as the creator of the content
> > > > > > or its user) any preferred scripts. My example below is Russian
> > > > > > presented in a different script, but the issue also applies to
> > > > > > specific AAC. (E.g., this issue would aid the arguments supporting
> > > > > > the development of standards for Blissymbolic script and adding
> > > > > > appropriate script codes.)
> > > > > >
> > > > > > As for the signed modality (including sign languages, but also
> > > > > > other manual-visual systems), this proposal tries to capture this
> > > > > > AAC technique by using language codes for the natural sign
> > > > > > languages (e.g., lang="ase") and the more generic "sgn" for all
> > > > > > others.
> > > > > >
> > > > > > As I mentioned to Christian, the current implementation of HTML5
> > > > > > may already address some of these issues. As mentioned below,
> > > > > > BCP47 may need to be expanded to support a longer length, which
> > > > > > will impact HTML. Further, BCP47 (and HTML) could eventually
> > > > > > specify a minimum 3-character length.
> > > > > >
> > > > > > Thus the need for user agent support for three-character codes
> > > > > > (status unknown) and the need for W3C to begin transitioning to
> > > > > > the wider use of the 3-character code (i.e., lang="eng" rather
> > > > > > than lang="en") is the main meat of the discussion/proposal.
Updating > > > > > > W3C documentation will impact all examples currently using > > > > > > lang="xx" (e.g., this will impact the supporting documents of WCAG 2.1). > > > > > > > > > > > > I hope this further information helps. Please feel free to contact > > > > > > me if you have any questions or concerns. > > > > > > > > > > > > Thanks, > > > > > > David Fourney > > > > > > > > > > > > > > > > > > On 2019-09-04 3:23 p.m., Christian Galinski wrote: > > > > > > > Hi, Janina, > > > > > > > > > > > > > > Thank you for your positive reply. I am sorry that I cannot > > > > > > > attend the TCAP meeting – unless there is the possibility to > > > > > > > attend through teleconferencing. > > > > > > > > > > > > > > This would also be the ideal way to participate for David > > > > > > > Fourney, who could represent ISO/IEC-JTC 1/SC 35 in this matter. > > > > > > > > > > > > > > Please be so kind as to put the issue of language > > > > > > > identifiers/codes for sign languages explained below on the > > > > > > > agenda of the upcoming TCAP meeting in Japan and discuss how it > > > > > > > could be solved, duly taking into account that language codes > > > > > > > increasingly (for a variety of purposes) have to be combined with other > > > coding schemes. > > > > > > > > > > > > > > Below please find a summary of the discussion concerning (1) alpha-2 vs. > > > > > > > alpha-3 language identifiers for sign languages in video > > > > > > > programs and apps and (2) the combination of codes to further > > > > > > > specify the language used, the regional and other language > > > > > > > variety and the script in which a written file is rendered. > > > > > > > > > > > > > > Technically speaking there may be more complexity or deeper > > > > > > > issues behind the questions raised. There may also be new needs > > > > > > > for coordination. We are looking forward to your comments. 
If > > > > > > > there would be a slot for the discussion of the issues at the > > > > > > > TCAP meeting, David Fourney and me could join by calling in. > > > > > > > > > > > > > > Best regards > > > > > > > > > > > > > > Christian > > > > > > > > > > > > > > *1 Background:* > > > > > > > > > > > > > > The issue at hand is a technical problem that occurs when you > > > > > > > want to assign language identifiers to sign languages, if the > > > > > > > code length of the identifier is limited to alpha-2. However, > > > > > > > ISO 639-1:2002 “Codes for the representation of names of languages – > > > Part 1: > > > > > > > Alpha-2 code” does not provide identifiers for sign languages. > > > > > > > There are estimates of the number of sign languages between more > > > > > > > than 300 and up to 500. About 150 are assigned 3-letter language > > > > > > > identifiers in ISO 639-3 “Codes for the representation of names > > > > > > > of languages – Part 3: Alpha-3 code for comprehensive coverage > > > > > > > of languages”. In this connection, David Fourney also referred > > > > > > > to 2019 as UN's International Year of Indigenous Languages – in > > > > > > > some indigenous language communities sign languages exist. ‘Sign > > > > > > > languages’ differ from ‘signed languages’ insofar as they are > > > > > > > the main language for Deaf and Hard of Hearing persons to > > > > > > > express themselves and largely differ from the language > > > > > > > spoken/written by the language community in which the respective Deaf > > > and Hard of Hearing persons are living. > > > > > > > Compared to ‘sign languages’, ‘signed language’ is a language > > > > > > > modality largely representing the spoken or written form of a > > > > > > > language (e.g. “Signed Exact English”) – thus any language can > > > > > > > be signed in this way which can be identified by adding the > > > > > > > identifier “sgn” to > > > > > the respective language identifier. 
> > > > > > > > > > > > > > *2 Request to W3C/TCAP:* > > > > > > > > > > > > > > The issue was raised at the ISO/IEC-JTC 1/SC 35 meeting in 2018 > > > > > > > in Okayama “User interfaces” where I reported on standardizing > > > > > > > activities of ISO/TC 37 “Language and terminology” referring to > > > > > > > language > > > > > coding. > > > > > > > David Fourney made TC 37 aware of the fact that there is a “deficiency” > > > > > > > in the ISO 639 series when it comes to the coding of sign > > > > > > > languages in video technology. The issue was taken up by two WGs > > > > > > > in ISO/TC 37 working on the fundamental terminology of language > > > > > > > coding and language varieties in a coordinated way. Out of the > > > > > > > discussions emerged the clarification of the above-mentioned > > > > > > > distinction of ‘sign language and ‘signed language’. The WGs > > > > > > > formulated a request to ISO/IEC-JTC 1/SC 35 to clarify the > > > > > > > matter and formulate a recommendation to ISO/IEC-JTC 1/SC 35. At > > > > > > > its last meeting ISO/IEC-JTC 1/SC 35 in Shanghai on 2 August > > > > > > > unanimously approved > > > > > > > > > > > > > > *Resolution 2019-69: Requests that Alpha-3 codes be used and > > > > > > > recommended * > > > > > > > > > > > > > > ISO/IEC JTC1/SC35 > > > > > > > > > > > > > > * recognizes that the application of the 2-letter (alpha-2) code today > > > > > > > is not sufficient for use in programs and apps related to user > > > > > > > interfaces which is particularly detrimental when needed for > > > > > > > identifying individual languages (including individual sign > > > > > > > languages) in user interfaces. 
> > > > > > >   * resolves to recommend the use of 3-letter codes for language
> > > > > > >     identification, wherever they can be applied
> > > > > > >   * requests its chair to contact W3C to ask that they recommend
> > > > > > >     the use of 3-letter identifiers for the names of languages
> > > > > > >     wherever used according to:
> > > > > > >       o ISO 639-2 "Codes for the representation of names of
> > > > > > >         languages - Part 2: Alpha-3 code" and
> > > > > > >       o ISO 639-3 "Codes for the representation of names of
> > > > > > >         languages - Part 3: Alpha-3 code for comprehensive
> > > > > > >         coverage of languages" (which includes additional
> > > > > > >         languages beyond those in ISO 639-2)
> > > > > > >
> > > > > > > These can be recommended either in addition to or in replacement
> > > > > > > for the 2-letter language identifiers as defined in ISO 639-1
> > > > > > > "Codes for the representation of names of languages - Part 1:
> > > > > > > Alpha-2 code".
> > > > > > >
> > > > > > > Here is the issue as explained by David Fourney:
> > > > > > >
> > > > > > > The technical issue lies primarily with the HTML5 <video>
> > > > > > > element and how it supports the HTML lang attribute.
> > > > > > >
> > > > > > > A <video> allows for one or more <source> files (which can be
> > > > > > > audio and/or video tracks) as well as one or more <track> files
> > > > > > > (for subtitles, captions, transcripts, etc.).
> > > > > > >
> > > > > > > As a developer, I want to specify the language of the captions,
> > > > > > > audio, and video so I can meet WCAG's SCs. (WCAG SC 3.1.1 and
> > > > > > > SC 3.1.2 require the specification of the language of content.)
> > > > > > >
> > > > > > > HTML allows the specification of the language of content on
> > > > > > > pretty much any element using HTML5's lang attribute. This means
> > > > > > > that I can specify the language of a caption file, an audio
> > > > > > > track, or (presumably) a video track.
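To make the markup under discussion concrete, here is a minimal Python sketch that assembles a <video> element carrying a three-letter lang value plus <track> children with srclang. The helper name `video_markup` and all file names are hypothetical; lang is a global HTML attribute and srclang is a standard <track> attribute, but whether a given user agent acts on the lang of the video itself is exactly the open question raised in this thread, and the sketch does not answer it.

```python
from html import escape


def video_markup(video_lang, sources, tracks):
    """Build a <video> element whose lang attribute names the language
    of the video content itself (for example a sign language)."""
    lines = [f'<video controls lang="{escape(video_lang)}">']
    for src, mime_type in sources:
        lines.append(f'  <source src="{escape(src)}" type="{escape(mime_type)}">')
    for src, kind, srclang, label in tracks:
        lines.append(
            f'  <track src="{escape(src)}" kind="{escape(kind)}" '
            f'srclang="{escape(srclang)}" label="{escape(label)}">'
        )
    lines.append("</video>")
    return "\n".join(lines)


# Hypothetical files: a video signed in American Sign Language ("ase")
# with English captions and French subtitles.
example = video_markup(
    "ase",
    sources=[("welcome-ase.webm", "video/webm")],
    tracks=[
        ("welcome-en.vtt", "captions", "en", "English"),
        ("welcome-fr.vtt", "subtitles", "fr", "Français"),
    ],
)
print(example)
```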
> > > > > > > > > > > > > > As a user, if my media player supports it, I can select an audio > > > > > > > track in one language (e.g., French) and a caption track in > > > > > > > another (e.g., Norwegian). Theoretically, I can also select a > > > > > > > video track in whatever language I want. > > > > > > > > > > > > > > *That's where the problem lies*. If the audio is embedded in the > > > > > > > video file, then obviously the language of the video is the > > > > > > > language of the audio. This can be any spoken language. > > > > > > > Typically, this is indicated with a two-character code. (This is > > > > > > > also true with audio sources and > > > > > > > captioning.) > > > > > > > > > > > > > > Many languages do NOT have a two-character code. (Many many > > > > > > > languages face this issue. The SIL code tables provides a list > > > > > > > of languages that have one or both types of codes: > > > > > > > https://iso639-3.sil.org/code_tables/639/data) > > > > > > > > > > > > > > But, what if there is no audio in the video? What if the > > > > > > > language of the video is in fact a visual language? What if it is a sign > > > language? > > > > > > > > > > > > > > I should be able to specify the language of the content (e.g., > > > > > > > lang="ase"). Since no sign languages have a two-character code, > > > > > > > this must be a three-character code. > > > > > > > > > > > > > > *3 Combinations of codes:* > > > > > > > > > > > > > > Increasingly a higher degree of granularity is becoming > > > > > > > necessary for identifying not only languages and their regional > > > > > > > varieties, but also other dimensions of language variation, such > > > > > > > as a speaker’s language register or communication anomaly. So > > > > > > > far ISO 639 series deals with combinations of the language > > > > > > > identifiers with the country (or major > > > > > > > subdivision) code acc. to ISO 3166 series and script code acc. > > > > > > > to ISO 15924. 
> > > > > > > > > > > > > > Here again David Fourney’s explanation: > > > > > > > > > > > > > > With respect to the size of the string used to fully specify > > > > > > > languages, I recommend looking at IETF's BCP47 > > > > > https://tools.ietf.org/html/bcp47. > > > > > > > BCP47 is the document HTML seems to rely upon as well. > > > > > > > > > > > > > > W3C could ask the authors of BCP47 to require a new minimum > > > > > > > string size (if it is not already large enough) and recommend > > > > > > > the expected use of separators. I suggest using a larger string > > > > > > > than 12 characters to future proof this decision. > > > > > > > > > > > > > > I recommend W3C provide examples in all of their discussions on > > > > > > > the use of the lang attribute. These examples should all start > > > > > > > with the 3-character code as its base. All examples using the > > > > > > > 2-character code should be updated. > > > > > > > > > > > > > > With respect to scripts, as I recall, HTML relies entirely on > > > > > > > the specification of the character set. Typically, this is now > > > > > > > set to Unicode which is thought to provide the necessary > > > > > > > characters to write in various languages. As I understand the > > > > > > > situation (and I could be wrong), authors do not have the > > > > > > > ability to specify the script of their > > > > > content. > > > > > > > > > > > > > > You are correct that it would be exceedingly useful to be able > > > > > > > to deliberately specify a script (rather than a character set). > > > > > > > I envisioned this when I wrote ISO/IEC 24756:2009 and, to a > > > > > > > lesser extent, ISO/IEC 20071-23. For example, in languages that > > > > > > > have more than one script, it would be useful for users to be > > > > > > > able to specify that they want captions in one preferred script > > > > > > > (e.g., a user might want Russian captions to be presented in Roman script > > > rather than Cyrillic). 
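On the script question above: BCP 47 language tags can carry an optional four-letter ISO 15924 script subtag, so Russian in Latin script can be tagged "ru-Latn" and Russian in Cyrillic "ru-Cyrl". The Python sketch below shows how a player might honour a preferred script when choosing among caption tracks. The function `pick_caption_track` and the track list are hypothetical, and the sketch assumes the script subtag, when present, directly follows the language subtag (tags with extlang subtags are not handled).

```python
def pick_caption_track(track_tags, language, preferred_script):
    """Return the tag of the first track in the wanted language and
    script; fall back to any track in that language, else None."""
    wanted_language = language.lower()
    wanted_script = preferred_script.lower()
    same_language = [
        tag for tag in track_tags
        if tag.lower().split("-")[0] == wanted_language
    ]
    for tag in same_language:
        subtags = tag.split("-")
        # Assumption: a four-letter second subtag is the script subtag.
        if (
            len(subtags) > 1
            and len(subtags[1]) == 4
            and subtags[1].lower() == wanted_script
        ):
            return tag
    return same_language[0] if same_language else None


available = ["ru-Cyrl", "ru-Latn", "en"]
print(pick_caption_track(available, "ru", "Latn"))
```

Falling back to any track in the requested language keeps captions available when the preferred script is missing.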
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Janina Sajka <janina@rednote.net>
> > > > > > > Sent: Thursday, 29 August 2019 18:17
> > > > > > > To: lisa.seeman <lisa.seeman@zoho.com>
> > > > > > > Cc: christian.galinski@chello.at; W3C WAI Accessible Platform
> > > > > > > Architectures <public-apa@w3.org>
> > > > > > > Subject: Re: Language codes and iso639 series
> > > > > > >
> > > > > > > Hi, Lisa, Christian, All:
> > > > > > >
> > > > > > > It's unclear to me what kind of assistance you're seeking, and
> > > > > > > specifically what agendum we might propose for a joint meeting
> > > > > > > during TPAC. Christian, are you planning to attend TPAC? It
> > > > > > > would be helpful, as I don't see us effectively carrying your
> > > > > > > concerns second hand.
> > > > > > >
> > > > > > > I'm aware, at least to a degree, of ISO and IETF standardization
> > > > > > > on language coding to include support for specifying sign
> > > > > > > language usage,[1] but those are not activities directly in
> > > > > > > W3C's I18N remit,[2] though working in coordination with those
> > > > > > > groups clearly is.
> > > > > > >
> > > > > > > Is there a W3C i18n document Christian is looking to affect? Or
> > > > > > > perhaps you're proposing something W3C might publish? APA would
> > > > > > > clearly be interested, but the specifics just aren't in your
> > > > > > > email so I'm left guessing.
> > > > > > >
> > > > > > > We were certainly aware of the multiplicity of sign languages
> > > > > > > when we created our "Media Accessibility User Requirements
> > > > > > > (MAUR)"[3] document during the process of defining HTML 5.0, and
> > > > > > > I believe HTML 5 supports that well for alternative media. But,
> > > > > > > I don't think we've done anything specifically beyond that
> > > > > > > activity in this space.
> > > > > > >
> > > > > > > PS: Any news on standardizing lang codes for AAC?
> > > > > > >
> > > > > > > Please feel free to say more. I'd like to be helpful if I can.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Janina
> > > > > > >
> > > > > > > [1] https://www.evertype.com/standards/iso639/sgn.html
> > > > > > > [2] https://www.w3.org/i18n
> > > > > > > [3] http://www.w3.org/TR/media-accessibility-reqs/
> > > > > > >
> > > > > > > Lisa Seeman writes:
> > > > > > > > Hi Janina
> > > > > > > >
> > > > > > > > Christian, who is cc'd, is working on improving language code
> > > > > > > > support so that it works for sign language and the
> > > > > > > > combinations. For example, English sign language with Canadian
> > > > > > > > dialect.
> > > > > > > >
> > > > > > > > Can we bring this up at TPAC with internationalisation?
> > > > > > > >
> > > > > > > > All the best
> > > > > > > >
> > > > > > > > Lisa Seeman
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Fourney, David <david.fourney@usask.ca>
> > > > > > > Sent: Monday, 19 August 2019 13:20
> > > > > > > To: christian.galinski@chello.at <christian.galinski@chello.at>
> > > > > > > Cc: klaus.miesenberger <klaus.miesenberger@jku.at>
> > > > > > > Subject: Re: Re: HTML etc. and ISO 639-1 2-letter code
> > > > > > >
> > > > > > > Hi Christian,
> > > > > > >
> > > > > > > With respect to the size of the string used to fully specify
> > > > > > > languages, I recommend looking at IETF's BCP47
> > > > > > >
> > > > > > > https://tools.ietf.org/html/bcp47
> > > > > > >
> > > > > > > BCP47 is the document HTML seems to rely upon as well.
> > > > > > > > > > > > > > W3C could ask the authors of BCP47 to require a new minimum > > > > > > > string size (if it is not already large enough) and recommend > > > > > > > the expected use of separators. I suggest using a larger string > > > > > > > than 12 characters to future proof this decision. > > > > > > > > > > > > > > I recommend W3C provide examples in all of their discussions on > > > > > > > the use of the lang attribute. These examples should all start > > > > > > > with the 3-character code as its base. All examples using the > > > > > > > 2-character code should be updated. > > > > > > > > > > > > > > With respect to scripts, as I recall, HTML relies entirely on > > > > > > > the specification of the character set. Typically, this is now > > > > > > > set to Unicode which is thought to provide the necessary > > > > > > > characters to write in various languages. As I understand the > > > > > > > situation (and I could be wrong), authors do not have the > > > > > > > ability to specify the script of their > > > > > content. > > > > > > > > > > > > > > You are correct that it would be exceedingly useful to be able > > > > > > > to deliberately specify a script (rather than a character set). > > > > > > > I envisioned this when I wrote ISO/IEC 24756:2009 and, to a > > > > > > > lesser extent, ISO/IEC 20071-23. For example, in languages that > > > > > > > have more than one script, it would be useful for users to be > > > > > > > able to specify that they want captions in one preferred script > > > > > > > (e.g., a user might want Russian captions to be presented in Roman script > > > rather than Cyrillic). > > > > > > > > > > > > > > Finally, on the choice of codes. I strongly recommend that ISO > > > > > > > and W3C set an explicit recommendation on exactly which code set to > > > use. > > > > > > > The existence of multiple 3-character sets will add to the > > > > > > > problem rather than solve anything. 
ISO will need to unify this > > > > > > > work to help ease the confusion. > > > > > > > > > > > > > > David. > > > > > > > > > > > > > > ________________________________________ > > > > > > > > > > > > > > From: christian.galinski@chello.at > > > > > > > <mailto:christian.galinski@chello.at> > > > > > > > christian.galinski@chello.at > > > > > > > <mailto:christian.galinski@chello.at> > > > > > > > <christian.galinski@chello.at > > > > > > > <mailto:christian.galinski@chello.at>> > > > > > > > > > > > > > > Sent: Monday, August 19, 2019 3:06 AM > > > > > > > > > > > > > > To: Fourney, David > > > > > > > > > > > > > > Cc: klaus.miesenberger > > > > > > > > > > > > > > Subject: Fwd: Re: HTML etc. and ISO 639-1 2-letter code > > > > > > > > > > > > > > Hi David, > > > > > > > > > > > > > > Great thanks to you for this excellent clarification! > > > > > > > > > > > > > > The recommendation to use only the 3-letter code for languages > > > > > > > obviously is only one step in the direction of handling language > > > > > > > codes in various combinations with other codes and thus > > > > > > > indicating language varieties to some extent. At present > > > > > > > language varieties can only be indicated in a rudimentary form. > > > > > > > ISO/TR 21636 "Indication and description of language varieties" > > > > > > > will pave the way for a future much more detailed coding of varieties. 
> > > > > > > > > > > > > > At present we have at our disposal for coding languages > > > > > > > (disregarding the 2-letter code according to ISO 639-1): > > > > > > > > > > > > > > - 3-letter language codes (all small caps) according to ISO > > > > > > > 639-2 and 639-3 > > > > > > > > > > > > > > - 3-letter codes for countries and their subdivisions (all > > > > > > > capitalized) according to ISO 3166-1 and 3166-2 > > > > > > > > > > > > > > (I think we should recommend also here the use of the > > > > > > > 3-letter > > > > > > > code) > > > > > > > > > > > > > > - 4-letter code for scripts /and script variants/ (first letter > > > > > > > capitalized) With 10 digits (12 - if separators are added) we > > > > > > > can thus cope with a lot of variation, under given limitations. > > > > > > > > > > > > > > In the case of sign languages (being true sign languages - i.e. > > > > > > > mother tongues for the Deaf and Hard-of-Hearing) we have at our > > > disposal: > > > > > > > > > > > > > > - 3-letter language code (all small caps) according to ISO 639-3 > > > > > > > > > > > > > > (to be extended towards including further sign languages) > > > > > > > > > > > > > > - 3-letter codes for countries and their subdivisions (all > > > > > > > capitalized) according to ISO 3166-1 and 3166-2 With 6 digits (7 > > > > > > > - if separators are > > > > > > > added) we can thus cope with some variation, under given limitations. > > > > > > > > > > > > > > In the case of the language variety "signed language" (e.g. 
> > > > > > > Signed Exact English) we have at our disposal:
> > > > > > >
> > > > > > > - "sgn" as indicator for "signed language"
> > > > > > >
> > > > > > > - 3-letter language codes (all lowercase) according to ISO
> > > > > > >   639-2 and 639-3
> > > > > > >
> > > > > > > - 3-letter codes for countries and their subdivisions (all
> > > > > > >   uppercase) according to ISO 3166-1 and 3166-2
> > > > > > >
> > > > > > > With 9 characters (11 if separators are added) we can cope with
> > > > > > > a lot of variation, under given limitations. sgn-eng-AUS would
> > > > > > > refer to the Australian variety of Signed Exact English.
> > > > > > >
> > > > > > > Would this mean that we should recommend, under given
> > > > > > > circumstances and as a step in the direction of further
> > > > > > > necessary varieties in the future, a minimum of 12 characters
> > > > > > > (incl. separators) for coding languages (incl. sign languages
> > > > > > > and signed language)? Is this realistic, and if so, is it
> > > > > > > sufficient?
> > > > > > >
> > > > > > > Best regards
> > > > > > >
> > > > > > > Christian
> > > > > > >
> > > > > > > > ---------- Original Message ----------
> > > > > > > > From: "Fourney, David" <david.fourney@usask.ca>
> > > > > > > > To: "christian.galinski@chello.at" <christian.galinski@chello.at>
> > > > > > > > Cc: "klaus.miesenberger" <klaus.miesenberger@jku.at>,
> > > > > > > >     hoeckner <hoeckner@hilfsgemeinschaft.at>
> > > > > > > > Date: 17 August 2019 at 02:00
> > > > > > > > Subject: Re: HTML etc. and ISO 639-1 2-letter code
> > > > > > > >
> > > > > > > > Hi Christian,
> > > > > > > >
> > > > > > > > To answer your specific question: There is no connection to CSS.
> > > > > > > > Cascading Style Sheets are used only for the styling and
> > > > > > > > presentation of content. For example, I would use CSS to
> > > > > > > > indicate the font I want, whether to make the text bold, and
> > > > > > > > where to put it on the screen. CSS is not for specifying
> > > > > > > > languages; that is the role of HTML.
> > > > > > > >
> > > > > > > > The technical issue lies primarily with the HTML5 <video>
> > > > > > > > element and how it supports the HTML lang attribute.
> > > > > > > >
> > > > > > > > A <video> allows for one or more <source> files (which can be
> > > > > > > > audio and/or video tracks) as well as one or more <track> files
> > > > > > > > (for subtitles, captions, transcripts, etc.).
> > > > > > > >
> > > > > > > > As a developer, I want to specify the language of the captions,
> > > > > > > > audio, and video so I can meet WCAG's SCs. (WCAG SC 3.1.1 and
> > > > > > > > SC 3.1.2 require the specification of the language of content.)
> > > > > > > >
> > > > > > > > HTML allows the specification of the language of content on
> > > > > > > > pretty much any element using HTML5's lang attribute. This
> > > > > > > > means that I can specify the language of a caption file, an
> > > > > > > > audio track, or (presumably) a video track.
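
For illustration, a minimal sketch in Python (not part of the original thread) of how the language declarations on a <video> and its child elements could be listed. The sample markup, the file names, and the helper names LangCollector and collect_langs are assumptions made up for this example. Note that <track> declares the language of its text data with the srclang attribute, while lang is the global attribute available on any element.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; file names and the
# choice of languages are assumptions, not taken from the thread.
SAMPLE = """
<video controls lang="ase">
  <source src="movie.mp4" type="video/mp4">
  <track kind="captions" src="captions_en.vtt" srclang="en" label="English">
  <track kind="subtitles" src="subs_no.vtt" srclang="no" label="Norsk">
</video>
"""


class LangCollector(HTMLParser):
    """Collect (element, attribute, value) for lang and srclang attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        for name, value in attrs:
            if name in ("lang", "srclang"):
                self.found.append((tag, name, value))


def collect_langs(html_text):
    """Return every language declaration found in the given markup."""
    parser = LangCollector()
    parser.feed(html_text)
    parser.close()
    return parser.found


if __name__ == "__main__":
    for element, attribute, value in collect_langs(SAMPLE):
        print(f"<{element}> {attribute}={value}")
```

This only reports what the markup declares. Whether a browser, media player, or screen reader acts on a three-letter value such as "ase" is exactly the open question raised in this message.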
> > > > > > > >
> > > > > > > > As a user, if my media player supports it, I can select an
> > > > > > > > audio track in one language (e.g., French) and a caption track
> > > > > > > > in another (e.g., Norwegian). Theoretically, I can also select
> > > > > > > > a video track in whatever language I want.
> > > > > > > >
> > > > > > > > That's where the problem lies. If the audio is embedded in the
> > > > > > > > video file, then obviously the language of the video is the
> > > > > > > > language of the audio. This can be any spoken language.
> > > > > > > > Typically, this is indicated with a two-character code. (This
> > > > > > > > is also true with audio sources and captioning.)
> > > > > > > >
> > > > > > > > Many languages do NOT have a two-character code. (A great many
> > > > > > > > languages face this issue. The SIL code tables provide a list
> > > > > > > > of languages that have one or both types of codes:
> > > > > > > > https://iso639-3.sil.org/code_tables/639/data)
> > > > > > > >
> > > > > > > > (A reminder that 2019 is the UN's International Year of
> > > > > > > > Indigenous Languages.)
> > > > > > > >
> > > > > > > > But what if there is no audio in the video? What if the
> > > > > > > > language of the video is in fact a visual language? What if it
> > > > > > > > is a sign language? I should be able to specify the language of
> > > > > > > > the content (e.g., lang="ase"). Since no sign languages have a
> > > > > > > > two-character code, this must be a three-character code.
> > > > > > > >
> > > > > > > > So the first issue is: "Can I do this?"
> > > > > > > >
> > > > > > > > From reading the HTML 5.2 and some IETF specifications, I MIGHT
> > > > > > > > be able to use a three-character code, but it's not very clear
> > > > > > > > IF I CAN. The specification appears to allow a code of 6 to 8
> > > > > > > > characters in length. This suggests a combination of language
> > > > > > > > and region codes, including hyphens, might fit a
> > > > > > > > three-character language code plus a two-character region
> > > > > > > > code, but not much else.
> > > > > > > >
> > > > > > > > Resources on this include IETF's BCP47
> > > > > > > > https://tools.ietf.org/html/bcp47
> > > > > > > > and the HTML5.2 specification
> > > > > > > > https://www.w3.org/TR/html52/dom.html#the-lang-and-xmllang-attributes
> > > > > > > >
> > > > > > > > The living specification discusses this at
> > > > > > > > https://html.spec.whatwg.org/#the-lang-and-xml:lang-attributes
> > > > > > > >
> > > > > > > > The second issue is: "Will it work?"
> > > > > > > >
> > > > > > > > If a browser sees a three-character language code, will it know
> > > > > > > > what to do with it? What about a media player? What about a
> > > > > > > > screen reader?
> > > > > > > >
> > > > > > > > It's all well and good that I can specify my language, but not
> > > > > > > > if it is not supported (i.e., my user agent won't be able to
> > > > > > > > handle it).
> > > > > > > >
> > > > > > > > Setting aside <video>, I would also point out that this second
> > > > > > > > issue applies to the browser in general. Is there full support
> > > > > > > > for specifying the language of a document using a
> > > > > > > > three-character code (e.g., <html lang="eng"> vs.
> > > > > > > > <html lang="en">)?
> > > > > > > >
> > > > > > > > As I mentioned in Ottawa, what we need the W3C to do is:
> > > > > > > >
> > > > > > > > 1. Confirm how large a language code can be used within the
> > > > > > > > HTML lang attribute and determine if this length is large
> > > > > > > > enough given the three-character codes of ISO 639-2 and the
> > > > > > > > various region and script codes that can be appended to it.
> > > > > > > >
> > > > > > > > 2. Confirm that user agents are required to support long
> > > > > > > > language codes (via the lang attribute), not just the
> > > > > > > > two-character codes that are specified in ISO 639-1. This is
> > > > > > > > important because, if the HTML specifications allow for rather
> > > > > > > > long codes but the user agents do not, then using a long code
> > > > > > > > will not work.
> > > > > > > >
> > > > > > > > To my mind, there should be no issue because it is just a
> > > > > > > > language indication code. Most of the time user agents should
> > > > > > > > just accept any code and do nothing further with it.
> > > > > > > >
> > > > > > > > This issue was the source of my concern only because you
> > > > > > > > mentioned the demand to freeze ISO 639-1 from 20+ years ago.
> > > > > > > > The freeze request suggests to me that user agents only support
> > > > > > > > a small number of codes and intend to act in some way on these
> > > > > > > > codes.
> > > > > > > >
> > > > > > > > 3. Confirm that the lang attribute (of any length) can be used
> > > > > > > > on any HTML element in a meaningful way, including the
> > > > > > > > specification of the language of a video track (e.g.,
> > > > > > > > <source src="movie.mp4" type='video/mp4' lang='ase'>).
> > > > > > > >
> > > > > > > > Ultimately, the need is to determine if user agents support
> > > > > > > > three-character codes so that the specification of a video or a
> > > > > > > > document in a language that only has a three-character code
> > > > > > > > will actually work. I would expect someone at W3C will know
> > > > > > > > what support is (or is not) available.
> > > > > > > >
> > > > > > > > I hope that this explanation helps you. Please let me know if
> > > > > > > > you have any questions.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > David.
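
On the length question in point 1, a minimal sketch in Python (not part of the original thread), assuming only the simplest tag shapes from BCP 47 (RFC 5646): a 2- or 3-letter primary language subtag, an optional 4-letter script subtag, and an optional region subtag of 2 letters or 3 digits. RFC 5646 limits each individual subtag to eight characters but sets no fixed limit on the length of the whole tag. The function name parse_simple_tag is hypothetical, and the pattern deliberately ignores extended language subtags, variants, extensions, and private-use subtags.

```python
import re

# Simplified shape check: language[-script][-region].
# This is NOT the full RFC 5646 grammar; it is an assumption-laden subset.
SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"               # 2- or 3-letter language
    r"(?:-(?P<script>[A-Za-z]{4}))?"             # optional 4-letter script
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"    # optional region
)


def parse_simple_tag(tag):
    """Split a tag into its subtags, or return None if the shape differs."""
    match = SIMPLE_TAG.fullmatch(tag)
    if match is None:
        return None
    # Keep only the subtags that are actually present.
    return {name: value for name, value in match.groupdict().items()
            if value is not None}


if __name__ == "__main__":
    for candidate in ("en", "ase", "ase-US", "sr-Cyrl-RS"):
        print(candidate, len(candidate), parse_simple_tag(candidate))
```

Under these assumptions, both "ase" and the ten-character "sr-Cyrl-RS" have an accepted shape. The sketch checks shape only: whether a subtag is registered (the IANA registry lists en rather than eng for English) and whether a given user agent acts on it are separate questions.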
> > > > > > > >
> > > > > > > > On 2019-08-15 12:21 p.m., christian.galinski@chello.at wrote:
> > > > > > > > > Hi, David,
> > > > > > > > >
> > > > > > > > > How are you doing?
> > > > > > > > >
> > > > > > > > > Further to our recent discussions I would like to ask you to
> > > > > > > > > clarify one more technical question: concerning the use of
> > > > > > > > > the alpha-2 code (acc. to ISO 639-1?) in HTML and/or XHTML
> > > > > > > > > and/or HTML5 which you mentioned is hindering certain
> > > > > > > > > functions/features necessary for the Deaf and hard of
> > > > > > > > > hearing. Is there a connection to CSS?
> > > > > > > > >
> > > > > > > > > Could you please elaborate a bit on this technical question?
> > > > > > > > >
> > > > > > > > > If there is an issue, how should it be presented to W3C/TCAP?
> > > > > > > > >
> > > > > > > > > Best regards
> > > > > > > > >
> > > > > > > > > Christian
> > > > > > > > >
> > > > > > > > > p.s.

--
Janina Sajka
Linux Foundation Fellow
Executive Chair, Accessibility Workgroup: http://a11y.org

The World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI)
Chair, Accessible Platform Architectures http://www.w3.org/wai/apa
Received on Wednesday, 18 September 2019 11:38:48 UTC