- From: Neil Soiffer <soiffer@alum.mit.edu>
- Date: Fri, 3 Feb 2023 21:05:57 -0800
- To: "www-math@w3.org" <www-math@w3.org>
- Message-ID: <CAESRWkBE_ercbfYnEPnBLA6zuo4N6BcJaiB2+Eu4PbG_UufxKA@mail.gmail.com>
Attendees:
- Neil Soiffer
- Louis Maher
- Patrick Ion
- David Farmer
- Steve Noble
- Bert Bos
- Dennis Müller
- Deyan Ginev
- Cary Supalo
- David Carlisle
- Paul Libbrecht
- Sam Dooley
- Murray Sargent
- Bruce Miller

Regrets

Agenda

1. Announcements/Updates/Progress reports

NS: The Opera browser has picked up the Chrome implementation.
SN: Pearson needs line breaking to enhance accessibility. It is not supported in MathML Core.
NS: For things to move forward, either somebody needs to do the implementation, or somebody needs to pay for the implementation.
NS: Line breaking is on the table for Core level 2.
SN: will let his management know these facts.
NS: There is a polyfill that can provide line breaking.
PL: On the email list that discusses media types, there is a professor who is trying to register generic media types for elementary things, such as numbers in operations. PL says there is no need for this.

2. Charter discussion: a walk through with some "live" changes (10 minutes max)

NS: started reviewing "Other Deliverables".
SN, NS, and SD will work on MathML accessibility.
NS: removed search from the deliverables.
NS: discussed the item: A living catalog for annotations beyond those defined in a MathML 4 recommendation. After a discussion, some of this wording was changed. We want an open catalog for adding new intents.
NS: next considered the item: Sample code for conversion of annotated Presentation MathML to an external form such as speech and/or Content MathML. People did not want to overpromise on this issue.
DC: There are some cases where it's better not to put intent on, and just let the default do the right thing. We do not want to commit ourselves to putting intent everywhere on everything.
NS: It seems like providing sample code is setting ourselves up for something that we can't really do. We should just state our expectations for defaults.
DG: This is the most difficult thing we have left to do. He wants to push this off the charter list because this is just setting ourselves up for something that we can't really do. This effort may require thousands of rules.
DG: Let's push it off of the official charter list, and if we can do it, we can always include it later as a bonus.
PL: We are saying that some examples will be delivered. We are not promising completeness.
NS: The goal of writing down defaults is to say that this is the minimum amount of interpretation that AT should be able to process.
DG: Drop it, because if we gave some examples, then people would argue that we did not choose the right examples.
*ACTION* PL: will look to see if we already have an issue on this. If we do not have an issue on this, then PL will open one.

3. Continue intent discussions.

a) 409 internationalization <https://github.com/w3c/mathml/issues/409> -- Can anyone come up with a semi-complete list of known intents?

MUS: shared his list with the group. Most of the things on his list were Unicode values. It was not really a list of intents, because the list precedes intent by many years. The part that was potentially useful for intents was not a complete list of intents. It had around 40 intents.
DF: is opposed to the TeX converter producing an international string.

We started a discussion about intent translations.
DF: If a person is reading a math document in Spanish, he wants to hear the intents in Spanish also. How is this done? There are two ways:
1. The initial creator of the document prepares the document, including intents, in Spanish.
2. The author creates the document in his own language. When the document is translated, the intents are not translated. When the reader accesses the document, the AT reads the standard Spanish text, and the AT translates the intent into Spanish.
NS: did some Google Translate tests. The translator took about 0.1 seconds per word to translate the document.
NS: said that the reader of the document expects his translations to be returned to his screen in around 0.1 seconds. For this reason, an on-the-fly translation of intents is not practical, whereas the document author has all the time he needs to translate the intent values.
DF: said that he recommends looking up the intent from a list of pre-translated intents. We could develop a list of intents, and a local-language list of those intent translations could be provided. The AT could then provide an intent translation in real time using word lookup.
NS: Tell me all the words that need translating. NS is dubious that such a general list could be made.
NS: The document author knows what needs to be translated. The author's list of translated intent words is small.
MUS: Take the common words, put them in English, and have lookup tables to put them into the local language.
MUS: said that this would give you 99% of the words you need.
MUS: developed such a list eight years ago.
NS: We need a list of intent values for translation. The list might be thousands of words long.
MUS: Put up a list, and people can add to it as necessary.
DG: did start on such a list based on Khan Academy math classes.
PL: What we are talking about is a list of intent names and their pronunciations.
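
As a rough illustration of the table lookup DF and MUS describe (a sketch only, not something presented at the meeting): the intent names, the translations, and the function name `speak_intent` below are all hypothetical and do not represent an agreed core list. The sketch assumes an AT holds one small table per language and falls back to reading the intent name itself when an entry is missing, as would happen for intents from an open list.

```python
# Hypothetical per-language tables; the entries are illustrative only.
INTENT_NAMES = {
    "es": {
        "power": "potencia",
        "absolute-value": "valor absoluto",
        "factorial": "factorial",
    },
    "de": {
        "power": "Potenz",
        "absolute-value": "Betrag",
    },
}


def speak_intent(intent: str, lang: str) -> str:
    """Return a localized name for an intent via dictionary lookup.

    `lang` is a language tag such as "es" or "es-MX"; only the primary
    subtag is used to pick a table. Unknown intents fall back to the
    intent name with hyphens read as spaces.
    """
    primary = lang.lower().split("-")[0]
    table = INTENT_NAMES.get(primary, {})
    if intent in table:
        return table[intent]
    # No entry (for example an intent outside the core list): read the name.
    return intent.replace("-", " ")
```

A dictionary lookup like this costs far less than the 0.1 seconds per word NS measured for machine translation, which is the point of the pre-translated list; the open question in the discussion is who supplies and maintains the tables.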
NS: As an AT developer, I need to know a priori what the possible intent values are in order to be able to build this table of translations.
NS: We are just talking about core and not the open list.
DG: Both pieces are important.
MUS: We need an extensible set. The core set should be translated ahead of time. Translating a single word may not use the math context. We should translate the important terms using the math context. This local-language list could be given to the AT. This list should be open.
DG: Suppose I write in Bulgarian with Bulgarian intents, for a Bulgarian audience. Then I would use macros that tell the AT to use my Bulgarian intents and not to translate them.
NS: This reminds me that knowing the language of the document and overriding that via the lang attribute is something we need to consider, so that if the language of the intent differs from the language of the document, it is noted via a lang attribute.
DF: It is our job to provide the long list and the translations.
NS: We cannot translate into many languages, but we do need to develop a list.
DG: So, Facebook has this model with 200 languages. It does a nice job. I tried 5 languages, and each was translated well. The translator did not use tables.
DG: So both the table look-up procedure and the AI approach are important, and you shouldn't predict which method is going to get used, because I think both symbolic and neural methods have interesting practical applications.
From Deyan Ginev to Everyone: https://huggingface.co/facebook/nllb-200-3.3B
PL: thinks that our group could translate intents into six languages.
NS: would like to get the lists of intents before working out the details of translating.
BM: Internationalization means we can deal with documents in multiple languages. It does not mean we can automatically translate between languages. What level are we aiming for?
BM: We have been considering a minimal dictionary for things that need special treatment.
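
To make NS's point about the lang attribute concrete, here is a minimal sketch, under stated assumptions, of how an AT might decide whether an intent needs translating at all. It assumes the language of an intent is taken from the nearest `lang` or `xml:lang` attribute on the element or an ancestor, and otherwise from the document language; the group has not decided this, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def intents_with_language(mathml: str, doc_lang: str):
    """Return a list of (intent, lang) pairs in document order.

    lang is inherited from the nearest lang/xml:lang attribute,
    falling back to doc_lang (an assumption, not agreed behavior).
    """
    results = []

    def walk(elem, inherited):
        lang = elem.get("lang") or elem.get(XML_LANG) or inherited
        if "intent" in elem.attrib:
            results.append((elem.get("intent"), lang))
        for child in elem:
            walk(child, lang)

    walk(ET.fromstring(mathml), doc_lang)
    return results


def needs_translation(intent_lang: str, reader_lang: str) -> bool:
    """True when the primary language subtags differ."""
    return intent_lang.lower().split("-")[0] != reader_lang.lower().split("-")[0]
```

Under this sketch, DG's Bulgarian document read by a Bulgarian audience yields `needs_translation("bg", "bg") == False`, so the AT would speak the author's intents unchanged.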
From Patrick D F Ion to Everyone: It seems to me that the WG can certainly already do 6 or more languages from native speakers, and knows enough close friends
MUS: This would argue for things not in the core list. Elementary things must be on the list.
DC: We are overthinking this. Let us get a list of fifty intent values and set up the translation infrastructure to work with this list. Then we can grow the list as needed.
DC: We have not decided what we want to do with the list of words. We are not making progress.
PL: We have agreement that a list is desirable.
NS: I am afraid that if we come up with such a list, it will be woefully incomplete and therefore not usable.
DF: We need a list so that we can start deciding what we will do with it.
*ACTION* DF: I'll start making a short list so that we can maybe get to the next step, and I'll put all my top 10 on it.
NS: Please gather up all the macros that you're using for semantics and include those.
DF: Yes.
NS: So, I hope we've made some progress in that. At least some people are going to come up with lists. Paul, you have the action item of checking on defaults for intents, whether we have an issue about that, and if not, to create an issue.
PL: I will send you an email on this.
Received on Saturday, 4 February 2023 05:06:20 UTC