- From: Frédéric Wang <fwang@igalia.com>
- Date: Thu, 7 May 2020 21:10:49 +0200
- To: Murray Sargent <murrays@exchange.microsoft.com>, "public-mathml4@w3.org" <public-mathml4@w3.org>
- Message-ID: <ebf33090-8f8b-2820-d52f-53911c9da1ff@igalia.com>
I see. This is done in the human-readable dictionary:
https://mathml-refresh.github.io/mathml-core/#operator-dictionary

For the compact form, I'm not sure whether that would be a good idea, as

1) I don't think browsers use non-ASCII in their code
2) they generally use UTF-16 rather than UTF-8 for internal string storage
3) it makes it harder to distinguish very similar symbols, invisible
   operators, etc.
4) this is not supposed to be read/edited manually anyway

On 07/05/2020 18:53, Murray Sargent wrote:
> Got it. In the RichEdit MathML reader, there’s a large entity name
> table since the SAX parser used doesn’t recognize the math entity
> names. I’ve used UTF-8 for math and braille symbols in RTF and C++ code
> and find it way easier to read the actual symbols rather than hex
> notation, e.g., ∫ instead of U+222B. Works on all platforms 😊
>
> Thanks,
> Murray
>
> From: Frédéric Wang <fwang@igalia.com>
> Sent: Thursday, May 7, 2020 9:23 AM
> To: Murray Sargent <murrays@exchange.microsoft.com>;
>     public-mathml4@w3.org
> Subject: Re: [EXTERNAL] Re: MathML Core Meeting: cancelled for 20/4/20
>
> @Murray: mmh, I'm not sure what you mean by the entity names or &#...;
> all of this is handled by the HTML parser, so it's out of the scope of
> MathML Core.
>
> This is about https://github.com/mathml-refresh/mathml/issues/176
> which deals with the actual character/string, not entities.
>
> On 07/05/2020 18:13, Murray Sargent wrote:
> > One thought is to get rid of the entity names in core; just use
> > UTF-8 (the most readable solution) or &#x…; That would save space
> > and encourage tools to be more modern. In any event, ASCII-only
> > names should have 8-bit characters, not 16 or 32.
> >
> > Thanks,
> > Murray
> >
> > From: Frédéric Wang <fwang@igalia.com>
> > Sent: Thursday, May 7, 2020 1:41 AM
> > To: public-mathml4@w3.org
> > Subject: [EXTERNAL] Re: MathML Core Meeting: cancelled for 20/4/20
> >
> > Hi,
> >
> > Sorry, I forgot to reply to this.
> >
> > This is for the same reason explained last November: patches
> > that significantly increase Android binary size will be very hard
> > to get approved. The current MathML3 operator dictionary size
> > (counting single-char only) is 6935 bytes, and although the total is
> > probably less than Chromium's hard limit for an alarm (16kb), it
> > would be *very* bad to submit a patch with the raw table of ~1500
> > entries and no optimization at all.
> >
> > Writing a script to generate a specific form of the tables is not
> > a big deal; I had already written one last November and was
> > expecting that it would help David and you for the dictionary
> > updates... The point is that the actual optimization and size will
> > depend on the final values of the dictionary. The compression is
> > based on redundancy, but there are still many edge cases that
> > cause a size increase, and it's not clear whether these cases can
> > be removed (not essential), can be merged into an existing
> > category (mistakes or inconsistency), or are really fundamental
> > (deserving special treatment). See GitHub issues #176 #143 #209 for
> > details.
> >
> > Finally, any update has a cost, so I disagree with you on your
> > "update should be easy": not only do we need to regenerate the
> > tables in the spec, but also the WPT tests and associated WOFF font;
> > then we need to synchronize all the tests in WebKit/Chromium/Gecko
> > and get patches reviewed for all these operator dictionary updates,
> > dealing with any test failures/updates. So although in principle I
> > agree with you that we should be ready to update the table in the
> > future, I expect that it is not going to happen so frequently and
> > that any change will preserve the existing categories once decided
> > (i.e. we are not going to add new fancy operators with bizarre
> > spacing and properties) and so won't fundamentally change the
> > optimization logic and implementation.
> >
> > Given that we are about to upstream operator code to Chromium, I
> > think the solution to unblock this will be to perform the
> > synchronization of tests based on
> >
> > https://mathml-refresh.github.io/mathml-core/#operator-dictionary-compact
> >
> > maybe with a few more updates and excluding edge cases; we
> > will then ignore any future update in the short term.
> >
> > Thanks,
> >
> > --
> > Frédéric Wang
> >
> > On 23/04/2020 20:20, Neil Soiffer wrote:
> > > I'm confused as to why updating the values in the operator
> > > dictionary is blocking you. Given that the table is large (and
> > > hence likely to change because bugs will exist in it just as
> > > bugs exist in code) and that Unicode comes out with updates at
> > > least once a year, any implementation needs to be able to
> > > easily update its tables, so working with the current
> > > values should be acceptable. I've cc'd David in case the
> > > problem is that some specific form of the table needs to be
> > > generated that is not currently generated.
> > >
> > > Neil
> > >
> > > On Wed, Apr 22, 2020 at 9:02 AM Frédéric Wang
> > > <fwang@igalia.com> wrote:
> > > > On 20/04/2020 06:37, Neil Soiffer wrote:
> > > > > The pressing issues to discuss are ones that were
> > > > > raised by Frédéric. He said he can't make the meeting
> > > > > on Monday, so there won't be a meeting on Monday.
> > > > > Hopefully he can make it the following week.
> > > > >
> > > > > If someone wants to add items to an agenda for the
> > > > > 27th, please do so by editing the agenda at
> > > > > https://github.com/mathml-refresh/mathml/issues/8#issuecomment-616303323
> > > > >
> > > > > For those planning on attending, enjoy your unplanned
> > > > > extra hour this week :-)
> > > > >
> > > > > Neil
> > > >
> > > > I removed my pending items. I think the highest priority
> > > > right now is having the compact operator dictionary, as
> > > > this is already blocking us from implementing movablelimits,
> > > > for example, or from refactoring Gecko / WebKit or
> > > > updating tests. Updating the CSS proposal (in particular
> > > > with fantasai's suggestion) is also important, but we need
> > > > more time to review that with Rob, so it probably does
> > > > not make sense to discuss it again for now. Finally,
> > > > several layout improvements (operators, etc.) will probably
> > > > depend on future feedback from Google reviewers, so I'm not
> > > > sure the CG can decide for now.
> > > >
> > > > --
> > > > Frédéric Wang
>
> --
> Frédéric Wang

--
Frédéric Wang
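
[Editorial illustration] The thread says the compact dictionary's compression "is based on redundancy" and that edge cases increase its size. A minimal Python sketch of that general idea follows, assuming a hypothetical table that maps single code points to a category label. The code points are real Unicode operators, but the labels "A", "B", "C" and the helper names `compact` and `lookup` are invented here; this is not the script mentioned above, nor the encoding used by the spec's compact dictionary.

```python
# Hypothetical sample: code point -> category label. The labels are
# illustrative only and are NOT values from the MathML Core dictionary.
SAMPLE = {
    0x2190: "A", 0x2191: "A", 0x2192: "A", 0x2193: "A",  # arrows
    0x2211: "B",                                          # n-ary summation
    0x2212: "C", 0x2213: "C",                             # minus, minus-or-plus
    0x222B: "B",                                          # integral
}

def compact(entries):
    """Merge runs of consecutive code points sharing a category into
    (first_code_point, run_length, category) tuples."""
    ranges = []
    for cp in sorted(entries):
        cat = entries[cp]
        if ranges:
            start, length, last_cat = ranges[-1]
            # Extend the previous run only if cp is adjacent and same category.
            if last_cat == cat and start + length == cp:
                ranges[-1] = (start, length + 1, cat)
                continue
        ranges.append((cp, 1, cat))
    return ranges

def lookup(ranges, cp):
    """Return the category for cp, or None if cp is not in any range."""
    for start, length, cat in ranges:
        if start <= cp < start + length:
            return cat
    return None

COMPACT = compact(SAMPLE)
```

In this toy sample, eight entries collapse into four ranges. Isolated entries that cannot join a run (here U+2211 and U+222B) each still cost a full range, which is one way to picture why edge cases work against the size savings.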
Received on Thursday, 7 May 2020 19:11:17 UTC