Re: Comments on MathML Last Call from Martin Duerst on 2003-07-03 (www-math@w3.org from July 2003)

From: Martin Duerst <duerst@w3.org>
Date: Thu, 03 Jul 2003 11:25:45 -0400
To: "Patrick D. F. Ion" <ion@ams.org>
Cc: www-math@w3.org, w3c-i18n-ig@w3.org
Message-Id: <4.2.0.58.J.20030703111558.04ee5ff0@localhost>
Hello Patrick,

I have added the i18n IG (where all your technical discussion is)
to the cc, as I did with the original comments, and these comments
are I18N WG comments as said in
http://lists.w3.org/Archives/Public/www-math/2003May/0047.html.

Again I'm speaking only for myself, but by chance we have a
teleconference in a couple hours, so if you don't get anything
more from me by 3:00 pm today, this is final.

I have looked through your detailed discussion, and I'm
okay with the changes you made.

Regards,    Martin.

At 17:09 03/07/02 -0400, Patrick  D. F. Ion wrote:

>Dear Martin,
>
>Thank you very much for your characteristically careful
>and detailed reading of the MathML 2 Revised Edition draft.
>This is, at last, the reply to your various comments and
>suggestions dealing with Chapter 6, most of which we have
>simply adopted.
>
>Your comments on the front page, and Chapter 3 and 4,
>and Appendices A and B have been addressed separately.
>Your message was:
>http://lists.w3.org/Archives/Public/www-math/2003May/0026.html
>
>I hope you will agree that any change not made is properly one of
>stylistic preference.
>
>Best regards,
>
>         Patrick
>
>-------
>-------
>
>In detail:
>
>---
>6.1 Introduction
>
>  >>>>
><< It did not fall naturally within the purview of developing a specification
><< enabling mathematics to be used with HTML and producing a DTD for the
><< Working group this to worry about more than the entities allowed in the DTD.
>  >>>>
>
><< "this" is weird.
>
>FIXED: A typo from another change.
>
>==========
><<
><< More general, the I18N WG has on various occasions requested that the
><< introduction in chapter 6 be seriously shortened to make sure the document
><< stays a spec rather than a historical account of a spec's history.
>
>The text has been shortened quite a bit.  However, the presence of
>explanatory text, to outline the situation to readers and implementors
>who may not be aware of reasons for what they find strange, was
>intentional.  By and large the MathML spec has been felt to read quite
>well.  The spec, I suggest, has enough dry technical detail that
>few will think it anything else.  A history of how the spec came to
>be would require a lot more room.
>
>====
><< "While a long process of review and adoption by UTC and ISO/IEC of the
><< characters of special interest to mathematics and MathML is now  complete
><< (Unicode Work in Progress) there remains the possibility of some further
><< modification of the lists of characters accepted, of the code assignments
><< for those adopted, or of the names given them by Unicode. To make sure any
><< possible corrections to relevant standards are taken into account, and for
><< the latest character tables and font information, see the W3C Math Working
><< Group home page and the Unicode site."
>
><< This is highly misleading. There is a very strong commitment by
><< Unicode and ISO to not change any codepoints or names. The characters
><< referenced in the spec to our knowledge all have been fully
><< accepted, and any language such as the above suggesting there
><< will be further changes is highly confusing and misleading and
><< should be removed.
>
>As you are no doubt aware, although the invariability of a character
>standard like Unicode is as desirable as ever, there seem to be changes
>afoot again that will affect both mathematical encoding and W3C.
>Unfortunately we do not have a situation in which someone can say,
>as in the story of Daniel
>"O king, establish the decree and sign the writing, that it
>be not changed, according to the laws of the Medes and Persians
>which altereth not."
>Daniel 6:8-9
>
>Thus it seems reasonable to retain a weakened version of the text above.
>The reference to the Unicode Work in progress has been moved and clarified.
>
>=====
><< "The parenthetical notation beginning with U+ is one recommended by Unicode
><< for referring to Unicode characters [see [Unicode], page xxviii]."
>
><< What about this notation is parenthetical? Proposal: remove 'parenthetical'.
>The notation is in parentheses; that's what parenthetical means.
>CHANGED for clarity TO
>'notation, just introduced in parentheses,'
>
><< 'is one' -> 'is the one';
>CHANGED per grammar TO
>'is that'
>
><< also, just introduce the notation, and then
><< avoid to list the same numbers twice, once without and once with U+.
>The redundancy was felt to be of possible assistance to those not
>already well familiar with Unicode notations for character codes.
>====
>
>6.2.1 Unicode Character Data
>
>  >>>>>>>>
><<      * Using characters directly: For example, an A may be entered as 'A'
><< from a keyboard (character U+0041). This option is only available if the
><< character encoding specified for the XML document includes the character.
><< Most commonly used encodings will have 'A' in the ASCII position. In many
><< encodings, characters may need more than one byte. Note that if the
><< document is, for example, encoded in Latin-1 (ISO-8859-1) then only the
><< characters in that encoding are available directly. Unfortunately, most
><< mathematical symbols may not be encoded as character data in this way.
>  >>>>>>>>
>
><< The last sentence is misleading. Using UTF-8 or UTF-16, the two only
><< encodings that all XML processors are required to accept, mathematical
><< symbols can be encoded as character data.
>
>As mentioned by David Carlisle
>http://lists.w3.org/Archives/Public/www-math/2003May/0029.html
>this didn't get across a point we intended.  We can adopt your
>sentence:
>
>LAST SENTENCE CHANGED TO
>Using UTF-8 or UTF-16, the only two encodings that all XML
>processors are required to accept, mathematical symbols can
>be encoded as character data.
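
As a rough illustration of the changed sentence (a sketch in Python, not text from the spec; the sample character U+2211 N-ARY SUMMATION and the helper name fits_latin1 are assumptions made for this example), a mathematical symbol encodes directly in UTF-8 and UTF-16 but has no position in Latin-1:

```python
# Sample character chosen for illustration: U+2211 N-ARY SUMMATION.
summation = "\u2211"

# Both encodings that every XML processor must accept can carry it directly.
utf8_bytes = summation.encode("utf-8")
utf16_bytes = summation.encode("utf-16-be")


def fits_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(utf8_bytes.hex(), utf16_bytes.hex())
print(fits_latin1(summation), fits_latin1("A"))
```

In a Latin-1 document the symbol would therefore have to be written as a character reference or an entity rather than as character data.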
>
>====
>
>  >>>>
><< By using Character references it is always possible to access the entire
><< Unicode range.
>  >>>>
>
><< 'Character references': inconsistent capitalization.
>
>FIXED
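
The quoted sentence about character references can be checked with a small sketch (Python's standard xml.etree.ElementTree is used here only as a convenient XML parser; the one-element document is a made-up example, not taken from the spec). The document is declared as ISO-8859-1 and contains only ASCII bytes, yet the reference still yields a character far outside Latin-1:

```python
import xml.etree.ElementTree as ET

# A Latin-1 document that reaches a Plane 1 character through a
# hexadecimal numeric character reference.
doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><mi>&#x1D504;</mi>'

element = ET.fromstring(doc)
text = element.text

# The parser hands back the referenced character itself.
print(len(text), f"U+{ord(text):04X}")
```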
>
>=====
>
><< 6.2.2 Special Characters Not in Unicode
>
>  >>>>
><< In these cases one may use the mglyph  element for direct access to a glyph
><< from some font and creation of a MathML character corresponding.
>  >>>>
>
><< corresponding to what?
>To the glyph.  The idea is that if you have created a glyph in a font
>for mathematical notation not in Unicode, then there's a way to use
>it like a character.  For instance, if the overcrossing drawn in
>knot theory is used in a discussion of knotting of DNA then it is
>quite possible that it may need to occur in an equation.  <mglyph>
>is what you use to do this.
>
>CHANGED TO
>creation of a MathML substitute for the corresponding character.
>
>=====
>
><< 6.2.3 Mathematical Alphanumeric Symbols Characters.
>
><< there should not be a dot after the title
>FIXED
>====
>
>  >>>>
><<   The new Mathematical Alphanumeric Symbols provided in Unicode 3.1
>  >>>>
>
><< remove 'new'. Otherwise, the spec already looks outdated
><< before it is approved.
>The characters expressly introduced by Unicode to facilitate
>mathematical formulas certainly are new.  They are the solution
>that was found for a specific need in mathematical markup.
>It could conceivably have happened that only a few special math
>variant markers were introduced, but it did not.
>
>CHANGED
>'new' ===> 'additional'
>
>=====
>  >>>>
><< ... in contrast to the Basic Multilingual Plane (BMP) which has been used
><< by Unicode so far.
>  >>>>
>
><< remove temporal context ('so far')
>
>The addition of many new (additional) planes was an important
>change for Unicode.
>
>'which has been used by Unicode so far'
>CHANGED TO
>'which was originally the entire extent of Unicode'
>
>====
>  >>>>
><< For example, a Mathematical Fraktur alphabet is being added, and the code
><< point for Mathematical Fraktur A is U1D504.
>  >>>>
>
><< 'is being added' seems to refer to some activity that is now complete.
><< Please update. Also, U1D504 -> U+1D504
>
>Wrong tense and wrong code FIXED
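
For reference, a short sketch of what the corrected code point looks like in practice (Python, using the standard unicodedata module; the variable names are just for this example). It derives the plane from the code point, formats the U+ notation, and counts the UTF-16 code units needed:

```python
import unicodedata

fraktur_a = chr(0x1D504)

name = unicodedata.name(fraktur_a)      # official Unicode character name
plane = ord(fraktur_a) >> 16            # 0 is the BMP, 1 is Plane 1
notation = f"U+{ord(fraktur_a):04X}"    # the U+ notation discussed above

# Outside the BMP, UTF-16 needs a surrogate pair (two 16-bit units).
utf16_units = len(fraktur_a.encode("utf-16-be")) // 2

print(name, plane, notation, utf16_units)
```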
>
>=====
>
>6.2.4 Non-Marking Characters
>
>  >>>>
><< Some characters, although important for the quality of print or alternative
><< rendering, do not have glyph marks that correspond directly.
>  >>>>
>
><< correspond to what?
>To the character, since it is not supposed to create a mark directly.
>There are such characters in Unicode.
>
>ADDED 'to them'
>====
>
>
>  >>>>
><< The Universal Character Set (UCS) of Unicode and ISO 10646 continues to
><< evolve, see Section 6.4.4 Status of Character Encodings. A small number of
><< the changes recently introduced, relative to those resulting from the needs
><< of Asian languages, are those designed exactly to facilitate the use of
><< Unicode by the 'equation-writing' community. This specification is written
><< on the assumption that the code assignments suggested to ISO/IEC
><< JTC1/SC2/WG2 by the UTC will be confirmed as they are in public draft forms
><< of Unicode 3.1 and 3.2. As before, we can only reiterate that for latest
><< developments on details of character standards as far as they influence
><< mathematical formalism the home page of the W3C Math Working Group should
><< be consulted.
>  >>>>
>
><< This seems to be totally outdated. Also, http://www.w3.org/Math/workingGroup
><< does not provide any relevant info. As text such as this has appeared
><< in older versions, http://www.w3.org/Math/workingGroup should contain
><< such info, even if it is just to say that all characters in question have
><< been approved in the meantime.
>
>This is a piece of text that should have been excised and so we have
>a new shortened version (see below).  The comments about the character
>information that ought to be found on the Math WG page (or IG page
>later perhaps) are quite right.   It is intended to keep such
>information on updates there.
>
>NEW VERSION ==>
>
>The Universal Character Set (UCS) of Unicode and ISO 10646 continues to
>evolve, see Section 6.4.4 Status of Character Encodings.  At the time
>of writing the standard is Unicode 4.0.  As before, we can only reiterate
>that for latest developments on details of character standards as far as
>they influence mathematical formalism the home page of the W3C Math
>Activity should be consulted.
>
>====
><< 6.3 Character Symbol Listings
>
>  >>>>
><<   The characters are listed by name, and sample glyphs provided for all of
><< them. Each character name is accompanied by a code for a character grouping
><< chosen from a list given below, a short verbal description, and a Unicode
><< hex code drawn from ISO 10646, now extended in accordance with the proposal
><< forwarded by the UTC to ISO/IEC WG2 in March 2000.
>  >>>>
>
><< outdated, please fix
>
>UPDATED
>
>====
><< 6.3.1 Special Constants
>
>  >>>>
><< These have been accorded new Unicode values.
>  >>>>
>
><< 'have been accorded': remove temporal reference
>
>'have been accorded new'
>===>
>'now have'
>
>====
>
>6.3.4 Negated Mathematical Characters
>
>  >>>>
><< Note that it is the policy of the W3C and of Unicode that if a single
><< character is already defined for what can be achieved with a combining
><< character, that character must be used instead of the decomposed form. It
><< is also intended that no new single characters representing what can be
><< done by with existing compositions will be introduced.
>  >>>>
>
><< There should be an explicit mention of NFC, with a reference to Unicode
><< Standard Annex #15.
>
>DONE Text and reference added
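
To make the NFC point concrete, here is a minimal sketch (Python's unicodedata module; the not-equal sign is chosen here as a sample negated character and is an assumption of this example, not a quotation from the spec). An equals sign followed by the combining long solidus overlay composes under NFC to the single precomposed character:

```python
import unicodedata

# EQUALS SIGN followed by COMBINING LONG SOLIDUS OVERLAY (U+0338).
decomposed = "=\u0338"

# NFC (Unicode Standard Annex #15) prefers the precomposed form.
composed = unicodedata.normalize("NFC", decomposed)

# NFD goes the other way, back to base character plus combining mark.
round_trip = unicodedata.normalize("NFD", composed)

print(f"U+{ord(composed):04X}", unicodedata.name(composed))
```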
>
>====
>
><< 6.3.6 Mathematical Alphanumeric Symbols
>
>  >>>>
><< Most of these characters come from the additions to Plane 1, however a few
><< characters (such as the double-struck letters N, P, Z, Q, R, C, H
><< representing common number sets) were already present in Unicode 3.0 and
><< retain their original positions.
>  >>>>
>
><< This is again more version/history-oriented than necessary. What about:
>
><< Most of these characters are in Plane 1, except for a few characters (such
><< as the double-struck letters N, P, Z, Q, R, C, H representing common number
><< sets) which are in the BMP.
>
>It doesn't seem essential to excise the history here, and it helps
>some to understand the context.
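
The split between the BMP and Plane 1 described in the quoted paragraph can be seen with a small sketch (Python, standard unicodedata module; the variable names are assumptions of this example). The seven number-set letters sit in the BMP, while the rest of the double-struck alphabet is in Plane 1, with the slot that would have held a duplicate C left unassigned:

```python
import unicodedata

# Double-struck N, P, Z, Q, R, C, H as encoded in the BMP.
number_sets = "\u2115\u2119\u2124\u211A\u211D\u2102\u210D"
planes = {unicodedata.name(ch): ord(ch) >> 16 for ch in number_sets}

# Double-struck A lives with the Plane 1 additions.
plane1_a = "\U0001D538"
plane1_a_plane = ord(plane1_a) >> 16

# U+1D53A, where a Plane 1 double-struck C would fall, is unassigned.
gap_category = unicodedata.category(chr(0x1D53A))

for char_name, plane in planes.items():
    print(plane, char_name)
print(plane1_a_plane, unicodedata.name(plane1_a), gap_category)
```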
>
>=====
>
><< 6.4.2 Fewer Non-marking Characters
>
>  >>>>
><< It used to be in MathML 1.0 that there were a number more non-marking
><< character entities listed.
>  >>>>
>
><< 'It used to be' reads like 'once upon a time'. But this is a spec, not
><< a fairy tale. What about:
>
><< MathML 1.0 contained a small number of non-marking character entities that
><< have been removed in MathML 2.0.
>
>I suppose the suggested revision is more machine-friendly.  I see no
>difficulty with the other, whether or not this spec is a 'fairy tale',
>as some have turned out to be for all their technical writing.
>
>=====
>
><< 6.4.4 Status of Character Encodings
>
><< This section needs serious rework. Some of the (updated) text is speaking
><< about events in 2001. The section simply should say that earlier
><< versions may have mentioned that different characters were in different
><< stages of adoption in the standards process, but that all characters
><< now in the spec are fully standardized. This is the message that
><< we need to get out, and this is the way to avoid that the spec
><< looks silly in a few years.
>
>
>  >>>>
><< Even with the good will shown to the mathematical community by the Unicode
><< process a small number of characters of special interest to some may not
><< yet have been included. The obvious solution of avoiding their use may not
><< satisfy all. For these characters the Unicode mechanism involving Private
><< Use Area codes could be deployed, in spite of all the dangers of confusion
><< and collisions of conventions this brings with it. However, this is the
><< situation for which mglyph was introduced.
>  >>>>
>
><< This paragraph should be rewritten and shortened, if it belongs
><< into this section at all. It is particularly important to us
><< that mention of the private use area is removed. What about:
>
>Why is it so important to the I18N that the existence of the PUA,
>which is a recorded part of the UCS and 10646, be denied?  It is
>part of a real standard.  It is not being recommended here, but
>its existence is worth a warning.
>
>A REVISED VERSION now ends with
>
>"However, this is the situation for which mglyph was introduced.
>The use of <mglyph> is recommended to refer to symbols not included
>in Unicode. "
Received on Thursday, 3 July 2003 11:51:49 UTC