- From: David Carlisle <davidc@nag.co.uk>
- Date: Tue, 23 Oct 2007 10:24:44 +0100
- To: Paul.Bijnens@xplanation.com
- Cc: www-math@w3.org
Thank you for reporting this.

> Seems like an error, because the uppercase G-cedilla, just as
> any other letter for Latvian *is* included.
>
> Really strange that nobody noticed that error, since the file dates
> from 2003

Actually it goes further back than that: ISOLat2 dates from the original SGML specification, ISO 8879, published in 1986. As far as I can see, the ISO spec still (despite a couple of amendments since then) only specifies the upper case G with cedilla. (Before replying I checked various online sources, and Goldfarb's printed SGML handbook.)

This is the same problem as the mathematical characters, many of which have no entity name, or inappropriate entity names.

It is tempting to "fix" this by just adding the entity, but many systems use catalogs or similar mechanisms, which means that a reference to a latin2 entity file is intercepted and a local (or internalised) file is used rather than the specified DTD file being read. In theory the exact form of the FPI in the public identifier would uniquely identify a new variant and systems would detect that, but theory and real life don't always agree.

If a document uses &gcedil; and the DTD that is used does not define this, then it is not well formed and most likely the entire document will be rejected with a fatal parse error. This is a rather bad default behaviour, so it's really safer in most cases to use the character directly, or to use a numeric reference, &#x123;, which will always work, or to use &gcedil; but define it in the document's local subset, so you don't rely on an updated latin2 file.

The entity files are (slowly) being updated for MathML3 (and hopefully a synchronised update of ISO 9573), but my current thoughts are to keep all the names in all the entity sets exactly as before (for the reasons given above) and just correct the assignments to Unicode where appropriate.
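
For illustration, the three cases mentioned above (a numeric reference, &gcedil; declared in the document's local subset, and &gcedil; with no declaration at all) can be tried out with a small script. This is only a sketch: it assumes Python's standard xml.etree.ElementTree parser, and the element name "word", the sample text and the helper names are made up for the example, not taken from any DTD or entity file.

```python
import xml.etree.ElementTree as ET

# Numeric character reference: needs no entity declaration at all.
NUMERIC = "<word>&#x123;imene</word>"

# Named entity declared in the document's own internal (local) subset,
# so nothing depends on which latin2 entity file a system actually reads.
LOCAL_SUBSET = (
    "<!DOCTYPE word [\n"
    '  <!ENTITY gcedil "&#x123;">\n'
    "]>\n"
    "<word>&gcedil;imene</word>"
)

# Named entity with no declaration anywhere: not well formed.
UNDECLARED = "<word>&gcedil;imene</word>"


def parse_text(source):
    """Parse the document and return the text of its root element."""
    return ET.fromstring(source).text


def is_well_formed(source):
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for label, doc in [
        ("numeric reference", NUMERIC),
        ("local subset", LOCAL_SUBSET),
        ("undeclared entity", UNDECLARED),
    ]:
        if is_well_formed(doc):
            print(label, "->", parse_text(doc))
        else:
            print(label, "-> rejected by the parser")
```

The first two documents parse to the same text, while the third is rejected outright, which is the fatal-error behaviour described above.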
There is, though, the possibility of having a new "extra" entity set of previously unnamed characters, and gcedil should clearly (now you have pointed it out) be a candidate for inclusion in any such set.

I'm sorry this is a rather unsatisfactory answer. What makes editing the entity sets challenging is that the existing names/definitions are so hard to justify by any rational principle, but it is also hard to justify any change to a set of names that have been in use for over 20 years, if there is any chance of any change breaking any (unknown!) existing uses....

David
Received on Tuesday, 23 October 2007 09:25:00 UTC