- From: David Carlisle <davidc@nag.co.uk>
- Date: Sat, 5 Apr 2008 21:40:55 +0100
- To: ian@hixie.ch
- Cc: public-html@w3.org, www-math@w3.org
> Is there some permanent URI from which the absolute latest unicode.xml
> file from which that document is created can always be found? (I don't
> mind if it's not in w3.org space, in case you edit the document elsewhere
> where the document would be more up to date, it's just a reliably up to
> date URI that I'm looking for.)

The one linked to from the document at

http://www.w3.org/2003/entities/2007xml/

is (always) the latest version of the file (and of the stylesheets used to extract information from other sources into that file, and from that file into the document's tables and DTD entity declarations). Like most (all?) of the W3C site it is under CVS control, and the public view on the web always reflects the HEAD of the CVS repository. I assume you have (or could have) W3C CVS access, in which case you could check it out from $CVSROOT/WWW/2003/entities/2007xml/ and see the CVS logs etc. if you wish, but the URI above is always the latest version.

> I notice that there are entities even for many ASCII characters such as
> the colon ":", is that really necessary?

Entities aren't really necessary :-) Really the only reason for maintaining these entity definitions is to help transition legacy documents. Colon is defined in ISONUM; that is, it's been around since the original ISO 8879 standard defining SGML in 1986, if not before. I get requests to drop certain characters and (more often) requests to add some new names, but doing either causes interoperability problems, as fragments often move around without keeping their correct DTD reference. The set of names (especially the ISO ones) is inconsistent, and sometimes downright cryptic, but the names are what they are and I don't plan on changing any of them, just trying to keep a sane mapping from that set of names to Unicode.
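As a small illustration of the name-to-Unicode mapping described above, the sketch below looks up a legacy entity name and reports its code point. It assumes Python's standard `html.entities.html5` table as a stand-in; that table is a later HTML5 snapshot of these names, not unicode.xml itself, and the helper `lookup` is a hypothetical name used only for this example.

```python
from html import unescape
from html.entities import html5


def lookup(name: str) -> str:
    """Return the character(s) that a named entity expands to.

    html5 keys carry a trailing ';' (e.g. 'colon;'); it is used here only
    as an illustrative stand-in for the mapping kept in unicode.xml.
    """
    return html5[name + ";"]


colon = lookup("colon")
# Even a plain ASCII character such as ':' has a legacy entity name.
print(f"&colon; maps to U+{ord(colon):04X}")
print(unescape("a&colon;b"))
```

Dropping a name such as `colon` from the table would leave a fragment like `a&colon;b` unresolved wherever it had been copied, which is the interoperability concern raised above.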

That said, I think it's vitally important that the names are consistent with (X)HTML (many of the names would be mapped differently if it were not for HTML compatibility), but it's less important that _all_ of them go into a larger HTML+MathML set. The "combined" file that I referred to lists all the entities defined in that set, including some of the ISO entities that are not included in MathML (ISOGRK1 and ISOGRK3, for textual Greek rather than mathematical Greek usage, for example, are not in the MathML DTD); see <group name="mathml"> in unicode.xml for a list of the entity sets MathML currently uses.

David

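For readers who want to list the entity sets named in that group, here is a minimal sketch. Only the `<group name="mathml">` element comes from the message above; the wrapper element, the `<set>` children, their `name` attribute, and the set names in the sample are assumptions made for illustration, so check them against the real unicode.xml before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only <group name="mathml"> is taken from the message.
# The <entitygroups> wrapper, the <set> children, and the set names are assumed.
SAMPLE = """
<entitygroups>
  <group name="mathml">
    <set name="example-set-a"/>
    <set name="example-set-b"/>
  </group>
  <group name="other">
    <set name="example-set-c"/>
  </group>
</entitygroups>
"""


def sets_in_group(xml_text: str, group: str) -> list[str]:
    """Return the names of the <set> children of the named <group>."""
    root = ET.fromstring(xml_text)
    path = f".//group[@name='{group}']/set"
    return [s.get("name") for s in root.findall(path)]


print(sets_in_group(SAMPLE, "mathml"))
```

To run this against a real file, the same function could be given the text of a local copy of unicode.xml, assuming the element layout matches the sample.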
Received on Saturday, 5 April 2008 20:41:30 UTC