
Re: several messages about New Vocabularies in text/html

From: David Carlisle <davidc@nag.co.uk>
Date: Sun, 6 Apr 2008 11:02:56 +0100
Message-Id: <200804061002.m36A2u9Z017835@edinburgh.nag.co.uk>
To: ian@hixie.ch
Cc: public-html@w3.org, www-math@w3.org

> What is the intent of the STIX set? 
Historical. Most of the 2000 or so math characters added in Unicode 3.1
and 3.2 were added as the result of a submission from the STIX group of
publishers, as stage 1 of the (decade-long, as it turned out) effort to
develop the freely available STIX fonts for scientific publishing. They
have just finished a beta of the fonts, and they may finally appear this
year. I should probably remove those now (or at least do a pass to align
them with the final STIX tables when they come out). But in any case
they were not so much an entity set defined from here as information
about which characters the STIX group thought they needed to support the
various entities.


> Apart from the STIX entities, is the idea that any modern specification, 
> for maximum compatibility, would just support all the entities defined, or 
> are there other sets that should be avoided?

Well, the original idea (Sebastian's) was just to have something that
documented the Unicode characters for jadetex. I adopted (and adapted)
the file once I started to maintain the MathML DTD, to have a mechanised
way of keeping the DTD and the HTML documentation (chapter 6 of MathML 1
and 2) in sync.

In MathML3, we decided to pull the tables out of chapter 6 and make them
a separate specification, to make it easier for people to refer to them
if they wish. Personally I think life would be simpler if everyone used
compatible definitions (not necessarily the same set of names in all
vocabularies, but at least everyone agreeing which Unicode slot a given
name maps to). But I'm not putting pressure on people to use them.
However, I'll add any information to the file that is needed to make it
easier for people to use them.
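
To illustrate what "compatible definitions" means in practice, here is a
minimal sketch of a consistency check: vocabularies may define different
sets of names, but any name they share should map to the same Unicode
slot. The function name find_conflicts and the two toy tables (vocab_a,
vocab_b) are hypothetical and are not taken from unicode.xml.

```python
def find_conflicts(tables):
    """Return {name: {vocabulary: code point}} for every entity name
    that two or more vocabularies map to different code points."""
    seen = {}
    for vocab, table in tables.items():
        for name, code_point in table.items():
            seen.setdefault(name, {})[vocab] = code_point
    # A name is a conflict only if its definitions disagree;
    # a name defined in just one vocabulary is fine.
    return {
        name: by_vocab
        for name, by_vocab in seen.items()
        if len(set(by_vocab.values())) > 1
    }


# Hypothetical toy tables, for illustration only.
tables = {
    "vocab_a": {"alpha": 0x03B1, "lang": 0x2329},
    "vocab_b": {"alpha": 0x03B1, "lang": 0x27E8, "rang": 0x27E9},
}

conflicts = find_conflicts(tables)
for name, by_vocab in sorted(conflicts.items()):
    print(name, {v: f"U+{cp:04X}" for v, cp in by_vocab.items()})
```

In this toy data only "lang" is reported: "alpha" agrees in both tables,
and "rang" is defined in just one of them, which is the kind of
difference that does not cause trouble.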

> HTML5 has the following entities (supported for legacy reasons) which are 
> not in the unicode.xml file:

I'll add them. I suspect I'd better make a new "html5" set then. It
would be good if XHTML (1 and 2) agreed, but that's for the respective
working groups to sort out; I'll try to document whatever the situation
is in unicode.xml. Also, I took an action at a recent Math WG teleconf to
get a new public working draft out in TR space this month, so I need
to look at this soon...

> In HTML5, we changed the mappings for &lang; and &rang;. The legacy 
> mappings of these two characters are to characters that are defined as 
> canonically equivalent to CJK wide characters. The new mappings are:

Almost all the changes between MathML2 and MathML3 relate to no longer
using CJK punctuation and instead using the new(ish) math brackets that
were added to Unicode. (I see rang was in fact Unicode 3.2, but several
more were added in 4 and 5.) I almost certainly left these two for HTML
compatibility, so if you've changed I'd guess I could as well (although
it's not my sole decision). But again it would be helpful if you could
coordinate with the XHTML group.
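
The canonical-equivalence problem with the legacy lang/rang mappings can
be seen with a short sketch using only the Python standard library. It
assumes that Python's html.entities tables are an acceptable stand-in:
name2codepoint holds the HTML 4.01 definitions and html5 holds the HTML5
table as later published, which is not necessarily what any draft said
at the time of this thread.

```python
import unicodedata
from html.entities import html5, name2codepoint

results = {}
for name in ("lang", "rang"):
    legacy = chr(name2codepoint[name])  # HTML 4.01 mapping
    later = html5[name + ";"]           # HTML5 table shipped with Python
    # Unicode normalization replaces the legacy character with its
    # canonical equivalent, so it does not survive a round trip.
    legacy_nfc = unicodedata.normalize("NFC", legacy)
    later_nfc = unicodedata.normalize("NFC", later)
    results[name] = (legacy, legacy_nfc, later, later_nfc)
    print(
        f"{name}: legacy U+{ord(legacy):04X}"
        f" normalizes to U+{ord(legacy_nfc):04X}"
        f" ({unicodedata.name(legacy_nfc)});"
        f" later mapping U+{ord(later):04X}"
        f" ({unicodedata.name(later)})"
        f" is {'stable' if later == later_nfc else 'changed'} under NFC"
    )
```

The legacy characters turn into CJK angle brackets under normalization,
whereas the math brackets are left alone, which is the practical reason
for preferring the latter in a math context.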

Or we all (Math, XHTML and HTML) agree in advance that we'll use whatever
that file says (so we all agree) and then put the file under joint editorial
control. I don't really mind about the politics; I'd just rather the
file helped people have consistent definitions than put work into making
it accurately document differences.

> The real question will be whether we actually want to 
> support all these new entities in text/html.

As I've mentioned, there are a few more there than are supported even in
MathML (mainly Greek), but there are others you could drop (e.g. the
MathML aliases set, which are mainly TeX-like names for characters that
already have ISO names), or you could just decide on a character by
character basis. I think it's more important that the names you do have
have the same Unicode definition; having all the names might simplify
some things, but it's not a vital issue (to me at least).


Received on Sunday, 6 April 2008 10:03:33 UTC