- From: Chris Lilley <chris@w3.org>
- Date: Wed, 9 Jan 2002 00:20:28 +0100
- To: Norman Walsh <Norman.Walsh@Sun.COM>
- CC: www-tag@w3.org
On Monday, January 07, 2002, 11:08:19 PM, Norman wrote:

NW> James and I had several conversations on this topic at XML 2001. I've
NW> been persuaded that the way out of this dilemma is to accept that
NW> attributes should never be used for human readable text (as opposed to
NW> tokens or other simple datatypes).

I agree with that, and it was one of the design influences on SVG: human-readable text is element content, not attribute values (and conversely, non-human-readable numerical stuff is stuffed into attributes).

NW> This limitation can be justified, I think, by the argument that
NW> attribute values can't contain markup and I18N considerations always
NW> require markup in human readable text (e.g, for BIDI or Rubi (Ruby?)).

I agree in general (although BIDI does not, in fact, require markup unless there is more than one level of nesting), but yes, Ruby is one example and xml:lang is another.

NW> If you restrict human readable text to element content, then you can
NW> use empty elements to replace named character entities.
NW>
NW>   <para>An &eacute; has an accent.</para>
NW>
NW> could be written:
NW>
NW>   <para>An <e:eacute/> has an accent.</para>

Thus making string matching on the DOM element nodes more complex, since instead of being stuff that the parser just deals with, you now have to understand whatever namespace the prefix e: is bound to.

NW> or even
NW>
NW>   <para>An <e:char name="latin small letter e with acute"/> has an accent.</para>
NW>
NW> Assuming some in-scope namespace declaration for "e:" of course :-)

One could remove both the requirement for an in-scope ns declaration and my gripe about knowing special namespaces with

  <para>An <xml:char unicode="00E9"/> has an accent.</para>

or

  <para>An <xml:char name="LATIN SMALL LETTER E WITH ACUTE"/> has an accent.</para>

But, with the greater prevalence of Unicode-enabled editors and OSes, it's not clear that single characters are the primary use case for entities, going forward. That holds even for plane-1-using applications like MathML, given that Windows XP and MacOS X both support non-BMP characters nowadays.

Plus (arguing against my own proposal) the former example has no benefit as against

  <para>An é has an accent.</para>

and the latter, besides being verbose beyond belief, is highly susceptible to mistyping and bloats processors with string tables.

Who was it that said 'the current position is unsupportable. Except when compared to the alternatives' or words to that effect ...

-- 
 Chris  mailto:chris@w3.org
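
To illustrate the string-matching point raised in the message, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is only an illustration, not anything proposed in the thread: the namespace URI bound to "e:", the CHAR_ELEMENTS lookup table, and the plain_text helper are all hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the "e:" prefix; the message does not name one.
E_NS = "http://example.org/ns/chars"

# Hypothetical table mapping character-element local names to characters.
CHAR_ELEMENTS = {"eacute": "\u00e9"}


def plain_text(elem):
    """Flatten an element to a string, expanding known character elements."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag.startswith("{" + E_NS + "}"):
            # The application must know this namespace to recover the character.
            local = child.tag.split("}", 1)[1]
            parts.append(CHAR_ELEMENTS[local])
        else:
            parts.append(plain_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


literal = ET.fromstring("<para>An \u00e9 has an accent.</para>")
as_element = ET.fromstring(
    '<para xmlns:e="%s">An <e:eacute/> has an accent.</para>' % E_NS
)

# Literal character: the parser delivers one text run, so a substring test works.
print("\u00e9 has" in literal.text)              # True
# Element form: the text is split around the child, so the same test fails.
print("\u00e9 has" in (as_element.text or ""))   # False
# Matching only works again after namespace-aware flattening.
print(plain_text(as_element) == literal.text)    # True
```

With a character or entity reference the parser resolves everything before the application sees it; with the element form, every consumer that wants to match text has to carry something like plain_text and its lookup table.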
Received on Tuesday, 8 January 2002 18:20:33 UTC