Re: Characters and SDATA entities



>Using elements or entities for characters is completely possible of
>course, but I think represents a poor analysis of the problem, and
>XML (& SGML) would be better without it. The WWW is one big system,
>and so there should be no need to use SDATA entities, as such.

Well, actually, I agree with you. I find using entities *less*
objectionable than elements, though not entirely pleasant either. If
we treated characters as characters and had a more flexible mechanism
for defining the repertoire, we'd be a lot better off.

>2) The real answer to a lot of the character problems is to
>have a WWW glyph service mechanism: if someone has a strange 

Agreed!! You and I have talked about this for more than a year now. I
think we also need to have a global character repository to get the
best effect though.

>So I agree with Gavin that what is needed is a mechanism where
>characters/glyphs can be marked up richly. However, I don't think
>anything less than markup to URLs of some glyph server is good
>enough.

Yes. We do need a resolution mechanism. I have a design that does not
require that you point to a *single* glyph repository, or a single
glyph image though. The resolution is performed via various channels,
the most fundamental of which is the character *name*.
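
To make the idea of resolving through several channels a bit more
concrete, here is a minimal sketch in Python. It is not the design
referred to above, only an illustration under assumptions: the channel
functions, the "local:" and "char:" result prefixes, and the name
MY-GAIJI-0001 are all hypothetical. The one real repository used is
Python's built-in Unicode name table.

```python
import unicodedata
from typing import Callable, Optional, Sequence

# A "channel" is any function that maps a character name to a glyph
# reference (here just a string), or None if it cannot resolve the name.
Channel = Callable[[str], Optional[str]]


def local_table(name: str) -> Optional[str]:
    # Hypothetical private repertoire; the name and path are made up.
    table = {"MY-GAIJI-0001": "local:glyphs/0001.bmp"}
    return table.get(name)


def unicode_names(name: str) -> Optional[str]:
    # Uses the standard library's Unicode name table as one repository.
    try:
        return "char:" + unicodedata.lookup(name)
    except KeyError:
        return None


def resolve(name: str, channels: Sequence[Channel]) -> Optional[str]:
    # Try each channel in order; the first one that answers wins.
    # No single repository has to know every character.
    for channel in channels:
        result = channel(name)
        if result is not None:
            return result
    return None


channels = [local_table, unicode_names]
print(resolve("LATIN SMALL LETTER A", channels))
print(resolve("MY-GAIJI-0001", channels))
print(resolve("NO SUCH NAME", channels))
```

The point of keying everything on the name is that channels can be
added or reordered without touching the documents that use the names.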

>I guess I am falling into the same trap here as I accuse Gavin of
>in the area of the dreaded encoding PIs: he thinks it is better
>in the long run to create a new file format (new except for Macintoshes
>of course!) rather than extend the meaning of PI to do double duty
>as an in-data notation of encoding or charset. I think it is
>better to create a new type of markup to handle characters/glyphs
>rather than make entities or elements do double duty :-)

For this, you'll only hear violent agreement from me! The SDATA entity
mechanism is not elegant, though slightly better than a hack (and
typed entities are useful in the general sense anyway).

>As an alternative: at the SGML Asia Pacific conference, some Japanese WG8
>people presented a scheme for encoding a bitmap glyph into the
>minimum literal of an entity so that the document can cart around
>the glyph itself: they could get 32x32 bitmaps this way. A size
>of 48x48 would be better for this: if XML's imagined SGML declaration
>has a longer LITLEN than this, it is a feasible method for
>worst-case (e.g. to screen) imaging.

Interesting, but impractical for many important cases.
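
For a rough sense of the sizes involved, the arithmetic can be sketched
in Python. The encoding actually presented is not described here, so
this assumes the simplest one: a 1-bit-per-pixel bitmap packed row by
row into hex digits, four pixels per character. The function names are
hypothetical.

```python
def literal_length(side: int) -> int:
    # Characters needed for a side x side 1-bit bitmap in hex
    # (4 pixels per hex digit, rounded up). Assumed encoding.
    bits = side * side
    return (bits + 3) // 4


def encode_bitmap(rows):
    # Flatten the rows, pad to a multiple of 4 bits, and emit one
    # uppercase hex digit per group of 4 pixels.
    bits = [bit for row in rows for bit in row]
    bits += [0] * (-len(bits) % 4)
    digits = []
    for i in range(0, len(bits), 4):
        a, b, c, d = bits[i:i + 4]
        digits.append(format(8 * a + 4 * b + 2 * c + d, "X"))
    return "".join(digits)


for side in (32, 48):
    print(side, "x", side, "needs", literal_length(side), "characters")

print(encode_bitmap([[1, 0, 1, 0], [1, 1, 1, 1]]))
```

Under this assumption a 32x32 glyph needs 256 characters and a 48x48
glyph needs 576, so whether either fits depends on the LITLEN in force.
A denser encoding than hex would bring both figures down.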

>4) On the subject of names and numbers, most characters in the
>world don't have useful roman names. The idea of using the
>SDATA entity default value as an ALT attribute for characters
>seems based on the idea that the characters in that field would
>be displayable on my browser anyway in characters and a language I'd
>understand. 

I'd like to have them solely for lookup purposes. I have been dying to
get a global glyph repository together for a long time now.