Representing non ISO 8859/1 Characters in Contemporary HTML (was

The subject of this thread was "Some comments on Amaya 1.3a", but I 
don't think that is terribly useful for people scanning an archive, 
so the main purpose of this article is to get the thread 
cross-referenced under a more precise subject.

Basically there are two schools of thought on representing foreign 
characters and mathematical symbols in HTML:

- the "HTML is a page description language" school (which unfortunately 
  is probably dominant on the commercialised web), which says you 
  select a font and use the character code that produces the right 
  visual effect in that font (typically Symbol);

- the "HTML is a logical markup language" school (the position taken by 
  myself, the Amaya developers, Lynx, and, in an equivocal fashion,
  IE4), which says that the HTML should be meaningful without regard
  to the fonts used.  (Amaya does reflect the page layout school on 
  some other issues.)
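
As a rough illustration of the difference between the two schools, here 
is a minimal Python sketch.  The helper names and the sample text (the 
letters alpha, beta, gamma) are my own assumptions for the example, not 
anything taken from the pages discussed in this thread.

```python
# Hypothetical helpers contrasting the two approaches; the names and
# the sample text are assumptions made for illustration only.

def font_based(latin_keys: str) -> str:
    """Page-description style: send Latin letters and rely on the
    Symbol font to draw them as Greek glyphs."""
    return '<font face="Symbol">' + latin_keys + '</font>'


def logical(text: str) -> str:
    """Logical-markup style: keep the real characters and write any
    non-ASCII ones as numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


greek = "\u03b1\u03b2\u03b3"   # alpha, beta, gamma

print(font_based("abg"))       # still means "abg" without the Symbol font
print(logical(greek))          # &#945;&#946;&#947;
```

The first form only carries the letters a, b and g, so a browser that 
cannot or will not switch fonts has nothing Greek to show.  The second 
form identifies the characters themselves and leaves the choice of font 
to the browser.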

A complicating factor is the large number of browsers and authoring 
tools that don't permit the latter approach at present (Front Page 
lags IE4 in this area).

A followup point:

> > (Lynx supports multiple display character sets, including the DOS 
> > greek code page and UTF8.  I would expect both of these to display 
> > the greek characters correctly in the greek1 version.) 

I confirmed this with an ISO 8859/7 code page on Linux and Lynx 
configured for that page.  greek.html displayed as roman characters; 
greek1.html displayed correctly. 
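
The following Python sketch shows why that result is what one would 
expect.  It is not Lynx's code, and since the contents of greek.html 
and greek1.html are not reproduced in this article, the two input 
strings are assumptions: one written as numeric character references, 
the other as plain Latin letters intended for the Symbol font.  The 
function name render_for_display is likewise hypothetical.

```python
import html


def render_for_display(markup: str, display_charset: str) -> bytes:
    """Resolve character references, then encode the text for the
    terminal's character set (unrepresentable characters become '?')."""
    text = html.unescape(markup)
    return text.encode(display_charset, "replace")


refs = "&#945;&#946;&#947;"    # assumed logical-markup input
latin = "abg"                  # assumed Symbol-font input, font tag ignored

print(render_for_display(refs, "iso8859_7"))   # b'\xe1\xe2\xe3'
print(render_for_display(refs, "utf-8"))       # b'\xce\xb1\xce\xb2\xce\xb3'
print(render_for_display(latin, "iso8859_7"))  # b'abg'
```

In ISO 8859/7 the bytes E1, E2, E3 are alpha, beta, gamma, so the 
references come out as Greek on a terminal using that code page, while 
the Latin letters stay Latin whatever the display character set is.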

-- 
David Woolley - Office: David Woolley <djw@bts.co.uk>
BTS             Home: <david@djwhome.demon.co.uk>
Wallington      TQ 2887 6421
England         51  21' 44" N,  00  09' 01" W (WGS 84)

Received on Thursday, 12 November 1998 08:27:15 UTC