Re: HTML - i18n / NCR & charsets

Larry Masinter (masinter@parc.xerox.com)
Wed, 27 Nov 1996 03:57:36 PST


To: MISHA.WOLF@reuters.com
CC: www-html@w3.org, www-international@w3.org, unicode@unicode.org
In-reply-to: <6814391926111996/A24242/RE6/11ABD4E70E00*@MHS> (message from
Subject: Re: HTML - i18n / NCR & charsets
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <96Nov27.045736pdt."135"@palimpsest.parc.xerox.com>
Date: Wed, 27 Nov 1996 03:57:36 PST

# Some possible solutions are proposed:

If people have old documents with illegal numeric character references
in them, they should fix those documents so that they no longer use
illegal numeric character references. All of your proposed solutions
are inferior to that.
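
For anyone who wants to do that cleanup mechanically, here is a rough
sketch in Python. It assumes the illegal references are the ones in
the range 128-159 and that the author meant them as Windows code page
1252 characters; the function name is made up for illustration, and
values that code page leaves undefined are left untouched.

```python
import re

# Matches decimal numeric character references such as &#147;
NCR = re.compile(r"&#(\d+);")

def fix_windows_ncrs(html):
    """Rewrite NCRs in 128-159 as the Unicode code points they stand
    for in Windows code page 1252 (an assumption about the author's
    intent, not something the document itself declares)."""
    def replace(match):
        value = int(match.group(1))
        if not 128 <= value <= 159:
            return match.group(0)      # already legal, keep as-is
        try:
            char = bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            return match.group(0)      # undefined in cp1252, keep as-is
        return "&#%d;" % ord(char)
    return NCR.sub(replace, html)

print(fix_windows_ncrs("&#147;quoted&#148; &#233;"))
```

References outside the 128-159 range, such as &#233;, pass through
unchanged.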

If people want to remain 'bugward compatible' with old browsers, they
can use content negotiation based on the user agent string:

- send the old documents (with "illegal numeric character references")
  to browsers that can't handle I18N but can display the Windows code
  page (this is a relatively small subset of deployed browsers, e.g.,
  old versions of MSIE, on Windows only)

- send standards-compliant stuff (text/html; charset=iso-8859-1) to
  anyone else (including Netscape, Alis, any browser on a platform
  that doesn't support the Windows code page, robots & search engines,
  Unix, newer versions of Windows).
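
The selection described in the two items above might look roughly
like this on the server side. This is only a sketch: the user agent
pattern, the file names, and the function name are all hypothetical,
and a real server would need a more careful list of affected browser
versions.

```python
import re

# Hypothetical pattern for "old MSIE on Windows"; the version cutoff
# is an assumption made for illustration.
OLD_MSIE_ON_WINDOWS = re.compile(r"MSIE [1-3]\.\d+.*Windows", re.IGNORECASE)

def choose_variant(user_agent):
    """Return (file name, response headers) for a User-Agent string."""
    # Tell caches that the response depends on the User-Agent header.
    headers = {"Vary": "User-Agent"}
    if user_agent and OLD_MSIE_ON_WINDOWS.search(user_agent):
        # Legacy copy that still uses the Windows code page references.
        headers["Content-Type"] = "text/html"
        return "legacy.html", headers
    # Everyone else gets the standards-compliant copy.
    headers["Content-Type"] = "text/html; charset=iso-8859-1"
    return "standard.html", headers

name, headers = choose_variant("Mozilla/2.0 (compatible; MSIE 3.0; Windows 95)")
print(name, headers["Content-Type"])
```

Any browser that does not match the pattern, or that sends no user
agent string at all, falls through to the standards-compliant copy.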

I often use a Mac-based web browser & find the Windows code page
characters really annoying since they don't display properly anyway.

# If HTML-i18n is to go ahead, without any signaling about the NCRs
# target charset change (i.e., in Unicode rather than the announced
# charset), then IMHO this should at least be mentioned in the draft
# as it breaks existing, widespread practice, which prior to this
# i18n draft could not be signalled as 'wrong' or 'illegal'.