
Re: XHTML/XML comment (case sensitivity and I18N, redux)

From: Dan Oscarsson <Dan.Oscarsson@trab.se>
Date: Tue, 1 Feb 2000 09:03:45 +0100 (MET)
Message-Id: <200002010803.JAA23187@valinor.malmo.trab.se>
To: aray@q2.net, connolly@w3.org
Cc: www-html@w3.org

>Just to reiterate the answers that have been given, but citing
>sources... the reason that XHTML mandates lower-case is
>(a) XHTML documents conform to the XML spec:
>"A Reformulation of HTML 4 in XML 1.0"
>	-- http://www.w3.org/TR/2000/REC-xhtml1-20000126
>	aka http://www.w3.org/TR/xhtml1/
>(b) XML 1.0 is case sensitive. Why?
>	"This is a summary of points made:
>	...
>	Internationalization experts are unanimously against folding.
>	..."
>	-- XML WG decisions of Wed. Sep. 10
>	http://www.w3.org/XML/9712-reports.html#ID40

Well, I looked at the above document. Some of the reasons give food for
thought, and some are not right:

 ... XML will rarely be created by hand and when it happens,
     it'll be by experts.
 This implies that XHTML will not be created by hand either. So XHTML can
 never replace HTML. HTML is so widespread because it is so
 easy to edit by hand, even for non-experts (who like case insensitivity
 and think numbers should not be written with quotes around them).
 ... Internationalization experts are unanimously against folding.
 This is wrong. I know some who are for case insensitivity.
 Saying "only lower case" has big problems too (I will come back to that).
 ... Pleasant experiences with case-sensitive programming languages.
 I have had unpleasant experiences with case-sensitive programming
 languages. I have seen many bugs created just because they use
 case-sensitive variables. Case-insensitive reserved words and
 variables are less confusing and less error-prone.
XHTML says that lower case should be used. But I can see no definition of
lower case! As has been pointed out before, there are some problems
with converting things to lower case. In Turkish, an I is lower-cased to
dotless i. You cannot define lower case without taking all the problems
of case insensitivity with you, because you still have to define
the mapping from upper case to lower case. And if you have done that,
you have the rules needed to make things case insensitive.
You can avoid this by not saying "case sensitive" and saying
"code point sensitive" instead. Though you may get great problems when
instructing somebody over the phone, as case is normally not
conveyed when speaking.
But as XHTML will only be generated by programs, it might not matter
(if the programmers get it right).
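
To make the Turkish point concrete, here is a small Python sketch. The
helper names and the two-entry mapping table are my own assumptions for
illustration; nothing here comes from the XHTML or XML specs. It only
shows that "lower case" depends on which mapping you pick, while a code
point comparison needs no mapping at all.

```python
# Hypothetical helpers for illustration only; not taken from any spec.

# Assumed Turkish rules for the two I letters:
# I (U+0049) -> dotless i (U+0131), dotted capital I (U+0130) -> i (U+0069)
TURKISH_LOWER = {"I": "\u0131", "\u0130": "i"}


def lower_default(s: str) -> str:
    """Lower-case with Python's locale-independent str.lower()."""
    return s.lower()


def lower_turkish(s: str) -> str:
    """Lower-case using the Turkish mapping for I, default rules otherwise."""
    return "".join(TURKISH_LOWER.get(ch, ch.lower()) for ch in s)


def same_name_code_points(a: str, b: str) -> bool:
    """Code point sensitive comparison: no case mapping is applied."""
    return a == b


if __name__ == "__main__":
    print(lower_default("TITLE"))   # ends up with a dotted i
    print(lower_turkish("TITLE"))   # ends up with a dotless i
    print(same_name_code_points("TITLE", "title"))
```

With these assumptions the two lower-casing functions disagree on
"TITLE", so a rule that just says "lower case" is ambiguous until the
mapping is written down.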

Received on Tuesday, 1 February 2000 03:04:53 UTC
