Re: Constraining xml:lang - Catch 22 from Henry S. Thompson on 2003-12-19 (xmlschema-dev@w3.org from December 2003)

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: Fri, 19 Dec 2003 08:53:25 +0000
To: "Jack Lindsey" <tuquenukem@hotmail.com>
Cc: xmlschema-dev@w3.org
Message-ID: <f5bllp92np6.fsf@erasmus.inf.ed.ac.uk>

"Jack Lindsey" <tuquenukem@hotmail.com> writes:

> I have been asked why I don't validate xml:lang in my schemas to
> restrict its values to something like en-GB, en-US, en-CA, fr-CA,
> es-MX.  But if I roll my own or derive something by restriction in my
> namespace, it may be my:lang but it is no longer xml:lang.  But for so
> many internationalization/localization reasons everyone wants the
> instant recognition and standardization of xml:lang.  Apart from using
> Schematron or depending on application program logic, are there any
> other useful strategies in this area?

Just what are you trying to rule out?  The regulatory situation
regarding language codes, as spelled out in RFC 3066 [1], is
sufficiently complicated that the lexical space constraint given in
the schema REC (as amended) for the xs:language type [2] is really the
strictest it's practical to enforce.  With IANA having registered
e.g. cel-gaulish and de-AT-1901 as legal tags, there's really not much
we can do here.
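
A minimal sketch of what that lexical check amounts to, and of how an
application-level allow-list could sit on top of it. The pattern below is
assumed to be the amended xs:language pattern from the erratum [2]; the
names LANGUAGE_PATTERN, ALLOWED_TAGS, is_xs_language and is_allowed_tag
are hypothetical and chosen only for illustration.

```python
import re

# Assumption: the amended lexical space for xs:language, i.e. a primary
# subtag of 1-8 letters followed by any number of 1-8 character
# alphanumeric subtags.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")

# Hypothetical application-level allow-list, stored in lower case because
# RFC 3066 tags are compared case-insensitively.
ALLOWED_TAGS = {"en-gb", "en-us", "en-ca", "fr-ca", "es-mx"}


def is_xs_language(tag: str) -> bool:
    """Return True if the whole tag matches the assumed xs:language pattern."""
    return LANGUAGE_PATTERN.fullmatch(tag) is not None


def is_allowed_tag(tag: str) -> bool:
    """Return True if the tag is lexically valid and on the allow-list."""
    return is_xs_language(tag) and tag.lower() in ALLOWED_TAGS


if __name__ == "__main__":
    for tag in ("en-GB", "cel-gaulish", "de-AT-1901", "en_GB"):
        print(tag, is_xs_language(tag), is_allowed_tag(tag))
```

Registered tags such as cel-gaulish and de-AT-1901 pass the lexical
check but not the allow-list, which is why a narrower restriction ends
up in application logic (or Schematron) rather than in xs:language
itself.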

> I love this, from "http://www.w3.org/2001/xml.xsd"

<snip/>

The comment will be removed when the above-cited erratum is formally
incorporated in the 2nd edition of the Schema REC.

ht

[1] http://www.ietf.org/rfc/rfc3066.txt
[2] http://www.w3.org/2001/05/xmlschema-errata#e2-25
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Friday, 19 December 2003 03:54:24 UTC