- From: Kohsuke KAWAGUCHI <kohsuke.kawaguchi@eng.sun.com>
- Date: Thu, 22 Mar 2001 12:27:27 -0800
- To: www-xml-schema-comments@w3.org
Dear XML Schema WG members,

Regarding the lexical/value space of the "language" type, the spec states that

> The lexical space of language is the set of all strings that are valid
> language identifiers as defined in the language identification section
> of [XML 1.0 (Second Edition)].

But those production rules were removed in the 2nd edition. So please refer to the 1st edition or copy the production rules into the spec.

Also, RFC 1766 explicitly states that language identifiers are "to be treated as case insensitive", whereas the current definition in XML Schema treats "en-US" and "EN-US" as different values. If the intention of the Schema WG is to treat them as case sensitive, please state so explicitly, because this is inconsistent with RFC 1766. If this is not the intention of the Schema WG, please make it a primitive type.

Furthermore, the pattern facet that I found in the normative definition:

> "([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]+)(-[a-zA-Z]+)*

does not correctly model the BNF specified in RFC 1766.

> Language-Tag = Primary-tag *( "-" Subtag )
> Primary-tag = 1*8ALPHA
> Subtag = 1*8ALPHA

As you can see, every "subtag" must be no longer than 8 characters, but this constraint has not been implemented in the normative definition. (Maybe this is a problem in the XML 1.0 spec.) To incorporate this constraint, the pattern facet should be changed to

> "([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*

Also, the semantics of the "length" facet for the "language" type are not defined at all. This should be defined in section 4.3.1.

regards,
----------------------
K.Kawaguchi
E-Mail: k-kawa@bigfoot.com
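
The difference between the two patterns quoted above can be checked with a short Python sketch. It assumes that Python's re.fullmatch is an adequate stand-in for the pattern facet, which matches the whole string; the helper names accepts and same_language and the sample tags are illustrative only, and the leading quotation mark of each quoted pattern is dropped.

```python
import re

# Pattern quoted from the normative definition (leading quote mark dropped).
CURRENT = r"([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]+)(-[a-zA-Z]+)*"
# Pattern proposed in this message, with subtags limited to 8 letters.
PROPOSED = r"([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*"


def accepts(pattern: str, tag: str) -> bool:
    """Return True if the whole tag matches the pattern (assumed facet behaviour)."""
    return re.fullmatch(pattern, tag) is not None


def same_language(a: str, b: str) -> bool:
    """Compare two tags case-insensitively, as RFC 1766 asks."""
    return a.lower() == b.lower()


# Hypothetical sample tags: one ordinary tag and two with a 9-letter subtag.
for tag in ["en-US", "en-abcdefghi", "x-abcdefghi"]:
    print(tag, accepts(CURRENT, tag), accepts(PROPOSED, tag))

# A plain string comparison distinguishes the two spellings,
# while a case-insensitive comparison does not.
print("en-US" == "EN-US", same_language("en-US", "EN-US"))
```

Only the tags with a 9-letter subtag are treated differently by the two patterns, which is the constraint the proposed change is meant to add.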
Received on Thursday, 22 March 2001 15:27:16 UTC