Re: Declaring xml:lang in a RelaxNG schema

From: John Cowan <cowan@mercury.ccil.org>
Date: Tue, 24 May 2011 13:44:36 -0400
To: Chris Lilley <chris@w3.org>
Cc: www-international@w3.org, www-font@w3.org
Message-ID: <20110524174436.GA17763@mercury.ccil.org>

Chris Lilley scripsit:

> however, XML Schema datatypes seems to over-constrain the language
> type such that it can only be an RFC3066-compatible string:

Actually, it's under-constrained lexically, but that doesn't matter
much. yotz-foo-q-rubinv2 isn't a valid BCP 47 language tag, but it is
part of the lexical space of xs:language.  In practice, nobody validates
(in the sense of BCP 47) language tags: they either pass them along
to some other component, or check for a definite list of tags they
understand and treat everything else as "language unknown".  Validation
would involve examining the Language Subtag Registry to see whether the
subtags are registered, which is normally more trouble than it's worth.
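To illustrate the point about the lexical space, here is a minimal Python sketch. It assumes the pattern that XML Schema Part 2 gives for xs:language, and it uses a made-up KNOWN_TAGS set to stand in for the "definite list of tags" approach; the function names are hypothetical, and nothing here consults the Language Subtag Registry.

```python
import re

# Lexical pattern for xs:language as given in XML Schema Part 2
# (assumed here; check the edition your validator implements).
XS_LANGUAGE = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")

# Hypothetical list of tags this application understands (lowercased).
KNOWN_TAGS = {"en", "en-us", "fr", "de"}


def in_xs_language_lexical_space(tag: str) -> bool:
    """True if the whole string matches the xs:language pattern."""
    return XS_LANGUAGE.fullmatch(tag) is not None


def classify(tag: str) -> str:
    """Return the known tag (lowercased), else 'language unknown'."""
    lowered = tag.lower()
    return lowered if lowered in KNOWN_TAGS else "language unknown"


if __name__ == "__main__":
    for tag in ("en-US", "yotz-foo-q-rubinv2", "en.us"):
        print(tag, in_xs_language_lexical_space(tag), classify(tag))
```

The nonsense tag passes the lexical check but is still treated as "language unknown", while "en.us" fails the lexical check outright.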

> Should I just drop the datatype (so it takes the default value of
> 'token')? Is there a better definition of a BCP47 language type that I
> should reference instead? Should the schema datatype be deprecated, or
> is it planned to update it?

No on all counts.  Use it as-is.

> (In terms of WOFF last call, this relates to WOFF Issue 7 "xsd:NCName
> is too constraining for lang attributes" and WOFF Issue 9
> "I18n-ISSUE-2: Why not using xml:lang? ")

NCName underconstrains even more than language: it allows things like
"en.us" that can't be language tags.

If you would, please pass this back to the WOFFers: they should use the
"lookup" algorithm defined in BCP 47 (= RFC 4647 at present) rather than
defining their own algorithm.  The advantage of "lookup" is that if the
user's locale is en-US and there is no en-US data, but there is en data,
it will be found.

Then the only addition necessary (per RFC 4647 section 3.2) would be to say
that if there's no match, use the first available text.
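A rough Python sketch of that suggestion follows: RFC 4647 "lookup" (progressive truncation of each language range from the end) with "first available text" as the fallback. The function name and the dict-of-texts shape are assumptions made for illustration, not anything taken from the WOFF draft.

```python
from typing import Dict, List, Optional


def lookup(priority_list: List[str], available: Dict[str, str]) -> Optional[str]:
    """Return the text whose tag best matches the user's language ranges.

    `available` maps language tags to text, in document order.
    Falls back to the first available text when nothing matches.
    """
    # Language tag matching is case-insensitive.
    by_lower = {tag.lower(): tag for tag in available}

    for language_range in priority_list:
        if language_range == "*":
            continue  # the wildcard range is skipped in lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return available[by_lower[candidate]]
            # Truncate from the end; a single-character subtag left
            # dangling at the end is removed along with what followed it.
            subtags.pop()
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()

    # No match at all: use the first available text.
    return next(iter(available.values()), None)


if __name__ == "__main__":
    texts = {"fr": "Bonjour", "en": "Hello"}
    print(lookup(["en-US"], texts))  # en-US falls back to en
    print(lookup(["ja"], texts))     # no match, first available text
```

With only "fr" and "en" data present, a user whose locale is en-US gets the "en" text, and a user whose locale matches nothing gets the first entry.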

-- 
John Cowan   cowan@ccil.org    http://ccil.org/~cowan
If a soldier is asked why he kills people who have done him no harm, or a
terrorist why he kills innocent people with his bombs, they can always
reply that war has been declared, and there are no innocent people in an
enemy country in wartime.  The answer is psychotic, but it is the answer
that humanity has given to every act of aggression in history.  --Northrop Frye
Received on Tuesday, 24 May 2011 17:45:59 GMT