- From: Phillips, Addison <addison@lab126.com>
- Date: Tue, 24 May 2011 10:34:30 -0700
- To: Felix Sasaki <felix.sasaki@fh-potsdam.de>, Chris Lilley <chris@w3.org>
- CC: "www-international@w3.org" <www-international@w3.org>, "www-font@w3.org" <www-font@w3.org>
- Message-ID: <131F80DEA635F044946897AFDA9AC3476A931EFA98@EX-SEA31-D.ant.amazon.com>
Hi Chris,

In addition to Felix’s comment, let me add: the RFC 3066 language tag grammar is a *superset* of the BCP 47 grammar, that is, the value space it represents is *less* constrained than the current BCP 47 value space. This is by design: all BCP 47 well-formed language tags are also well-formed in RFC 3066 terms. Some RFC 3066 “well-formed” tags are not well-formed or valid in BCP 47 terms, but such tags were never valid language tags.

The lexical space defined by xs:language is the same as the BCP 47 (RFC 5646) production “obs-language-tag”. See Section 2.2.9 (Classes of Conformance) [1].

For reasons of compatibility, it makes sense to allow the larger (RFC 3066) range of language tags to be “well-formed” (in the XML sense), even though the reference is to BCP 47 and language tag validators should use the stricter requirements of BCP 47. Note that language tag matching (BCP 47, RFC 4647) depends only on obs-language-tag.

Hope that helps,

Addison

Addison Phillips
Globalization Architect (Lab126)
Editor (IETF BCP 47)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.

[1] http://tools.ietf.org/html/bcp47#section-2.2.9

From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Felix Sasaki
Sent: Tuesday, May 24, 2011 10:16 AM
To: Chris Lilley
Cc: www-international@w3.org; www-font@w3.org
Subject: Re: Declaring xml:lang in a RelaxNG schema

Hello Chris,

FYI, the XML Schema for xml:lang at http://www.w3.org/2001/xml.xsd already refers to BCP 47, not RFC 3066.
Note also that that definition encompasses the empty value:

    <xs:simpleType>
      <xs:union memberTypes="xs:language">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value=""/>
          </xs:restriction>
        </xs:simpleType>
      </xs:union>
    </xs:simpleType>

So does XML Schema 1.1:
http://www.w3.org/TR/2009/WD-xmlschema11-2-20091203/datatypes.html#language

Regards,

Felix

2011/5/24 Chris Lilley <chris@w3.org>

Hello www-international,

I'm updating a RelaxNG schema to use xml:lang rather than its own lang attribute. (In fact, the WOFF schema, and at the request of the I18N Core WG.) But I ran into a problem in terms of the datatypes and wanted to be sure how to proceed, hence this email.

The snippet I am using is

    <optional>
      <attribute name="lang" ns="http://www.w3.org/XML/1998/namespace">
        <value type="language"/>
      </attribute>
    </optional>

because RNG uses the type system from XML Schema Part 2: Datatypes
http://www.oasis-open.org/committees/relax-ng/spec-20011203.html#IDA0ZYR

However, XML Schema datatypes seems to over-constrain the language type such that it can only be an RFC 3066-compatible string:

    [Definition:] language represents natural language identifiers as defined by by [RFC 3066]. The ·value space· of language is the set of all strings that are valid language identifiers as defined [RFC 3066] . The ·lexical space· of language is the set of all strings that conform to the pattern [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})* . The ·base type· of language is token.

    http://www.w3.org/TR/xmlschema-2/#language

My understanding was that the recommended practice was to use BCP 47
http://tools.ietf.org/rfc/bcp/bcp47.txt
which is currently a concatenation of RFC 5646 and RFC 4647.

Should I just drop the datatype (so it takes the default value of 'token')? Is there a better definition of a BCP 47 language type that I should reference instead? Should the schema datatype be deprecated, or is it planned to update it?
I had a look at the I18N QA on xml:lang
http://www.w3.org/International/questions/qa-when-xmllang.en.php
but that is more about use in a document instance than use in a schema definition; and also, it references RFC 3066, not BCP 47.

(In terms of WOFF last call, this relates to WOFF Issue 7 "xsd:NCName is too constraining for lang attributes" and WOFF Issue 9 "I18n-ISSUE-2: Why not using xml:lang?")

--
Chris Lilley
Technical Director, Interaction Domain
W3C Graphics Activity Lead, Fonts Activity Lead
Co-Chair, W3C Hypertext CG
Member, CSS, WebFonts, SVG Working Groups
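
For readers who want to see how loose the xs:language lexical space is, here is a minimal Python sketch of the pattern quoted in the thread, together with the empty-string member that the xml.xsd union adds. The function names `is_xs_language` and `is_xml_lang_value` are made up for illustration and come from no schema library. The sketch checks only the lexical pattern. It does not implement BCP 47 well-formedness or validity.

```python
import re

# Lexical pattern for xs:language as quoted in the thread from
# XML Schema Part 2: [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*
XS_LANGUAGE = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_xs_language(tag: str) -> bool:
    """Return True if the whole string matches the xs:language pattern.

    Hypothetical helper: a lexical check only, not a BCP 47 validator.
    """
    return XS_LANGUAGE.fullmatch(tag) is not None


def is_xml_lang_value(value: str) -> bool:
    """Return True for values allowed by the xml.xsd union quoted above:
    either an xs:language string or the empty string.
    """
    return value == "" or is_xs_language(value)


if __name__ == "__main__":
    # "en-a" matches the loose pattern even though BCP 47 requires a
    # singleton subtag such as "a" to be followed by an extension subtag.
    for candidate in ["en-US", "zh-Hant-TW", "en-a", "en_US", ""]:
        print(repr(candidate), is_xs_language(candidate),
              is_xml_lang_value(candidate))
```

A tag such as "en-a" passes this check, which illustrates the point made above: the pattern accepts the larger RFC 3066-style range, and a language tag validator would still need to apply the stricter BCP 47 rules on top of it.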
Received on Tuesday, 24 May 2011 17:34:59 UTC