- From: Felix Sasaki <felix.sasaki@fh-potsdam.de>
- Date: Tue, 24 May 2011 19:16:02 +0200
- To: Chris Lilley <chris@w3.org>
- Cc: www-international@w3.org, www-font@w3.org
- Message-ID: <BANLkTimwqdYA-bnKw+vMDYTx=xa6CsT-cQ@mail.gmail.com>
Hello Chris,

FYI, the XML Schema for xml:lang at
http://www.w3.org/2001/xml.xsd
already refers to BCP 47, not RFC 3066. Note also that that definition encompasses the empty value:

  <xs:simpleType>
    <xs:union memberTypes="xs:language">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value=""/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

So does XML Schema 1.1
http://www.w3.org/TR/2009/WD-xmlschema11-2-20091203/datatypes.html#language

Regards,
Felix

2011/5/24 Chris Lilley <chris@w3.org>

> Hello www-international,
>
> I'm updating a RelaxNG schema to use xml:lang rather than its own lang
> attribute. (In fact, the WOFF schema, and at the request of the I18N Core
> WG). But I ran into a problem in terms of the datatypes and wanted to be
> sure how to proceed, hence this email.
>
> The snippet I am using is
>
> <optional>
>   <attribute name="lang" ns="http://www.w3.org/XML/1998/namespace">
>     <value type="language"/>
>   </attribute>
> </optional>
>
> because RNG uses the type system from XML Schema part 2: datatypes
> http://www.oasis-open.org/committees/relax-ng/spec-20011203.html#IDA0ZYR
>
> however, XML Schema datatypes seems to over-constrain the language type
> such that it can only be an RFC3066-compatible string:
>
>
> [Definition:] language represents natural language identifiers as defined
> by by [RFC 3066]. The ·value space· of language is the set of all strings
> that are valid language identifiers as defined [RFC 3066] . The ·lexical
> space· of language is the set of all strings that conform to the pattern
> [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})* . The ·base type· of language is token.
> http://www.w3.org/TR/xmlschema-2/#language
>
> My understanding was that the recommended practice was to use BCP47
> http://tools.ietf.org/rfc/bcp/bcp47.txt
> which is currently a concatenation of RFC 5646 and RFC 4647.
>
> Should I just drop the datatype (so it takes the default value of 'token')?
> Is there a better definition of a BCP47 language type that I should
> reference instead? Should the schema datatype be deprecated, or is it
> planned to update it?
>
> I had a look at the I18N QA on xml:lang
> http://www.w3.org/International/questions/qa-when-xmllang.en.php
> but that is more about use in a document instance than use in a schema
> definition; and also, it references RFC 3066 not BCP47.
>
>
> (In terms of WOFF last call, this relates to WOFF Issue 7 "xsd:NCName is
> too constraining for lang attributes" and WOFF Issue 9 "I18n-ISSUE-2: Why
> not using xml:lang? ")
>
>
> --
> Chris Lilley   Technical Director, Interaction Domain
> W3C Graphics Activity Lead, Fonts Activity Lead
> Co-Chair, W3C Hypertext CG
> Member, CSS, WebFonts, SVG Working Groups
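
To make the effect of the two definitions above concrete, here is a rough sketch of the lexical check they imply: the xs:language pattern quoted from XML Schema Part 2, plus the empty value that the xml.xsd union adds. The helper name is_valid_xml_lang is made up for illustration and is not part of any schema library. It tests lexical form only; it does not check subtags against BCP 47 or its registry, and it does not apply the whitespace collapsing that a token-derived type would get.

```python
import re

# Lexical pattern for xs:language as quoted from XML Schema Part 2.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_valid_xml_lang(value: str) -> bool:
    """Hypothetical helper: lexical check only, not BCP 47 validation.

    Accepts the empty string (the extra member of the xml.xsd union)
    or any string matching the xs:language lexical pattern in full.
    """
    return value == "" or LANGUAGE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["", "en", "en-US", "zh-Hant-TW", "en_US", "abcdefghi"]:
        print(repr(candidate), is_valid_xml_lang(candidate))
```

Under this pattern a string such as "en_US" (underscore) or a nine-letter first subtag is rejected, while the empty string is accepted only because of the union member, not because of the pattern itself.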
Received on Tuesday, 24 May 2011 17:16:30 UTC