- From: Felix Sasaki <felix.sasaki@fh-potsdam.de>
- Date: Tue, 24 May 2011 19:16:02 +0200
- To: Chris Lilley <chris@w3.org>
- Cc: www-international@w3.org, www-font@w3.org
- Message-ID: <BANLkTimwqdYA-bnKw+vMDYTx=xa6CsT-cQ@mail.gmail.com>
Hello Chris,
FYI, the XML Schema for xml:lang at
http://www.w3.org/2001/xml.xsd
already refers to BCP 47, not RFC 3066. Note also that the definition
there encompasses the empty value:
<xs:simpleType>
  <xs:union memberTypes="xs:language">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value=""/>
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>
XML Schema 1.1 also refers to BCP 47:
http://www.w3.org/TR/2009/WD-xmlschema11-2-20091203/datatypes.html#language
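To make the effect of that union concrete, here is a rough Python sketch (not taken from either schema) of the lexical check it implies: a value is accepted if it is empty, or if it matches the xs:language pattern from XML Schema 1.0 Part 2, [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*. The name is_valid_xml_lang is made up for illustration, and the sketch assumes whitespace has already been collapsed and does not check subtags against the IANA registry.

```python
import re

# Lexical pattern for xs:language as given in XML Schema 1.0 Part 2.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_valid_xml_lang(value: str) -> bool:
    """Hypothetical helper mimicking the union of xs:language and ""."""
    # Second member of the union: the enumeration containing only "".
    if value == "":
        return True
    # First member of the union: the xs:language lexical space.
    return LANGUAGE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for tag in ["en", "zh-Hant-TW", "", "en_US", "abcdefghi"]:
        print(repr(tag), is_valid_xml_lang(tag))
```

This is only a lexical test: tags such as zh-Hant-TW pass, but so does any other string of the right shape.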
Regards,
Felix
2011/5/24 Chris Lilley <chris@w3.org>
> Hello www-international,
>
> I'm updating a RelaxNG schema to use xml:lang rather than its own lang
> attribute. (In fact, the WOFF schema, and at the request of the I18N Core
> WG). But I ran into a problem in terms of the datatypes and wanted to be
> sure how to proceed, hence this email.
>
> The snippet I am using is
>
> <optional>
>   <attribute name="lang" ns="http://www.w3.org/XML/1998/namespace">
>     <value type="language"/>
>   </attribute>
> </optional>
>
> because RNG uses the type system from XML Schema Part 2: Datatypes
> http://www.oasis-open.org/committees/relax-ng/spec-20011203.html#IDA0ZYR
>
> However, XML Schema Datatypes seems to over-constrain the language type
> such that it can only be an RFC 3066-compatible string:
>
>
> [Definition:] language represents natural language identifiers as defined
> by by [RFC 3066]. The ·value space· of language is the set of all strings
> that are valid language identifiers as defined [RFC 3066] . The ·lexical
> space· of language is the set of all strings that conform to the pattern
> [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})* . The ·base type· of language is token.
> http://www.w3.org/TR/xmlschema-2/#language
>
> My understanding was that the recommended practice was to use BCP47
> http://tools.ietf.org/rfc/bcp/bcp47.txt
> which is currently a concatenation of RFC 5646 and RFC 4647.
>
> Should I just drop the datatype (so it takes the default value of 'token')?
> Is there a better definition of a BCP47 language type that I should
> reference instead? Should the schema datatype be deprecated, or is it
> planned to update it?
>
> I had a look at the I18N QA on xml:lang
> http://www.w3.org/International/questions/qa-when-xmllang.en.php
> but that is more about use in a document instance than use in a schema
> definition; and also, it references RFC 3066, not BCP47.
>
>
> (In terms of WOFF last call, this relates to WOFF Issue 7 "xsd:NCName is
> too constraining for lang attributes" and WOFF Issue 9 "I18n-ISSUE-2: Why
> not using xml:lang? ")
>
>
> --
> Chris Lilley Technical Director, Interaction Domain
> W3C Graphics Activity Lead, Fonts Activity Lead
> Co-Chair, W3C Hypertext CG
> Member, CSS, WebFonts, SVG Working Groups
>
>
>
Received on Tuesday, 24 May 2011 17:17:32 UTC