- From: Johannes Koch <koch@w3development.de>
- Date: Wed, 14 Apr 2010 10:28:43 +0200
- To: Stephane Corlosquet <scorlosquet@gmail.com>
- Cc: sioc-dev@googlegroups.com, Public RDFa <public-rdfa@w3.org>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Hi Stephane,

Stephane Corlosquet wrote:
> Can anyone confirm whether xml:lang="" is valid or not? The XML 1.0 [6] says
> it's valid but I'm not sure if this applies to XHTML+RDFa. Is the last claim
> regarding the W3C validator reporting success on invalid markup true?
[...]
> [6] http://www.w3.org/TR/REC-xml/#sec-lang-tag

Simple question, long answer (sorry, but sometimes life is not black or white :-).

Indeed, the cited text (<http://www.w3.org/TR/REC-xml/#sec-lang-tag>) says:

| in addition, the empty string may be specified.

and later:

| In particular, the empty value of xml:lang is used on an element B to
| override a specification of xml:lang on an enclosing element A,
| without specifying another language.

However...

For XHTML 1.0 (somewhere in <http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_XHTML-1.0-Strict>):

| xml:lang language code (as per XML 1.0 spec)

and

| xml:lang %LanguageCode; #IMPLIED

with

| <!ENTITY % LanguageCode "NMTOKEN">

Looking up NMTOKEN in XML 1.0 (<http://www.w3.org/TR/REC-xml/#nmtok>):

| Values of type NMTOKEN MUST match the Nmtoken production

and (<http://www.w3.org/TR/REC-xml/#NT-Nmtoken>):

| [7] Nmtoken ::= (NameChar)+

<http://www.w3.org/TR/REC-xml/#NT-NameChar>:

| [4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7
|                   | [#x0300-#x036F] | [#x203F-#x2040]

(<http://www.w3.org/TR/REC-xml/#NT-NameStartChar>):

| [4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6]
|                       | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D]
|                       | [#x37F-#x1FFF] | [#x200C-#x200D]
|                       | [#x2070-#x218F] | [#x2C00-#x2FEF]
|                       | [#x3001-#xD7FF] | [#xF900-#xFDCF]
|                       | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

Nmtoken requires at least one NameChar, so (formally) an empty string is not
an NMTOKEN and therefore not a valid value for the xml:lang attribute as
defined in the XHTML 1.0 Strict DTD.
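
To make the NMTOKEN argument concrete, here is a minimal sketch in Python that transcribes productions [4], [4a] and [7] quoted above into a regular expression. The names NAME_START_CHAR, NAME_CHAR and is_nmtoken are made up for this illustration; this is not how a validating parser is actually implemented, only a check of the grammar as cited.

```python
import re

# Character ranges of NameStartChar, transcribed from production [4].
NAME_START_CHAR = (
    r":A-Z_a-z"
    r"\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    r"\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    r"\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    r"\uF900-\uFDCF\uFDF0-\uFFFD"
    r"\U00010000-\U000EFFFF"
)

# NameChar (production [4a]) adds "-", ".", digits, #xB7 and two more ranges.
NAME_CHAR = NAME_START_CHAR + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

# Nmtoken (production [7]) is one or more NameChar; the "+" is what
# excludes the empty string.
NMTOKEN_RE = re.compile("[" + NAME_CHAR + "]+")


def is_nmtoken(value: str) -> bool:
    """Return True if the whole string matches the Nmtoken production."""
    return NMTOKEN_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["en", "en-US", ""]:
        print(repr(candidate), is_nmtoken(candidate))
```

With this sketch, "en" and "en-US" match while the empty string does not, which is the formal reason xml:lang="" fails against a DTD that declares the attribute as NMTOKEN.
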
For XHTML languages based on XHTML Modularization (10 April 2001 version),
xml:lang is mentioned in prose in
<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/abstract_modules.html#s_commonatts>

| xml:lang (NMTOKEN)

and defined in the DTD module
(<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/dtd_module_defs.html#a_module_XHTML_Common_Attribute_Definitions>)

| xml:lang %LanguageCode.datatype; #IMPLIED

with
(<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/dtd_module_defs.html#dtdentry_LanguageCode.datatype>)

| <!ENTITY % LanguageCode.datatype "NMTOKEN" >

So, same result as for XHTML 1.0.

The revision (XHTML Modularization 1.1) mentions xml:lang in
<http://www.w3.org/TR/2008/REC-xhtml-modularization-20081008/abstract_modules.html#s_commonatts>

| xml:lang (CDATA)

and in the DTD module
(<http://www.w3.org/TR/2008/REC-xhtml-modularization-20081008/dtd_module_defs.html#a_module_XHTML_Common_Attribute_Definitions>)

| xml:lang %LanguageCode.datatype; #IMPLIED

with
(<http://www.w3.org/TR/2008/REC-xhtml-modularization-20081008/dtd_module_defs.html#a_module_XHTML_Datatypes>):

| <!ENTITY % LanguageCode.datatype "CDATA" >

The XML schema module references xml:lang in
<http://www.w3.org/TR/2008/REC-xhtml-modularization-20081008/schema_module_defs.html#a_module_XHTML_Datatypes>:

| <xs:attribute ref="xml:lang" />

from <http://www.w3.org/2001/xml.xsd>:

| The union allows for the 'un-declaration' of xml:lang with the empty
| string.
|
| Formal declaration in XSD source form
|
| <xs:attribute name="lang">
|   <xs:annotation>
|     <xs:documentation>
|       <div>
|
|         <h3>lang (as an attribute name)</h3>
|         <p>
|           denotes an attribute whose value
|           is a language code for the natural language of the content of
|           any element; its value is inherited. This name is reserved
|           by virtue of its definition in the XML specification.</p>
|
|       </div>
|       <div>
|         <h4>Notes</h4>
|         <p>
|           Attempting to install the relevant ISO 2- and 3-letter
|           codes as the enumerated possible values is probably never
|           going to be a realistic possibility.
|         </p>
|         <p>
|           See BCP 47 at <a
|           href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">
|           http://www.rfc-editor.org/rfc/bcp/bcp47.txt</a>
|           and the IANA language subtag registry at
|           <a
|           href="http://www.iana.org/assignments/language-subtag-registry">
|           http://www.iana.org/assignments/language-subtag-registry</a>
|           for further information.
|         </p>
|         <p>
|           The union allows for the 'un-declaration' of xml:lang with
|           the empty string.
|         </p>
|       </div>
|     </xs:documentation>
|   </xs:annotation>
|   <xs:simpleType>
|     <xs:union memberTypes="xs:language">
|       <xs:simpleType>
|         <xs:restriction base="xs:string">
|           <xs:enumeration value=""/>
|         </xs:restriction>
|       </xs:simpleType>
|     </xs:union>
|   </xs:simpleType>
| </xs:attribute>

So, in languages based on XHTML Modularization 1.1, the empty string is
(formally) DTD-valid and XML-Schema-valid.

In the DTD for "XHTML 1.1 + RDFa"
(<http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd>):

| xml:lang %LanguageCode.datatype; #IMPLIED

with (<http://www.w3.org/MarkUp/DTD/xhtml-datatypes-1.mod>)

| <!ENTITY % LanguageCode.datatype "CDATA" >

So, in "XHTML 1.1 + RDFa" the empty string is (formally) DTD-valid.

-- 
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Received on Wednesday, 14 April 2010 08:29:28 UTC