XHTML 1.0 [XHTML 1.0] is a reformulation of the three HTML 4 document types as applications of XML 1.0. HTML is an SGML (Standard Generalized Markup Language) application conforming to International Standard ISO 8879, and is widely regarded as the standard publishing language of the World Wide Web.
In XHTML 1.0, the XHTML namespace may be used with other XML namespaces as per [XMLNS], but such documents are not strictly conforming XHTML 1.0 documents [XHTML 1.0 NS].
An example of such a non-conformant XHTML 1.0 document is as follows.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:its="http://www.w3.org/2005/11/its" lang="en" xml:lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="keywords" content="ITS example, XHTML translation" />
    <its:documentRules>
      <its:ns prefix="h" uri="http://www.w3.org/1999/xhtml" />
      <its:translateRule its:selector="//h:meta[@name='keywords']/@content" its:translate="yes" />
      <its:termRule its:selector="//h:span[@class='term']" />
    </its:documentRules>
    <title>ITS Working Group</title>
  </head>
  <body>
    <h1>Test of ITS on <span class="term">XHTML</span></h1>
    <p>Some text to translate.</p>
    <p its:translate="no">Some text not to translate.</p>
  </body>
</html>
The way to use ITS with XHTML while keeping the XHTML document conformant is to use external ITS document rules. Even local information within the document that would otherwise be handled by ITS attributes can be set indirectly, for example by selecting elements through an existing XHTML attribute such as class.
ITS external document rules:
<its:documentRules xmlns:its="http://www.w3.org/2005/11/its">
  <its:ns prefix="h" uri="http://www.w3.org/1999/xhtml" />
  <its:translateRule its:selector="//h:meta[@name='keywords']/@content" its:translate="yes" />
  <its:translateRule its:selector="//h:p[@class='notrans']" its:translate="no" />
  <its:termRule its:selector="//h:span[@class='term']" />
</its:documentRules>
XHTML document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="keywords" content="ITS example, XHTML translation" />
    <title>ITS Working Group</title>
  </head>
  <body>
    <h1>Test of ITS on <span class="term">XHTML</span></h1>
    <p>Some text to translate.</p>
    <p class="notrans">Some text not to translate.</p>
  </body>
</html>
A number of XHTML constructs implement the same semantics as some of the ITS data categories. In addition, some XHTML attributes are translatable, which is not the default for XML documents according to the ITS default settings. These attributes need to be identified as translatable.
An external ITS <documentRules> element can summarize these relations. Because XHTML use is widespread and covers a large amount of legacy material, the rules defined here may not be optimal for everyone.
<its:documentRules xmlns:its="http://www.w3.org/2005/11/its">
  <its:ns its:prefix="h" its:uri="http://www.w3.org/1999/xhtml"/>

  <!-- special content. (See note 1) -->
  <its:translateRule its:selector="//h:script" its:translate="no"/>
  <its:translateRule its:selector="//h:style" its:translate="no"/>

  <!-- Normal translatable attributes -->
  <its:translateRule its:selector="//h:*/@abbr" its:translate="yes"/>
  <its:translateRule its:selector="//h:*/@accesskey" its:translate="yes"/>
  <its:translateRule its:selector="//h:*/@alt" its:translate="yes"/>
  <its:translateRule its:selector="//h:*/@prompt" its:translate="yes"/>
  <its:translateRule its:selector="//h:*/@standby" its:translate="yes"/>
  <its:translateRule its:selector="//h:*/@summary" its:translate="yes"/>
  <its:translateRule its:selector="//h:*/@title" its:translate="yes"/>

  <!-- The input element (Important: See note 2) -->
  <its:translateRule its:selector="//h:input/@value" its:translate="yes"/>
  <its:translateRule its:selector="//h:input[@type='hidden']/@value" its:translate="no"/>

  <!-- Non-translatable element (See note 3) -->
  <its:translateRule its:selector="//h:del" its:translate="no"/>
  <its:translateRule its:selector="//h:del/descendant-or-self::*/@*" its:translate="no"/>

  <!-- Often-used translatable meta content. -->
  <its:translateRule its:selector="//h:meta[@name='keywords']/@content" its:translate="yes"/>
  <its:translateRule its:selector="//h:meta[@name='description']/@content" its:translate="yes"/>

  <!-- Possible term (Important: See note 4) -->
  <its:termRule its:selector="//h:dt" its:term="yes"/>

  <!-- Bidirectional information -->
  <its:dirRule its:selector="//h:*[@dir='ltr']" its:dir="ltr"/>
  <its:dirRule its:selector="//h:*[@dir='rtl']" its:dir="rtl"/>
  <its:dirRule its:selector="//h:bdo[@dir='ltr']" its:dir="lro"/>
  <its:dirRule its:selector="//h:bdo[@dir='rtl']" its:dir="rlo"/>

  <!-- Elements within text -->
  <its:withinTextRule its:withinText="yes"
    its:selector="//h:abbr | //h:acronym | //h:br | //h:cite | //h:code | //h:dfn
      | //h:kbd | //h:q | //h:samp | //h:span | //h:strong | //h:var | //h:b
      | //h:em | //h:big | //h:hr | //h:i | //h:small | //h:sub | //h:sup
      | //h:tt | //h:del | //h:ins | //h:bdo | //h:img | //h:a | //h:font
      | //h:center | //h:s | //h:strike | //h:u | //h:isindex" />
</its:documentRules>
Note 1 - The <script> and <style> elements may have translatable text, but their content needs to be parsed with a script filter and a CSS filter, respectively. Depending on the capability of your translation tools, you may want to leave these elements translatable.
Note 2 - The value attribute of the <input> element may or may not be translatable depending on the way the element is used. Whether to select value as translatable needs to be decided based on your own use.
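As an illustration of Note 2, the following sketch shows one possible alternative to the two input rules in the rule set above. It assumes that, in your documents, only the labels of submit, reset, and button controls are shown to the user; that assumption is not part of the rules defined here and may not hold for your content. The sketch follows the attribute syntax used in the rule set above.

```xml
<its:documentRules xmlns:its="http://www.w3.org/2005/11/its">
  <its:ns its:prefix="h" its:uri="http://www.w3.org/1999/xhtml"/>
  <!-- Assumption: value is visible text only on button-like controls -->
  <its:translateRule
    its:selector="//h:input[@type='submit' or @type='reset' or @type='button']/@value"
    its:translate="yes"/>
</its:documentRules>
```

Because attributes are not translatable by default, the value attribute of every other input type (text, hidden, checkbox, and so on) is left untranslatable without needing a rule of its own.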
Note 3 - The <del> element indicates removed text and therefore, most often, would not be translatable. Because this element may contain elements with translatable attributes, such as <img> with alt, and because the scope of translatability does not include attributes, you need to: a) define this rule after the definition of the translatable attributes, and b) use the rule with its:selector="//h:del/descendant-or-self::*/@*" to override any possible translatable attribute within a <del> element or any of its descendants.
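To make Note 3 concrete, consider the following XHTML fragment. It is a hypothetical example: the text, file names, and alt values are invented for illustration.

```xml
<p xmlns="http://www.w3.org/1999/xhtml">The meeting is on
  <del>Monday <img src="monday.png" alt="Calendar page for Monday" /></del>
  <ins>Tuesday <img src="tuesday.png" alt="Calendar page for Tuesday" /></ins>.
</p>
```

With the rule set above, the selector //h:*/@alt first marks both alt attributes as translatable. The rule for //h:del then marks the text "Monday" as not translatable, but it does not reach the alt attribute of the nested <img>. Only the later rule with //h:del/descendant-or-self::*/@* resets that attribute to not translatable, which is why it has to come after the attribute rules. The alt attribute inside <ins> is unaffected and stays translatable.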
Note 4 - The <dt> element is defined by HTML as a "definition term" and can therefore be seen as a candidate to be associated with the ITS Terminology data category. However, for historical reasons, this element has been used for many other purposes. Whether to select <dt> as a term needs to be decided based on your own use.
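If <dt> is used as a term only in some lists, the term rule can be narrowed rather than dropped. The sketch below assumes that genuine glossaries are marked up as <dl class="glossary">; the class name is hypothetical and would need to match whatever convention your documents actually follow.

```xml
<its:documentRules xmlns:its="http://www.w3.org/2005/11/its">
  <its:ns its:prefix="h" its:uri="http://www.w3.org/1999/xhtml"/>
  <!-- Assumption: only definition lists with class="glossary" hold real terms -->
  <its:termRule its:selector="//h:dl[@class='glossary']/h:dt" its:term="yes"/>
</its:documentRules>
```

A <dt> in any other definition list is then not selected by the rule and is not treated as a term.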