The following specification is an extension to HTML5, including XHTML5 documents and documents that conform to the Polyglot Markup profile.
It specifies how to use the xml:id attribute in XML well-formed HTML5 documents so that authors MAY use XML applications that rely on tokenized id attributes of XML ID type for referring to named fragments (idrefs) in such documents.
Documents conforming to this specification may be parsed by any XML parser, but will cause (non-fatal) XML errors if the parser (via DTD, schema, default or otherwise) performs ID-type assignment for both the xml:id attribute and the id attribute.
While HTML5 operates with idrefs and defines the id attribute as the format’s idref container, some HTML consumers of the XML kind rely on idrefs of the XML ID type (ID-type assigned attributes) for such purposes. [[!xml]]
The XHTML 1.x family of HTML documents included DOCTYPE declarations that pointed to DTDs that defined the id attribute as being of such an XML ID type. This meant that, when consumed as XML, validating XML processors could consume HTML’s id attribute as being of XML ID type.
With the HTML5 specification, the reference to a DTD has been removed from the DOCTYPE declaration, and no other applicable DOCTYPE declaration that points to a DTD has at this time been specified, whether for HTML5 or for XHTML5.
For this class of HTML consumers, HTML5 documents thus need XML parsers that do not rely on reference to a DTD for the assignment of ID type to the id attribute. However, out of the box, most XML tools support ID assignment for the xml:id attribute but not for HTML’s id attribute.
For this class of XML consumers, HTML5 documents thus need an applicable specification which specifies how to use the xml:id attribute, and it is this use case that this specification aims to solve.
To solve this problem, this specification recommends that HTML’s id attribute be duplicated with an xml:id attribute. The xml:id attribute is specified to be of XML ID type without depending on a DTD, and most XML tools, out of the box, apply ID assignment to xml:id without performing non-DTD-based ID assignment for the id attribute of HTML documents. By adding this attribute to documents as necessary, authors may continue to use the XML implementations that require idrefs to be of XML ID type when consuming HTML5 and XHTML5 documents.
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this specification are to be interpreted as described in [[!RFC2119]].
The id attribute in the XHTML namespace, id, is defined by the HTML specification. [[!HTML5]]
The id attribute in the XML namespace, xml:id, is defined by the xml:id specification. [[!XML-ID]]
HTML elements in HTML documents and HTML elements in XML documents MUST NOT replace the id attribute with the id attribute in the XML namespace.
To allow HTML documents and XML documents to be (post-)processed by XML processing tools that rely on attributes of the XML ID type for ID references, authors MAY specify an attribute in no namespace with no prefix and with the literal localname “xml:id” on HTML elements in HTML documents, or the id attribute in the XML namespace on HTML elements in XML documents, but only if an id attribute in no namespace is also specified for the same element and with the same value, byte-for-byte, in both attributes.
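The pairing rule above can be checked mechanically for documents served as XML. The following is a minimal sketch, not part of this specification: it assumes Python's standard `xml.etree.ElementTree` parser, and the function name `find_id_mismatches` and the sample document are hypothetical. Note that it compares attribute values after XML parsing, which only approximates the byte-for-byte comparison required above.

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def find_id_mismatches(xml_text):
    """Return (tag, id, xml:id) for every element that breaks the pairing rule."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        xml_id = elem.get(XML_ID)
        if xml_id is None:
            continue  # duplicating id with xml:id is optional
        plain_id = elem.get("id")
        # xml:id is only allowed alongside an id with the identical value.
        if plain_id != xml_id:
            problems.append((elem.tag, plain_id, xml_id))
    return problems


# Hypothetical sample document: two of the four elements violate the rule.
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body id="MyBody" xml:id="MyBody">
    <p id="intro">Only id: allowed.</p>
    <p xml:id="orphan">Only xml:id: not allowed.</p>
    <p id="a" xml:id="b">Different values: not allowed.</p>
  </body>
</html>"""

for tag, plain_id, xml_id in find_id_mismatches(doc):
    print(tag, plain_id, xml_id)
```

An empty result means that every xml:id in the document is accompanied by an id with the same value.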
Using xml:id independently of id is not permitted because:
The document should remain usable by non-xml:id supporting processors (for instance, if XML tools start to support id, then one should be able to consume the document without editing it first).
Since the xml:id attribute, being of XML ID type, is expected to represent the element’s unique identifier, something which the id attribute is expected to do as well, it makes sense to require that the two are identical and thus represent one and the same identifier.
There is a precedent in the name attribute in XHTML1, which for the anchor element shared the same “name space” as the id attribute, and the recommendation was for the attributes to have identical values. Keeping the values identical makes it simple to add or remove xml:id and can help prevent misguided usage.
It is OPTIONAL whether all, some, one or none of the id attributes are duplicated with an xml:id attribute. That is, duplicating one id attribute with an xml:id attribute does not imply that all the id attributes have to be duplicated with xml:id attributes.
xml:id as for id
The permitted attribute values for id and xml:id, when both are used on HTML elements in HTML documents or on HTML elements in XML documents, are the common subset of the constraints of the id attribute and the constraints of the xml:id attribute:
ID type. (Required by xml:id)
(This is different from the lang attribute, which from the outset shares the same syntax rules as xml:lang.)
For a more complete, but non-normative, list of the forbidden characters for xml:id, see the appendix.
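As an informal illustration of the shared value space, the sketch below tests whether a value is usable for both attributes. It rests on two assumptions that are not stated in this section: that HTML5 only requires an id value to be non-empty and free of space characters, and that xml:id requires an NCName, taken here from the Name production of XML 1.0 (Fifth Edition) with the colon removed. Under those assumptions the common subset is simply the set of NCNames. The function name `is_shared_id_value` is hypothetical.

```python
import re

# Name start characters from the XML 1.0 (Fifth Edition) Name production,
# with ":" left out because an NCName may not contain a colon (assumption).
_START = (
    "A-Za-z_"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# Characters additionally allowed after the first position.
_REST = _START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NCNAME = re.compile("[" + _START + "][" + _REST + "]*")


def is_shared_id_value(value):
    """True if value is an NCName, and therefore also non-empty and space-free."""
    return NCNAME.fullmatch(value) is not None


for candidate in ["MyBody", "my body", "1st", "a:b", ""]:
    print(repr(candidate), is_shared_id_value(candidate))
```

Of the sample values, only "MyBody" passes: the others contain a space, start with a digit, contain a colon, or are empty.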
The attribute in no namespace and with no prefix and with the literal localname “xml:id” has no effect on idref processing when consumed as HTML. The ID type of the attribute “id” in the XML namespace in XML documents is not expected to be assigned by Web browsers, as they typically do not implement XPointer and are at this point expected to continue not to support it.
How to determine the idref for XHTML5 and HTML5 documents is defined by the HTML5 spec.
How to determine the idref based on xml:id is defined by the xml:id specification.
If both the id attribute in no namespace and the id attribute in the XML namespace are set on an element, user agents will use the id attribute in the XML namespace if they support both, and the id attribute in no namespace will be ignored for the purposes of determining the element's id.
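The precedence described above can be sketched as a small lookup. This is only an illustration: the function `effective_id` and its `supports_xml_id` flag are hypothetical and stand in for whatever a given user agent actually implements.

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def effective_id(elem, supports_xml_id):
    """Pick the identifier a consumer would use for this element."""
    if supports_xml_id and elem.get(XML_ID) is not None:
        # A consumer that supports both attributes uses xml:id and ignores id.
        return elem.get(XML_ID)
    # Otherwise only the id attribute in no namespace is considered.
    return elem.get("id")


body = ET.fromstring('<body id="MyBody" xml:id="MyBody"/>')
print(effective_id(body, supports_xml_id=True))
print(effective_id(body, supports_xml_id=False))
```

In a conforming document both lookups give the same result, because the two attributes are required to carry the same value.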
Processing tools that apply a schema or other means in order to treat the HTML id attribute as being of XML “ID” type MUST NOT be used to consume documents that apply both id and xml:id on the same element, because it is an XML validation error if an element includes two attributes of XML “ID” type.
If the document references a DTD that defines the id attribute as of type ID (for instance a DTD from the XHTML 1.x family), then this specification does not apply. Note, however, that xml:id should not be applied to elements that already have an attribute of XML “ID” type because, again, it is an XML (validation) error if an element includes two attributes of XML “ID” type.
If the resulting value is not a valid xml:id value, and the parser supports xml:id, the parser could report an error; see the xml:id specification.
The xml:id IDL attribute ...
When the above rules are followed, the xml:id attribute MAY be used in any HTML or XHTML document provided the document fulfills the following requirements:
id attribute as being of ID type.
An author who aims to use xml:id and who also wants to adhere to the robustness principles of the Polyglot Markup profile SHOULD duplicate all id attributes with the xml:id attribute. Only then is the xml:id extension, as defined in this specification, considered to be compatible with the principles of the Polyglot Markup profile. Conformance with the principles of the Polyglot Markup profile for xml:id can be viewed along the same lines as for the xml:lang attribute: it is a polyglot feature as long as the element includes both xml:lang and lang, that is, xml:lang must be used on all elements where lang is used.
The Polyglot Markup specification defines a profile of HTML that is itself extensible, and which results in XML well-formed and HTML5-compatible HTML documents that are robust (semantics are preserved regardless of parsing) and identical (whether parsed as XML or as HTML).
While DOM equivalence, whether the document is parsed as an XML document or as an HTML document, is a core value of polyglot markup, there are some exceptions where prepping the document for as many parsers as possible wins over DOM equivalence: robustness wins over strict equivalence. A relevant example is xml:lang, which is permitted on the condition that an identical lang is used as well. The fact that xml:id is prefixed with xml: makes it similar to xml:lang. The DOM difference caused by HTML and XML’s differing handling of the xml: prefix is tolerated due to the semantic identity and in order to make polyglot markup supported by a wider set of XML parsers, namely parsers that do not support the lang attribute of the XHTML namespace. While XML parsers should see that xml:id belongs in the XML namespace, only XML parsers that implement xml:id will type-assign it as an ID attribute (Web browsers that support XHTML are not expected to implement xml:id).
The XInclude specifications define an element include in the XInclude namespace. The element can be used to combine different documents, or document fragments, into a new document [[xinclude]], [[xinclude-11]]. The reference to the ID is done via XPointer syntax in the xpointer attribute.
Here is an XML document “foo.xml” with an include element in the XInclude namespace, which points to a document “polyglot.html”:
<include xmlns="http://www.w3.org/2001/XInclude"
         href="http://dataormen.local/dataimport.xhtml"
         parse="xml"
         xpointer="MyBody"/>
Here is the code of the document “polyglot.html”, whose element with the id “MyBody” is referenced by the include element in the above file “foo.xml”.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Include my body element, please!</title>
    <meta charset="utf-8"/>
  </head>
  <body id="MyBody" xml:id="MyBody">
    <h1>Lorem ipsum.</h1>
    <p>Dolor sint.</p>
  </body>
</html>
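To illustrate why the xml:id attribute matters for this example, the sketch below mimics the lookup that an xml:id-aware processor performs when it resolves the pointer MyBody. It is not an XInclude implementation: it assumes Python's standard `xml.etree.ElementTree` parser, embeds the document as a string instead of fetching it, and uses a hypothetical helper named `resolve_by_xml_id`. The lookup deliberately ignores the plain id attribute, as a processor without DTD-based ID assignment would.

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The example document, embedded as a string for the purpose of this sketch.
POLYGLOT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Include my body element, please!</title>
    <meta charset="utf-8"/>
  </head>
  <body id="MyBody" xml:id="MyBody">
    <h1>Lorem ipsum.</h1>
    <p>Dolor sint.</p>
  </body>
</html>"""


def resolve_by_xml_id(root, name):
    """Return the first element whose xml:id equals name, or None."""
    for elem in root.iter():
        if elem.get(XML_ID) == name:
            return elem
    return None


root = ET.fromstring(POLYGLOT)
target = resolve_by_xml_id(root, "MyBody")
print(target.tag)
```

If the xml:id attribute were removed from the body element, this lookup would find nothing, which is the situation this specification is meant to avoid.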