html5/rdfa rdfa-module.html,NONE,1.1 source,NONE,1.1 from Manu Sporny via cvs-syncmail on 2009-07-12 (public-html-commits@w3.org from July 2009)

From: Manu Sporny via cvs-syncmail <cvsmail@w3.org>
Date: Sun, 12 Jul 2009 18:36:00 +0000
To: public-html-commits@w3.org
Message-Id: <E1MQ3u4-0005tE-Nc@lionel-hutz.w3.org>
Update of /sources/public/html5/rdfa
In directory hutz:/tmp/cvs-serv22460/rdfa

Added Files:
	rdfa-module.html source 
Log Message:
Added first editors draft of HTML5+RDFa module specification and 
anolis-generated HTML module.


--- NEW FILE: source ---
  <h2>
   <dfn>RDFa</dfn>
  </h2>
  <h3>
   Issues
  </h3>
  <p class="XXX">
    This section outlines a number of editorial issues with the RDFa 
    section of the HTML5 specification.
  </p>
  <p class="XXX">
   In order to provide a module that can be authored, inserted and moved
   easily within the HTML5 specification, the RDFa specification section is 
   being edited separately from the main HTML5 specification source file.
   There are two documents that are generated from the RDFa specification 
   source. The first is the full HTML5 specification, which includes the 
   RDFa specification section. The second is the stand-alone HTML5+RDFa 
   document.
  </p>
  <p class="XXX">
   The main benefit of generating two documents from the same source is
   load time in web browsers. Loading the 4MB HTML5 specification can be
   very slow, even in Firefox 3.5 or Chrome. Readers who only want the RDFa
   specification text can use the much smaller, separate document instead.
  </p>
  <p class="XXX">
   Unfortunately, this approach has two downsides. The first is that the
   specification language becomes more verbose. The second is that
   cross-references within the HTML5 document are impossible due to
   a bug/feature in the Anolis specification processor.
  </p>
  <p class="XXX">
   These downsides will be remedied once we find a way to either fix Anolis
   or integrate the RDFa document into the HTML5 specification.
  </p>
  <h3>
   Introduction
  </h3>
  <p>
   <em>This section is informative.</em>
  </p>
  <p>
   Starting in 2006, the Semantic Web Deployment Working Group
   began work to develop a technology to express semantic data in 
   XHTML 1.1. This technology was successfully developed and is now
   called RDFa (the Resource Description Framework in Attributes). 
   While HTML provides a mechanism to express the structure of a document
   (title, paragraphs, links), RDFa provides a mechanism to express 
   the meaning of a document (people, places, events).
  </p>
  <p>
   The document, titled "RDFa in XHTML: Syntax and Processing Rules" 
   [<a href="http://www.w3.org/TR/rdfa-syntax/">XHTML+RDFa</a>], defined
   a set of attributes and rules for processing those attributes that
   resulted in the output of machine-readable semantic data. While the
   document was specific to the XHTML 1.1 member in the HTML family, the
   attributes and rules were always intended to operate across any 
   tree-based structure containing attributes on tree nodes (such as HTML4, 
   SVG and ODF).
  </p>
  <p>
   While RDFa was initially specified for use in XHTML 1.1, adoption by
   a number of large organizations on the Web spurred RDFa's use in non-XHTML
   languages. Its use in HTML4 and HTML5, before an official specification 
   was developed for those languages, caused concern regarding document
   conformance.
  </p>
  <p>
   Over the years, the members of the RDFa Task Force 
   [<a href="http://rdfa.info/">RDFaTF</a>] had discussed the possibility 
   of applying the same attributes and processing rules outlined in the 
   XHTML+RDFa specification to all HTML family documents. By design, a
   unified semantic data expression mechanism across all HTML and XHTML
   family documents was always feasible.
  </p>
  <p>
   This section describes the modifications to the original XHTML+RDFa
   specification that permit the use of RDFa in all HTML family documents.
   By using the attributes and processing rules described in the 
   XHTML+RDFa specification and heeding the minor changes in this 
   section, authors can expect to generate markup that produces the same
   semantic data output in HTML4, HTML5 and XHTML5.
  </p>
  <p>
   This section has been prepared by Manu Sporny (President/CEO of Digital
   Bazaar, Inc.) in consultation with key members of the 
   RDFa in XHTML Task Force, the HTML WG, the WHAT WG, and other 
   interested parties.
  </p>
  <h3>
   Parsing Model
  </h3>
  <p>
   Section 5 of the
   [<a href="http://www.w3.org/TR/rdfa-syntax/">XHTML+RDFa</a>] specification
   defines a generic processing model for extracting RDF from a
   tree-based model. The method of transforming an input document into a
   model suited for the RDFa processing rules is intentionally not defined
   in the XHTML+RDFa specification. That transformation was intended
   to be defined by the host language, which in this case means this section
   of the HTML5 specification.
  </p>
  <p>
   In the context of the HTML5 specification, the parsing rules for an input 
   document in HTML4 and HTML5 are clearly defined. The processing model 
   defined in Section 5 of the XHTML+RDFa specification should be executed 
   on the HTML5 DOM. While the HTML5 DOM is not currently stable, a parser
   built on top of the html5lib library should eventually provide a stable,
   tree-based model for the RDFa processing rules.
  </p>
  <p>
   RDFa's tree-based processing rules enable an input document to be 
   automatically corrected, cleaned-up, re-arranged, or modified in any
   way that is approved by the host language. For example, element nesting 
   issues in HTML documents may be corrected before the input document is 
   serialized into the tree-based model on which the RDFa processing rules 
   will operate.
  </p>
  <h3>
   Conformance Requirements
  </h3>
  <p>
   <em>This section is normative.</em>
  </p>
  <p>
   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
   interpreted as described in [<a class="nref"
      href="#ref_RFC2119">RFC2119</a>].
  </p>
  <p>
   Note that all examples in this document are informative, and are not meant
   to be interpreted as normative requirements.
  </p>
  <h3>
   Document Conformance
  </h3>
  <p>
   In order for a document to claim that it is a conforming HTML+RDFa document,
   it must provide the facilities described as mandatory in this section.
   The document conformance criteria are listed below, of which only a subset
   are mandatory:
  </p>

  <ol>
   <li>
    There should be a DOCTYPE declaration specified prior to the root element
    in the document that follows the conventions outlined in
    "The DOCTYPE" section of the HTML5 specification.
   </li>
   <li>
    The root element of the document must follow the conventions outlined
    in "The root element" section of the HTML5 specification.
   </li>
   <li>
    There may be a <code>link</code> element contained in the 
    <code>head</code> element whose <code>rel</code> attribute contains
    <code>profile</code> and whose <code>href</code> attribute contains
    <code>http://www.w3.org/1999/xhtml/vocab</code>.
    <div class="XXX">
     This requires the HTML5 spec to add <code>profile</code> to the list of
     allowable <code>rel</code> values. This is used as the signalling 
     mechanism for an RDFa document because the <code>profile</code> 
     attribute is deprecated in HTML5.
    </div>
   </li>
  </ol>
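  <p>
   As an informative illustration of the criteria above, a minimal sketch of
   a document that meets all three might look like the following. The title
   and the elided body content are placeholders chosen for this example, and
   the <code>link</code> element assumes that <code>profile</code> is
   accepted as a <code>rel</code> value:
  </p>
  <pre>
   &lt;!DOCTYPE html&gt;
   &lt;html&gt;
    &lt;head&gt;
     &lt;title&gt;Example&lt;/title&gt;
     &lt;link rel="profile" href="http://www.w3.org/1999/xhtml/vocab"&gt;
    &lt;/head&gt;
    &lt;body&gt;
     ...
    &lt;/body&gt;
   &lt;/html&gt;
  </pre>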
  <h3>
   User Agent Conformance
  </h3>
  <p>
   A conforming RDFa user agent must:
  </p>
  <ul>
   <li>
    Conform to all conformance requirements listed in the 
    "Conformance requirements" section of the HTML5 specification.
   </li>
   <li>
    Implement all of the features required in the RDFa section of the 
    HTML5 specification.
   </li>
   <li>
    Implement all of the features specified in the XHTML+RDFa specification, 
    excluding those features which are specifically overridden by the RDFa 
    section of the HTML5 specification.
   </li>
  </ul>
  <h3>
   RDFa Processor Conformance
  </h3>
  <p>
   A conforming RDFa Processor must implement all of the mandatory features 
   specified in the XHTML+RDFa specification. It must also support any
   mandatory features specified in the RDFa section of the HTML5 specification.
  </p>
  <h3>
   Modifications to XHTML+RDFa
  </h3>
  <p>
   <em>This section is normative.</em>
  </p>
  <p>
   The [<a href="http://www.w3.org/TR/rdfa-syntax/">XHTML+RDFa</a>]
   Recommendation is the base document on which this section builds. That
   document specifies the attributes and processing rules for extracting
   RDF from an XHTML document. This section specifies changes to the
   attributes and processing rules defined in XHTML+RDFa in order to
   support extracting RDF from HTML documents.
  </p>

  <h4>Specifying the language for a literal</h4>
  <p>
   The <code>lang</code> attribute must be supported in the same manner 
   as the <code>xml:lang</code> attribute is in the XHTML+RDFa specification.
   The precedence rules for selecting which value overrides the other are
   outlined in the section titled "The lang and xml:lang attributes" in
   the HTML5 specification.
  </p>
  <p>
   If an author is unsure of the final encapsulating DOCTYPE for their
   markup, such as HTML5 vs. XHTML5, it is suggested that the author specify
   both <code>lang</code> and <code>xml:lang</code> where the value in
   both attributes is exactly the same.
  </p>
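  <p>
   As an informative sketch of this suggestion, the following literal carries
   the same language tag in both attributes. The <code>dc</code> prefix, the
   property name and the text content are illustrative only, and a prefix
   mapping for <code>dc</code> is assumed to be declared elsewhere in the
   document:
  </p>
  <pre>
   &lt;span property="dc:title" lang="fr" xml:lang="fr"&gt;Le Petit Prince&lt;/span&gt;
  </pre>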

  <h4>Invalid XMLLiteral values</h4>
  <p>
   When generating literals of type XMLLiteral, the processor must ensure that
   the output XMLLiteral is 
   <a href="http://www.w3.org/TR/xml/#dt-wellformed">well-formed XML</a>. 
   If the input is not well-formed XML, the processor must transform
   the input text in a way that generates well-formed XML.
   <span class="XXX">We need to specify the algorithm for doing so.</span>
  </p>
  <p>
   Transformation to well-formed XML is required because an application
   that consumes XMLLiteral data expects that data to be well-formed.
  </p>
  <p>
   The transformation requirement does not apply to input data that are
   text-only, such as literals that contain a <code>datatype</code> attribute
   with an empty value (<code>""</code>), or input data that contain
   only text nodes.
  </p>

  <h4>The <code>xmlns:</code> attribute</h4>
  <p class="XXX">
   There have been various objections to the usage of the <code>xmlns:</code>
   attribute across all HTML family languages. It is currently unknown whether 
   or not the <code>xmlns:</code> attribute will be supported in HTML5 as it 
   is defined in the 
   [<a href="http://www.w3.org/TR/REC-xml-names/">Namespaces in XML</a>]
   specification. This section assumes deprecation of the <code>xmlns:</code> 
   attribute. The next section provides an alternate mechanism for 
   specifying prefix mappings in addition to the deprecated use of 
   <code>xmlns:</code>.
  </p>
  <p>
   If CURIE prefix name definitions are specified using <code>xmlns:</code>, 
   the definitions must be processed using the rules specified in the
   [<a href="http://www.w3.org/TR/REC-xml-names/">Namespaces in XML</a>]
   Recommendation.
  </p>
  <p>
   Because HTML attribute names are case-insensitive, CURIE prefix names
   declared using the <code>xmlns:</code> attribute-name pattern 
   <code>xmlns:&lt;PREFIX&gt;="&lt;URI&gt;"</code> should be specified
   using only lower-case characters. That is, both the text "xmlns:" and the
   text in "&lt;PREFIX&gt;" should be lower-case only. This is to ensure that 
   prefix mappings are interpreted in the same way between HTML 
   (case-insensitive attribute names) and XHTML (case-sensitive attribute
   names) document types.
  </p>
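  <p>
   As an informative sketch, the following declaration uses only lower-case
   characters in the attribute name, so the prefix is expected to be read
   as <code>foaf</code> in both HTML and XHTML. The FOAF vocabulary and the
   text content are used here purely for illustration:
  </p>
  <pre>
   &lt;div xmlns:foaf="http://xmlns.com/foaf/0.1/"&gt;
    &lt;span property="foaf:name"&gt;Alice&lt;/span&gt;
   &lt;/div&gt;
  </pre>
  <p>
   By contrast, a mixed-case declaration such as <code>xmlns:FOAF</code>
   would be exposed as <code>xmlns:foaf</code> by an HTML parser but would
   declare the prefix <code>FOAF</code> in XHTML, so the same CURIE could
   resolve in one document type and not in the other.
  </p>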

  <h4>The <code>token</code> attribute</h4>
  <div class="XXX">
   Warning: None of the statements regarding the <code>token</code> attribute 
   enjoy consensus in the RDFa Task Force, and they could be removed at
   any point without notice.
  </div>
  <p>
   If authors would like to ensure that their prefix mappings are supported 
   across all XHTML and HTML documents, they should use the <code>token</code> 
   attribute to specify CURIE mapping values.
  </p>
  <p>
   The syntax for the <code>token</code> attribute value is as follows:
  </p>
  <pre>
    token_mappings := 1*(token_mapping *whitespace)
    token_mapping  := token *whitespace '=' *whitespace mapping
    token          := NCName        ; as defined in [<a href="http://www.w3.org/TR/REC-xml-names/#NT-NCName">Namespaces in XML</a>]
    mapping        := irelative-ref ; as defined in [<a href="http://www.ietf.org/rfc/rfc3987">IRI</a>]
    whitespace     := White_Space   ; as defined in the HTML5 Specification under '"White_Space" characters'
  </pre>
  <p>
   For example, the following markup:
  </p>
  <pre>
    &lt;body token="ex=http://example.org/"&gt;
  </pre>
  <p>
   when applied to the following HTML snippet:
  </p>
  <pre>
    &lt;a rel="ex:bar"&gt;
  </pre>
  <p>
   would expand the CURIE value in <code>rel</code> as 
   <code>http://example.org/bar</code>. Similarly, the following markup:
  </p>
  <pre>
    &lt;body token="author=http://example.org/author publisher=http://example.org/publisher"&gt;
  </pre>
  <p>
   when applied to the following HTML snippet:
  </p>
  <pre>
    &lt;a rel="author"&gt;
  </pre>
  <p>
   would expand the CURIE value in <code>rel</code> as <code>http://example.org/author</code>.
  </p>
  <h4>Use of URIs in CURIE-only attribute values</h4>
  <div class="XXX">
   Warning: None of the statements regarding the use of URIs in attribute
   values intended to receive reserved_words, CURIEs or Safe CURIEs, per the 
   XHTML+RDFa specification, enjoy consensus in the RDFa Task Force, 
   and they could be removed at any point without notice.
  </div>
  <p>
   Document authors should not create CURIE prefix mappings for well-known
   URI schemes such as http, ftp, urn and a number of other well-known schemes 
   specified in 
   [<a href="http://www.iana.org/assignments/uri-schemes.html">The IANA URI Schemes Registry</a>], 
   as well as other URI schemes that are 
   commonly used on the Internet. If common URI schemes are used as CURIE
   prefixes, then they may affect triple generation via modifications to the 
   CURIE processing algorithm (described below). The use of common URI schemes
   as CURIE prefixes may result in unexpected substitutions in certain
   markup scenarios.
  </p>
  <p>
   CURIE processing must follow the processing definition specified in 
   the XHTML+RDFa Recommendation with the following modification:
  </p>
  <p>
   If a prefix mapping is not found for text that is given to the CURIE 
   processing algorithm, and the text is an Internationalized Resource 
   Identifier as defined in 
   [<a href="http://www.ietf.org/rfc/rfc3987">IRI</a>], then the expanded
   value of the potential CURIE should be the IRI.
  </p>
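  <p>
   As an informative sketch of this modification, assume that no prefix
   mapping for <code>http</code> is in scope for the following markup. Both
   URLs are hypothetical and were chosen only for this example:
  </p>
  <pre>
   &lt;a rel="http://example.org/vocab#related" href="http://example.org/other"&gt;
  </pre>
  <p>
   Because no mapping is found and the <code>rel</code> value is an IRI,
   the expanded value would be
   <code>http://example.org/vocab#related</code> itself.
  </p>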

--- NEW FILE: rdfa-module.html ---
<!DOCTYPE html><!-- when publishing, change bits marked ZZZ --><html lang=en-US-x-Hixie><head>
  <title>HTML5+RDFa</title>
  <style type=text/css>
 pre { margin-left: 2em; white-space: pre-wrap; }
   h2 { margin: 3em 0 1em 0; }
   h3 { margin: 2.5em 0 1em 0; }
   h4 { margin: 2.5em 0 0.75em 0; }
   h5, h6 { margin: 2.5em 0 1em; }
   h1 + h2, h1 + h2 + h2 { margin: 0.75em 0 0.75em; }
   h2 + h3, h3 + h4, h4 + h5, h5 + h6 { margin-top: 0.5em; }
   p { margin: 1em 0; }
   hr:not(.top) { display: block; background: none; border: none; padding: 0; margin: 2em 0; height: auto; }
   dl, dd { margin-top: 0; margin-bottom: 0; }
   dt { margin-top: 0.75em; margin-bottom: 0.25em; clear: left; }
   dt + dt { margin-top: 0; }
   dd dt { margin-top: 0.25em; margin-bottom: 0; }
   dd p { margin-top: 0; }
   dd dl + p { margin-top: 1em; }
   dd table + p { margin-top: 1em; }
   p + * > li, dd li { margin: 1em 0; }
   dt, dfn { font-weight: bold; font-style: normal; }
   dt dfn { font-style: italic; }
   pre, code { font-size: inherit; font-family: monospace; font-variant: normal; }
   pre strong { color: black; font: inherit; font-weight: bold; background: yellow; }
   pre em { font-weight: bolder; font-style: normal; }
   @media screen { code { color: orangered; } code :link, code :visited { color: inherit; } }
   var sub { vertical-align: bottom; font-size: smaller; position: relative; top: 0.1em; }
   table { border-collapse: collapse; border-style: hidden hidden none hidden; }
   table thead { border-bottom: solid; }
   table tbody th:first-child { border-left: solid; }
   table td, table th { border-left: solid; border-right: solid; border-bottom: solid thin; vertical-align: top; padding: 0.2em; }
   blockquote { margin: 0 0 0 2em; border: 0; padding: 0; font-style: italic; }

   .bad, .bad *:not(.XXX) { color: gray; border-color: gray; background: transparent; }
   .matrix, .matrix td { border: none; text-align: right; }
   .matrix { margin-left: 2em; }
   .dice-example { border-collapse: collapse; border-style: hidden solid solid hidden; border-width: thin; margin-left: 3em; }
   .dice-example caption { width: 30em; font-size: smaller; font-style: italic; padding: 0.75em 0; text-align: left; }
   .dice-example td, .dice-example th { border: solid thin; width: 1.35em; height: 1.05em; text-align: center; padding: 0; }
   .applies th > * { display: block; white-space: nowrap; }
   .applies thead code { display: block; }
   .applies td { text-align: center; }
   .applies .yes { background: yellow; }

   .toc dfn, h1 dfn, h2 dfn, h3 dfn, h4 dfn, h5 dfn, h6 dfn { font: inherit; }
   img.extra { float: right; }
   pre.idl { border: solid thin; background: #EEEEEE; color: black; padding: 0.5em 1em; }
   pre.idl :link, pre.idl :visited { color: inherit; background: transparent; }
   pre.css { border: solid thin; background: #FFFFEE; color: black; padding: 0.5em 1em; }
   pre.css:first-line { color: #AAAA50; }
   dl.domintro { color: green; margin: 2em 0 2em 2em; padding: 0.5em 1em; border: none; background: #EEFFEE; }
   hr + dl.domintro, div.impl + dl.domintro { margin-top: 2.5em; margin-bottom: 1.5em; }
   dl.domintro dt, dl.domintro dt * { color: black; text-decoration: none; }
   dl.domintro dd { margin: 0.5em 0 1em 2em; padding: 0; }
   dl.domintro dd p { margin: 0.5em 0; }
   dl.switch { padding-left: 2em; }
   dl.switch > dt { text-indent: -1.5em; }
   dl.switch > dt:before { content: '\21AA'; padding: 0 0.5em 0 0; display: inline-block; width: 1em; text-align: right; line-height: 0.5em; }
   dl.triple { padding: 0 0 0 1em; }
   dl.triple dt, dl.triple dd { margin: 0; display: inline }
   dl.triple dt:after { content: ':'; }
   dl.triple dd:after { content: '\A'; white-space: pre; }
   .diff-old { text-decoration: line-through; color: silver; background: transparent; }
   .diff-chg, .diff-new { text-decoration: underline; color: green; background: transparent; }
   a .diff-new { border-bottom: 1px blue solid; }

   h2 { page-break-before: always; }
   h1 + h2, hr + h2.no-toc { page-break-before: auto; }

   p > span:not([title=""]):not([class="XXX"]):not([class="impl"]), li > span:not([title=""]):not([class="XXX"]):not([class="impl"]) { border-bottom: solid #9999CC; }

   div.head { margin: 0 0 1em; padding: 1em 0 0 0; }
   div.head p { margin: 0; }
   div.head h1 { margin: 0; }
   div.head .logo { float: right; margin: 0 1em; }
   div.head .logo img { border: none } /* remove border from top image */
   div.head dl { margin: 1em 0; }
   p.copyright { font-size: x-small; font-style: oblique; margin: 0; }

   body > .toc > li { margin-top: 1em; margin-bottom: 1em; }
   body > .toc.brief > li { margin-top: 0.35em; margin-bottom: 0.35em; }
   body > .toc > li > * { margin-bottom: 0.5em; }
   body > .toc > li > * > li > * { margin-bottom: 0.25em; }
   .toc, .toc li { list-style: none; }

   .brief { margin-top: 1em; margin-bottom: 1em; line-height: 1.1; }
   .brief li { margin: 0; padding: 0; }
   .brief li p { margin: 0; padding: 0; }

   .category-list { margin-top: -0.75em; margin-bottom: 1em; line-height: 1.5; }
   .category-list::before { content: '\21D2\A0'; font-size: 1.2em; font-weight: 900; }
   .category-list li { display: inline; }
   .category-list li:not(:last-child)::after { content: ', '; }
   .category-list li > span, .category-list li > a { text-transform: lowercase; }
   .category-list li * { text-transform: none; } /* don't affect <code> nested in <a> */

   .XXX { color: #E50000; background: white; border: solid red; padding: 0.5em; margin: 1em 0; }
   .XXX > :first-child { margin-top: 0; }
   p .XXX { line-height: 3em; }
   .note { color: green; background: transparent; font-family: sans-serif; }
   .warning { color: red; background: transparent; }
   .note, .warning { font-weight: bolder; font-style: italic; }
   p.note, div.note { padding: 0.5em 2em; }
   span.note { padding: 0 2em; }
   .note p:first-child, .warning p:first-child { margin-top: 0; }
   .note p:last-child, .warning p:last-child { margin-bottom: 0; }
   .warning:before { font-style: normal; }
   p.note:before { content: 'Note: '; }
   p.warning:before { content: '\26A0 Warning! '; }

   .bookkeeping:before { display: block; content: 'Bookkeeping details'; font-weight: bolder; font-style: italic; }
   .bookkeeping { font-size: 0.8em; margin: 2em 0; }
   .bookkeeping p { margin: 0.5em 2em; display: list-item; list-style: square; }

   h4 { position: relative; z-index: 3; }
   h4 + .element, h4 + div + .element { margin-top: -2.5em; padding-top: 2em; }
   .element {
     background: #EEEEFF;
     color: black;
     margin: 0 0 1em 0.15em;
     padding: 0 1em 0.25em 0.75em;
     border-left: solid #9999FF 0.25em;
     position: relative;
     z-index: 1;
   }
   .element:before {
     position: absolute;
     z-index: 2;
     top: 0;
     left: -1.15em;
     height: 2em;
     width: 0.9em;
     background: #EEEEFF;
     content: ' ';
     border-style: none none solid solid;
     border-color: #9999FF;
     border-width: 0.25em;
   }

   .example {
     display: block;
     color: #222222;
     background: #FCFCFC;
     border-left: double;
     margin-left: 2em;
     padding-left: 1em;
   }

   .tall-and-narrow {
     font-size: 0.6em;
     column-width: 25em;
     column-gap: 1em;
     -moz-column-width: 25em;
     -moz-column-gap: 1em;
     -webkit-column-width: 25em;
     -webkit-column-gap: 1em;
   }

   ul.domTree, ul.domTree ul { padding: 0 0 0 1em; margin: 0; }
   ul.domTree li { padding: 0; margin: 0; list-style: none; position: relative; }
   ul.domTree li li { list-style: none; }
   ul.domTree li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
   ul.domTree li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
   ul.domTree span { font-style: italic; font-family: serif; }
   ul.domTree .t1 code { color: purple; font-weight: bold; }
   ul.domTree .t2 { font-style: normal; font-family: monospace; }
   ul.domTree .t2 .name { color: black; font-weight: bold; }
   ul.domTree .t2 .value { color: blue; font-weight: normal; }
   ul.domTree .t3 code, .domTree .t4 code, .domTree .t5 code { color: gray; }
   ul.domTree .t7 code, .domTree .t8 code { color: green; }
   ul.domTree .t10 code { color: teal; }

  </style>
  <link href=data:text/css, rel=stylesheet title="Complete specification" type=text/css>
  <link href=data:text/css,.impl%20{%20display:%20none;%20} rel="alternate stylesheet" title="Author documentation only">
  <link href=data:text/css,.impl%20{%20background:%20%23FFEEEE;%20} rel="alternate stylesheet" title="Highlight implementation requirements">
  <link href=http://www.w3.org/StyleSheets/TR/W3C-ED rel=stylesheet type=text/css><!-- ZZZ ED vs WD -->
 </head>
 <body>
  <div class=head>
   <p>
    <a href=http://www.w3.org/><img alt=W3C height=48 src=http://www.w3.org/Icons/w3c_home width=72></a>
   </p>
   <h1>HTML5+RDFa</h1>
   <h2 class="no-num no-toc" id=a-mechanism-for-embedding-rdf-in-html>
    A mechanism for embedding RDF in HTML
   </h2>
   <h2 class="no-num no-toc" id=editor-s-draft-date-1-january-1970>Editor's Draft 13 July 2009</h2>
    <!--:ZZZ-->
   <dl>
    <!-- ZZZ: update the month/day (twice), (un)comment out
    <dt>This Version:</dt>
    <dd><a href="http://www.w3.org/TR/2009/WD-html5-20090423/">http://www.w3.org/TR/2009/WD-html5-20090423/</a></dd>
 :ZZZ -->
    <dt>
     Latest Published Version:
    </dt>
    <dd>
     Not published
    </dd>
    <dt>
     Latest Editor's Draft:
    </dt>
    <dd>
     <a href=http://dev.w3.org/public/source/html5/rdfa/rdfa-module.html>http://dev.w3.org/public/source/html5/rdfa/rdfa-module.html</a>
    </dd><!-- ZZZ: add the new version after it has shipped -->
    <dt>
     Previous Versions:
    </dt>
    <dd>
     None
    </dd><!-- :ZZZ -->
    <dt>
     Contributors (alphabetical order):
    </dt>
    <dd>
     Ben Adida (Chair, Creative Commons)
    </dd>
    <dd>
     Mark Birbeck (Editor, RDFa Core and inventor of RDFa concept, Web
     Backplane Ltd.)
    </dd>
    <dd>
     Shane McCarron (Editor, RDFa Core, Applied Testing and Technology, Inc.)
    </dd>
    <dd>
     Steven Pemberton (Chair, XHTML2, CWI)
    </dd>
    <dd>
     <a href=mailto:msporny@digitalbazaar.com>Manu Sporny</a>, (Editor, HTML5+RDFa, Digital Bazaar, Inc.)
    </dd>
   </dl>
   <p class=copyright>
    <a href=http://www.w3.org/Consortium/Legal/ipr-notice#Copyright>Copyright</a> © 2009 <a href=http://www.w3.org/><abbr title="World Wide Web
    Consortium">W3C</abbr></a><sup>®</sup> (<a href=http://www.csail.mit.edu/><abbr title="Massachusetts Institute of 
     Technology">MIT</abbr></a>, <a href=http://www.ercim.org/><abbr title="European Research Consortium for Informatics and
    Mathematics">ERCIM</abbr></a>, <a href=http://www.keio.ac.jp/>Keio</a>), All Rights Reserved. W3C <a href=http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer>liability</a>,
    <a href=http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks>trademark</a>
    and <a href=http://www.w3.org/Consortium/Legal/copyright-documents>document use</a>
    rules apply.
   </p>
  </div>
  <hr class=top>
  <h2 class="no-num no-toc" id=abstract>
   Abstract
  </h2>
  <p>
   This specification defines rules and guidelines for adapting the RDFa 
   in XHTML specification (XHTML+RDFa) for use in the HTML5 and XHTML5 members
   of the HTML family. The rules defined in this document apply not only
   to HTML5 documents, but also to HTML4 documents interpreted through the 
   HTML5 parsing rules.
  </p>
  <h2 class="no-num no-toc" id=status-of-this-document>
   Status of this document
  </h2><!-- intro boilerplate (required) -->
  <p>
   <em>This section describes the status of this document at the time of its
   publication. Other documents may supersede this document. A list of
   current W3C publications and the most recently formally published revision
   of this technical report can be found in the <a href=http://www.w3.org/TR/>W3C technical reports index</a> at
   http://www.w3.org/TR/.</em>
  </p>
  <p>
   If you wish to make comments regarding this document, please send them to
   <a href=mailto:public-rdf-in-xhtml-tf@w3.org>public-rdf-in-xhtml-tf@w3.org</a>
   (<a href="mailto:public-rdf-in-xhtml-tf-request@w3.org?subject=subscribe">subscribe</a>,
   <a href=http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/>archives</a>).
  </p>
  <!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST -->
  <!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- stability (required) -->
  <p>
   Implementors should be aware that this specification is not stable.
   <strong>Implementors who are not taking part in the discussions are likely
   to find the specification changing out from under them in incompatible
   ways.</strong> Vendors interested in implementing this specification
   before it eventually reaches the Candidate Recommendation stage should
   join the aforementioned mailing lists and take part in the discussions.
  </p><!-- not everyone agrees with html5 (requested before fpwd) -->
  <p>
   The publication of this document by the W3C as a W3C Working Draft does
   not imply that all of the participants in the W3C HTML working group
   endorse the contents of the specification. Indeed, for any section of the
   specification, one can usually find many members of the working group or
   of the W3C as a whole who object strongly to the current text, the
   existence of the section at all, or the idea that the working group should
   even spend time discussing the concept of that section.
  </p>
  <!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- version history or list of changes (required) -->
  <p>
   The latest stable version of the editor's draft of this specification is
   always available on <a href=http://dev.w3.org/html5/rdfa/rdfa-module.html>the W3C CVS server</a>. 
   The <a href=http://dev.w3.org/html5/rdfa/rdfa>latest editor's working copy</a> (which may contain unfinished text in the process of
   being prepared) is also available.
  </p>
  <!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING LIST TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- status of document, group responsible (required) -->
  <p>
   The W3C <a href=http://www.w3.org/html/wg/>HTML Working Group</a> is the
   W3C working group responsible for this specification's progress along the
   W3C Recommendation track. 
  </p>
  <!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- context and rationale (required) -->
  <p>
   This specification is intended to be included in the HTML5 specification
   as a section of the overall specification.
  </p>
  <!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- required patent boilerplate -->
  <p>
   This document was produced by a group operating under the <a href=http://www.w3.org/Consortium/Patent-Policy-20040205/>5 February 2004 W3C
   Patent Policy</a>. W3C maintains a <a href=http://www.w3.org/2004/01/pp-impl/40318/status rel=disclosure>public list of any patent disclosures</a> made in
      connection with the deliverables of the group; that page also includes
      instructions for disclosing a patent. An individual who has actual
      knowledge of a patent which the individual believes contains <a href=http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential>Essential
      Claim(s)</a> must disclose the information in accordance with <a href=http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure>section
      6 of the W3C Patent Policy</a>.
  </p>
  <h2 id=rdfa><span class=secno>1 </span>
   <dfn>RDFa</dfn>
  </h2>
  <h3 id=issues><span class=secno>1.1 </span>
   Issues
  </h3>
  <p class=XXX>
    This section outlines a number of editorial issues with the RDFa 
    section of the HTML5 specification.
  </p>
  <p class=XXX>
   In order to provide a module that can be authored, inserted and moved
   easily within the HTML5 specification, the RDFa specification section is 
   being edited separately from the main HTML5 specification source file.
   There are two documents that are generated from the RDFa specification 
   source. The first is the full HTML5 specification, which includes the 
   RDFa specification section. The second is the stand-alone HTML5+RDFa 
   document.
  </p>
  <p class=XXX>
   The upside to having two documents generated from the same source mainly
   has to do with load times for the HTML5 specification in web browsers.
   Loading the 4MB HTML5 specification can be very slow, even in Firefox 
   3.5 or Chrome. For readers who only want to look at the RDFa specification
   text, there is a much smaller, separate document for that purpose.
  </p>
  <p class=XXX>
   Unfortunately, this approach has a number of downsides. The
   first is that the specification language becomes more verbose. The second
   is that cross-references within the HTML5 document are impossible due to
   a bug/feature in the Anolis specification processor. 
  </p><p class=XXX>
   These downsides are not ideal and will eventually be remedied as we find
   a way to either fix Anolis or integrate the RDFa document into the HTML5
   specification.
  </p>
  <h3 id=introduction><span class=secno>1.2 </span>
   Introduction
  </h3>
  <p>
   <em>This section is informative.</em>
  </p>
  <p>
   Starting in 2006, the Semantic Web Deployment Working Group
   began work to develop a technology to express semantic data in 
   XHTML 1.1. This technology was successfully developed and is now
   called RDFa (Resource Description Framework in Attributes). 
   While HTML provides a mechanism to express the structure of a document
   (title, paragraphs, links), RDFa provides a mechanism to express 
   the meaning of a document (people, places, events).
  </p>
  <p>
   The document, titled "RDFa in XHTML: Syntax and Processing Rules" 
   [<a href=http://www.w3.org/TR/rdfa-syntax/>XHTML+RDFa</a>], defined
   a set of attributes and rules for processing those attributes that
   resulted in the output of machine-readable semantic data. While the
   document was specific to the XHTML 1.1 member of the HTML family, the
   attributes and rules were always intended to operate across any 
   tree-based structure containing attributes on tree nodes (such as HTML4, 
   SVG and ODF).
  </p>
  <p>
   While RDFa was initially specified for use in XHTML 1.1, adoption by
   a number of large organizations on the Web spurred RDFa's use in non-XHTML
   languages. Its use in HTML4 and HTML5, before an official specification 
   was developed for those languages, caused concern regarding document
   conformance.
  </p>
  <p>
   Over the years, the members of the RDFa Task Force 
   [<a href=http://rdfa.info/>RDFaTF</a>] had discussed the possibility 
   of applying the same attributes and processing rules outlined in the 
   XHTML+RDFa specification to all HTML family documents. By design, a 
   unified mechanism for expressing semantic data across all HTML and 
   XHTML family documents was always feasible.
  </p>
  <p>
   This section describes the modifications to the original XHTML+RDFa
   specification that permit the use of RDFa in all HTML family documents.
   By using the attributes and processing rules described in the 
   XHTML+RDFa specification and heeding the minor changes in this 
   section, authors can expect to generate markup that produces the same
   semantic data output in HTML4, HTML5 and XHTML5.
  </p>
  <p>
   This section has been prepared by Manu Sporny (President/CEO of Digital
   Bazaar, Inc.) in consultation with key members of the 
   RDFa in XHTML Task Force, the HTML WG, the WHAT WG, and other 
   interested parties.
  </p>
  <h3 id=parsing-model><span class=secno>1.3 </span>
   Parsing Model
  </h3>
  <p>
   Section 5 of the
   [<a href=http://www.w3.org/TR/rdfa-syntax/>XHTML+RDFa</a>] specification
   defines a generic processing model for extracting RDF from a
   tree-based model. The method of transforming an input document into a
   model suited for the RDFa processing rules is intentionally not defined
   in the XHTML+RDFa specification. The method of transformation was intended
   to be defined by the host language, which in this case is this section of
   the HTML5 specification.
  </p>
  <p>
   In the context of the HTML5 specification, the parsing rules for an input 
   document in HTML4 and HTML5 are clearly defined. The processing model 
   defined in Section 5 of the XHTML+RDFa specification should be executed 
   on the HTML5 DOM. While the HTML5 DOM is not currently stable, a parsing 
   mechanism built on top of the html5lib library should eventually provide 
   a stable, tree-based model for the RDFa processing rules.
  </p>
  <p>
   RDFa's tree-based processing rules enable an input document to be 
   automatically corrected, cleaned up, re-arranged, or modified in any
   way that is permitted by the host language. For example, element nesting 
   issues in HTML documents may be corrected before the input document is 
   converted into the tree-based model on which the RDFa processing rules 
   will operate.
  </p>
  <h3 id=conformance-requirements><span class=secno>1.4 </span>
   Conformance Requirements
  </h3>
  <p>
   <em>This section is normative.</em>
  </p>
  <p>
   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
   interpreted as described in [<a class=nref href=#ref_RFC2119>RFC2119</a>].
  </p>
  <p>
   Note that all examples in this document are informative, and are not meant
   to be interpreted as normative requirements.
  </p>
  <h3 id=document-conformance><span class=secno>1.5 </span>
   Document Conformance
  </h3>
  <p>
   In order for a document to claim that it is a conforming HTML+RDFa document,
   it must provide the facilities described as mandatory in this section.
   The document conformance criteria are listed below; only a subset of them
   is mandatory:
  </p>

  <ol>
   <li>
    There should be a DOCTYPE declaration specified prior to the root element
    in the document that follows the conventions outlined in
    "The DOCTYPE" section of the HTML5 specification.
   </li>
   <li>
    The root element of the document must follow the conventions outlined
    in "The root element" section of the HTML5 specification.
   </li>
   <li>
    There may be a <code>link</code> element contained in the 
    <code>head</code> element whose <code>rel</code> attribute contains
    <code>profile</code> and whose <code>href</code> attribute contains
    <code>http://www.w3.org/1999/xhtml/vocab</code>.
    <div class=XXX>
     This requires the HTML5 spec to add <code>profile</code> to the list of
     allowable <code>rel</code> values. This is used as the signalling 
     mechanism for an RDFa document because the <code>profile</code> 
     attribute is deprecated in HTML5.
    </div>
   </li>
  </ol>
  <h3 id=user-agent-conformance><span class=secno>1.6 </span>
   User Agent Conformance
  </h3>
  <p>
   A conforming RDFa user agent must:
   </p><ul>
   <li>
    Conform to all conformance requirements listed in the 
    "Conformance requirements" section of the HTML5 specification.
   </li>
   <li>
    Implement all of the features required in the RDFa section of the 
    HTML5 specification.
   </li>
   <li>
    Implement all of the features specified in the XHTML+RDFa specification, 
    excluding those features which are specifically overridden by the RDFa 
    section of the HTML5 specification.
   </li>
   </ul>
  <p></p>
  <h3 id=rdfa-processor-conformance><span class=secno>1.7 </span>
   RDFa Processor Conformance
  </h3>
  <p>
   A conforming RDFa Processor must implement all of the mandatory features 
   specified in the XHTML+RDFa specification. It must also support any
   mandatory features specified in the RDFa section of the HTML5 specification.
  </p>
  <h3 id=modifications-to-xhtml-rdfa><span class=secno>1.8 </span>
   Modifications to XHTML+RDFa
  </h3>
  <p>
   <em>This section is normative.</em>
  </p>
  <p>
   The [<a href=http://www.w3.org/TR/rdfa-syntax/>XHTML+RDFa</a>]
   Recommendation is the base document on which this section builds. That
   document specifies the attributes and processing rules for extracting
   RDF from an XHTML document. This section specifies changes to the
   attributes and processing rules defined in XHTML+RDFa in order to
   support extracting RDF from HTML documents.
  </p>

  <h4 id=specifying-the-language-for-a-literal><span class=secno>1.8.1 </span>Specifying the language for a literal</h4>
  <p>
   The <code>lang</code> attribute must be supported in the same manner 
   as the <code>xml:lang</code> attribute is in the XHTML+RDFa specification.
   The precedence rules for selecting which value overrides the other are
   outlined in the section titled "The lang and xml:lang attributes" in
   the HTML5 specification.
  </p>
  <p>
   If an author is unsure of the final encapsulating DOCTYPE for their
   markup, such as HTML5 vs. XHTML5, it is suggested that the author specify
   both <code>lang</code> and <code>xml:lang</code> where the value in
   both attributes is exactly the same.
  </p>

  <h4 id=invalid-xmlliteral-values><span class=secno>1.8.2 </span>Invalid XMLLiteral values</h4>
  <p>
   When generating literals of type XMLLiteral, the processor must ensure that
   the output XMLLiteral is 
   <a href=http://www.w3.org/TR/xml/#dt-wellformed>well-formed XML</a>. 
   If the input is not well-formed XML, the processor must transform
   the input text in a way that generates well-formed XML.
   <span class=XXX>We need to specify the algorithm for doing so.</span>
  </p>
  <p>
   Transformation to well-formed XML is required because an application
   that consumes XMLLiteral data expects that data to be well-formed.
  </p>
  <p>
   The transformation requirement does not apply to input data that are
   text-only, such as literals that contain a <code>datatype</code> attribute
   with an empty value (<code>""</code>), or input data that contain
   only text nodes.
  </p>

  <h4 id=the-xmlns:-attribute><span class=secno>1.8.3 </span>The <code>xmlns:</code> attribute</h4>
  <p class=XXX>
   There have been various objections to the usage of the <code>xmlns:</code> 
   attribute across all HTML family languages. It is currently unknown whether 
   or not the <code>xmlns:</code> attribute will be supported in HTML5 as it 
   is defined in the 
   [<a href=http://www.w3.org/TR/REC-xml-names/>Namespaces in XML</a>]
   specification. This section assumes deprecation of the <code>xmlns:</code> 
   attribute. The next section provides an alternate mechanism for 
   specifying prefix mappings in addition to the deprecated use of 
   <code>xmlns:</code>.
  </p>
  <p>
   If CURIE prefix name definitions are specified using <code>xmlns:</code>, 
   the definitions must be processed using the rules specified in the
   [<a href=http://www.w3.org/TR/REC-xml-names/>Namespaces in XML</a>]
   Recommendation.
  </p>
  <p>
   Because HTML attribute names are case-insensitive, CURIE prefix names 
   declared using the <code>xmlns:</code> attribute-name pattern 
   <code>xmlns:&lt;PREFIX&gt;="&lt;URI&gt;"</code> should be specified
   using only lower-case characters. That is, both the text "xmlns:" and the
   text in "&lt;PREFIX&gt;" should be lower-case only. This is to ensure that 
   prefix mappings are interpreted in the same way between HTML 
   (case-insensitive attribute names) and XHTML (case-sensitive attribute
   names) document types.
  </p>
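  <p>
   As an informative illustration of why mixed-case declarations are 
   fragile, the following Python sketch collects prefix mappings from a list 
   of attribute name/value pairs. The function name 
   <code>collect_prefixes</code> and its <code>case_insensitive_names</code> 
   flag are hypothetical and are not defined by any specification. The sketch 
   only models an HTML parser lower-casing attribute names, and it assumes 
   that prefixes are later compared case-sensitively.
  </p>

```python
XMLNS = "xmlns:"


def collect_prefixes(attributes, case_insensitive_names):
    """Collect CURIE prefix mappings from (name, value) attribute pairs.

    case_insensitive_names=True models an HTML parser, which lower-cases
    attribute names; False models XHTML, where names are kept as written.
    """
    mappings = {}
    for name, value in attributes:
        if case_insensitive_names:
            name = name.lower()
        # Skip the bare "xmlns:" name, which carries no prefix.
        if name.startswith(XMLNS) and name != XMLNS:
            mappings[name[len(XMLNS):]] = value
    return mappings
```

  <p>
   Under these assumptions, a declaration written as <code>xmlns:FOAF</code> 
   yields the prefix <code>foaf</code> in the HTML-style pass and 
   <code>FOAF</code> in the XHTML-style pass, so a CURIE written with one 
   spelling would resolve in only one of the two. A lower-case declaration 
   yields the same mapping in both.
  </p>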

  <h4 id=the-token-attribute><span class=secno>1.8.4 </span>The <code>token</code> attribute</h4>
  <p>
   </p><div class=XXX>
    Warning: The statements regarding the <code>token</code> attribute 
    do not enjoy consensus in the RDFa Task Force and could be removed at
    any point without notice.
   </div>
   If authors would like to ensure that their prefix mappings are supported 
   across all XHTML and HTML documents, they should use the <code>token</code> 
   attribute to specify CURIE mapping values.
  <p></p>
  <p>
   The syntax for the <code>token</code> attribute value is as follows:
   </p><pre>    token_mappings := 1*(token_mapping *whitespace)
    token_mapping  := token *whitespace '=' *whitespace mapping
    token          := NCName        ; as defined in [<a href=http://www.w3.org/TR/REC-xml-names/#NT-NCName>Namespaces in XML</a>]
    mapping        := irelative-ref ; as defined in [<a href=http://www.ietf.org/rfc/rfc3987>IRI</a>]
    whitespace     := White_Space   ; as defined in the HTML5 Specification under '"White_Space" characters'
   </pre>
   For example, the following markup:
   <pre>    &lt;body token="ex=http://example.org/"&gt;
   </pre>
   when applied to the following HTML snippet:
   <pre>    &lt;a rel="ex:bar"&gt;
   </pre>
   would expand the CURIE value in <code>rel</code> as 
   <code>http://example.org/bar</code>. Similarly, the following markup:
   <pre>    &lt;body token="author=http://example.org/author publisher=http://example.org/publisher"&gt;
   </pre>
   when applied to the following HTML snippet:
   <pre>    &lt;a rel="author"&gt;
   </pre>
   would expand the CURIE value in <code>rel</code> as <code>http://example.org/author</code>.
  <p></p>
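  <p>
   The two examples above can be reproduced with the following informative 
   Python sketch. The function names <code>parse_token_mappings</code> and 
   <code>expand</code> are hypothetical, the regular expression is a 
   simplified stand-in for the NCName production, and the mapping is taken 
   to be any run of non-whitespace characters rather than being validated 
   against the IRI grammar. Treating a bare token such as 
   <code>author</code> as a whole-value replacement is an assumption drawn 
   from the second example.
  </p>

```python
import re

# Simplified stand-in for NCName: a letter or underscore followed by
# word characters, periods or hyphens. Assumption: the full NCName
# production is broader than this pattern.
_TOKEN_MAPPING = re.compile(r"([A-Za-z_][\w.\-]*)\s*=\s*(\S+)")


def parse_token_mappings(value):
    """Parse a token attribute value into a {token: mapping} dictionary."""
    return {m.group(1): m.group(2) for m in _TOKEN_MAPPING.finditer(value)}


def expand(value, mappings):
    """Expand a rel-style value with the parsed mappings, or return None."""
    prefix, colon, reference = value.partition(":")
    if colon and prefix in mappings:
        # Prefixed form, e.g. "ex:bar".
        return mappings[prefix] + reference
    # Bare token form, e.g. "author".
    return mappings.get(value)
```

  <p>
   The pattern tolerates whitespace on either side of the equals sign, as 
   the grammar permits, and returns <code>None</code> when no mapping 
   applies so that a caller can decide how to handle unmapped values.
  </p>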
  <h4 id=use-of-uris-in-curie-only-attribute-values><span class=secno>1.8.5 </span>Use of URIs in CURIE-only attribute values</h4>
  <p>
   </p><div class=XXX>
    Warning: The statements regarding the use of URIs in attribute values
    intended to receive reserved_words, CURIEs or Safe CURIEs, per the 
    XHTML+RDFa specification, do not enjoy consensus in the RDFa Task Force 
    and could be removed at any point without notice.
   </div>
  <p>
   Document authors should not create CURIE prefix mappings for well-known
   URI schemes such as http, ftp and urn, for other schemes listed in 
   [<a href=http://www.iana.org/assignments/uri-schemes.html>The IANA URI Schemes Registry</a>], 
   or for other URI schemes that are commonly used on the Internet. 
   Because the CURIE processing algorithm is modified as described below, 
   using a common URI scheme as a CURIE prefix may affect triple generation 
   and may result in unexpected substitutions in certain markup scenarios.
  </p>
  <p>
   CURIE processing must follow the processing definition specified in 
   the XHTML+RDFa Recommendation with the following modification:
  </p>
  <p>
   If a prefix mapping is not found for text that is given to the CURIE 
   processing algorithm, and the text is an Internationalized Resource 
   Identifier as defined in 
   [<a href=http://www.ietf.org/rfc/rfc3987>IRI</a>], then the expanded
   value of the potential CURIE should be the IRI.
   
  </p>
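  <p>
   The modification above can be sketched informally in Python as follows. 
   The function name <code>expand_curie</code> is hypothetical. The test for 
   "is an IRI" is approximated by checking for a leading scheme followed by 
   a colon, which is a simplification of the grammar in the IRI 
   specification and not a full validation. One consequence of this 
   approximation is that any unmapped <code>prefix:reference</code> value 
   also passes the test and is returned unchanged.
  </p>

```python
import re

# Approximation (assumption): text counts as an IRI if it starts with a
# scheme, i.e. a letter followed by letters, digits, "+", "." or "-",
# and then a colon.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def expand_curie(text, prefix_map):
    """Expand text as a CURIE, falling back to the text itself if it is an IRI."""
    prefix, colon, reference = text.partition(":")
    if colon and prefix in prefix_map:
        # A prefix mapping was found, so normal CURIE expansion applies.
        return prefix_map[prefix] + reference
    if _SCHEME.match(text):
        # No mapping was found and the text looks like an IRI: use it as-is.
        return text
    return None
```

  <p>
   The sketch also shows the substitution hazard described earlier in this 
   section: if an author maps the prefix <code>http</code>, a value such as 
   <code>http://example.org/page</code> is expanded through that mapping 
   instead of being kept as an IRI.
  </p>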
</body></html>
Received on Sunday, 12 July 2009 18:36:10 UTC