- From: poot <cvsmail@w3.org>
- Date: Sun, 14 Jun 2009 09:21:29 +0900 (JST)
- To: public-html-diffs@w3.org
Strip the URLs section out now that DanC is editing the Web Addresses draft. (whatwg r3245)

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.2392&r2=1.2393&f=h
http://html5.org/tools/web-apps-tracker?from=3244&to=3245

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.2392
retrieving revision 1.2393
diff -u -d -r1.2392 -r1.2393
--- Overview.html 13 Jun 2009 23:51:38 -0000 1.2392
+++ Overview.html 14 Jun 2009 00:21:09 -0000 1.2393
@@ -152,7 +152,7 @@
 <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
 <!--ZZZ:-->
 <!--<h2 class="no-num no-toc">W3C Working Draft 23 April 2009</h2>-->
- <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 13 June 2009</h2>
+ <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 14 June 2009</h2>
 <!--:ZZZ-->
 <dl><!-- ZZZ: update the month/day (twice), (un)comment out
 <dt>This Version:</dt>
@@ -245,7 +245,7 @@
 track.
 <!--ZZZ:-->
 <!--This specification is the 23 April 2009 Working Draft.-->
- This specification is the 13 June 2009 Editor's Draft.
+ This specification is the 14 June 2009 Editor's Draft.
 <!--:ZZZ-->
 </p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>This specification is also being produced by the <a href="http://www.whatwg.org/">WHATWG</a>.
 The two specifications are identical from the table of contents onwards.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- context and rationale (required) --><p>This specification is intended to replace (be a new version of)
@@ -327,10 +327,8 @@
 <li><a href="#urls"><span class="secno">2.5 </span>URLs</a>
 <ol>
 <li><a href="#terminology-0"><span class="secno">2.5.1 </span>Terminology</a></li>
- <li><a href="#parsing-urls"><span class="secno">2.5.2 </span>Parsing URLs</a></li>
- <li><a href="#resolving-urls"><span class="secno">2.5.3 </span>Resolving URLs</a></li>
- <li><a href="#dynamic-changes-to-base-urls"><span class="secno">2.5.4 </span>Dynamic changes to base URLs</a></li>
- <li><a href="#interfaces-for-url-manipulation"><span class="secno">2.5.5 </span>Interfaces for URL manipulation</a></ol></li>
+ <li><a href="#dynamic-changes-to-base-urls"><span class="secno">2.5.2 </span>Dynamic changes to base URLs</a></li>
+ <li><a href="#interfaces-for-url-manipulation"><span class="secno">2.5.3 </span>Interfaces for URL manipulation</a></ol></li>
 <li><a href="#fetching-resources"><span class="secno">2.6 </span>Fetching resources</a>
 <ol>
 <li><a href="#concept-http-equivalent"><span class="secno">2.6.1 </span>Protocol concepts</a></li>
@@ -3905,367 +3903,45 @@
 maybe they just don't know about combining dot above?
--> - </ol></div><h3 id="urls"><span class="secno">2.5 </span>URLs</h3><p>This specification defines the term <a href="#url">URL</a>, and defines - various algorithms for dealing with URLs, because for historical - reasons the rules defined by the URI and IRI specifications are not - a complete description of what HTML user agents need to implement to - be compatible with Web content.<h4 id="terminology-0"><span class="secno">2.5.1 </span>Terminology</h4><p>A <dfn id="url">URL</dfn> is a string used to identify a resource.<p>A <a href="#url">URL</a> is a <dfn id="valid-url">valid URL</dfn> if at least one of - the following conditions holds:<ul><li><p>The <a href="#url">URL</a> is a valid URI reference <a href="#references">[RFC3986]</a>.</li> - - <li><p>The <a href="#url">URL</a> is a valid IRI reference and it has no - query component. <a href="#references">[RFC3987]</a></li> - - <li><p>The <a href="#url">URL</a> is a valid IRI reference and its query - component contains no unescaped non-ASCII characters. <a href="#references">[RFC3987]</a></li> - - <li><p>The <a href="#url">URL</a> is a valid IRI reference and the <a href="#document-s-character-encoding" title="document's character encoding">character encoding</a> of - the URL's <code>Document</code> is UTF-8 or UTF-16. <a href="#references">[RFC3987]</a></li> - - </ul><div class="impl"> - - <p>A <a href="#url">URL</a> has an associated <dfn id="url-character-encoding">URL character - encoding</dfn>, determined as follows:</p> - - <dl class="switch"><dt>If the URL came from a script (e.g. 
as an argument to a - method)</dt> - - <dd>The URL character encoding is the <a href="#script-s-url-character-encoding">script's URL character - encoding</a>.</dd> + </ol></div><h3 id="urls"><span class="secno">2.5 </span>URLs</h3><h4 id="terminology-0"><span class="secno">2.5.1 </span>Terminology</h4><p>A <dfn id="url">URL</dfn> is a string used to identify a resource.<p>A <a href="#url">URL</a> is a <dfn id="valid-url">valid URL</dfn> if it is a + <span>valid Web address</span> as defined by the Web addresses + specification. <a href="#references">[WEBADDRESSES]</a><p>A <a href="#url">URL</a> is an <dfn id="absolute-url">absolute URL</dfn> if it is an + <span>absolute Web address</span> as defined by the Web addresses + specification. <a href="#references">[WEBADDRESSES]</a><div class="impl"> - <dt>If the URL came from a DOM node (e.g. from an element)</dt> + <p>To <dfn id="parse-a-url">parse a URL</dfn> <var title="">url</var> into its + component parts, the user agent must use the <span>parse a Web + address</span> algorithm defined by the Web addresses + specification. 
<a href="#references">[WEBADDRESSES]</a></p> - <dd>The node has a <code>Document</code>, and the URL character - encoding is the <a href="#document-s-character-encoding">document's character encoding</a>.</dd> + <p>Parsing a URL results in the following components, again as + defined by the Web addresses specification:</p> - <dt>If the URL had a character encoding defined when the URL was - created or defined</dt> + <ul class="brief"><li><dfn id="url-scheme" title="url-scheme"><scheme></dfn></li> + <li><dfn id="url-host" title="url-host"><host></dfn></li> + <li><dfn id="url-port" title="url-port"><port></dfn></li> + <li><dfn id="url-hostport" title="url-hostport"><hostport></dfn></li> + <li><dfn id="url-path" title="url-path"><path></dfn></li> + <li><dfn id="url-query" title="url-query"><query></dfn></li> + <li><dfn id="url-fragment" title="url-fragment"><fragment></dfn></li> + <li><dfn id="url-host-specific" title="url-host-specific"><host-specific></dfn></li> + </ul><p>To <dfn id="resolve-a-url">resolve a URL</dfn> to an <a href="#absolute-url">absolute URL</a> + relative to either another <a href="#absolute-url">absolute URL</a> or an element, + the user agent must use the <span>resolve a Web address</span> + algorithm defined by the Web addresses specification. <a href="#references">[WEBADDRESSES]</a></p> - <dd>The URL character encoding is as defined.</dd> + <p>The <dfn id="document-base-url">document base URL</dfn> of a <code>Document</code> + object is the <span>document base Web address</span> as defined by + the Web addresses specification. <a href="#references">[WEBADDRESSES]</a></p> - </dl><p class="note">The term "URL" in this specification is used in a + </div><p class="note">The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. 
Readers familiar with that RFC will find it easier to read <em>this</em> specification if they pretend the term "URL" as used herein is really called something else altogether. This is a - <a href="#willful-violation">willful violation</a> of RFC 3986. <a href="#references">[RFC3986]</a></p> - - </div><div class="impl"> - - <h4 id="parsing-urls"><span class="secno">2.5.2 </span>Parsing URLs</h4> - - <p>To <dfn id="parse-a-url">parse a URL</dfn> <var title="">url</var> into its - component parts, the user agent must use the following steps:</p> - - <ol><li><p>Strip leading and trailing <a href="#space-character" title="space - character">space characters</a> from <var title="">url</var>.</li> - - <li> - - <p>Parse <var title="">url</var> in the manner defined by RFC - 3986, with the following exceptions:</p> - - <ul><li>Add all characters with code points less than or equal to - U+0020 or greater than or equal to U+007F to the - <unreserved> production.</li> - - <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E, - U+0060, and U+007B .. U+007D to the <unreserved> - production. - <!-- - 0022 QUOTATION MARK - 003C LESS-THAN SIGN - 003E GREATER-THAN SIGN - 005B LEFT SQUARE BRACKET - 005C REVERSE SOLIDUS - 005D RIGHT SQUARE BRACKET - 005E CIRCUMFLEX ACCENT - 0060 GRAVE ACCENT - 007B LEFT CURLY BRACKET - 007C VERTICAL LINE - 007D RIGHT CURLY BRACKET - --> - </li> - - <li>Add a single U+0025 PERCENT SIGN character as a second - alternative way of matching the <pct-encoded> production, - except when the <pct-encoded> is used in the - <reg-name> production.</li> - - <li>Add the U+0023 NUMBER SIGN character to the characters - allowed in the <fragment> production.</li> - - <!-- some browsers also have other differences, e.g. Mozilla - seems to treat ";" as if it was not in sub-delims, if the scheem - is "ftp". 
--> - - </ul></li> - - <li> - - <p>If <var title="">url</var> doesn't match the - <URI-reference> production, even after the above changes are - made to the ABNF definitions, then parsing the URL fails with an - error. <a href="#references">[RFC3986]</a></p> - - <p>Otherwise, parsing <var title="">url</var> was successful; the - components of the URL are substrings of <var title="">url</var> - defined as follows:</p> - - <dl><dt><dfn id="url-scheme" title="url-scheme"><scheme></dfn></dt> - - <dd><p>The substring matched by the <scheme> production, if any.</dd> - - - <dt><dfn id="url-host" title="url-host"><host></dfn></dt> - - <dd><p>The substring matched by the <host> production, if any.</dd> - - - <dt><dfn id="url-port" title="url-port"><port></dfn></dt> - - <dd><p>The substring matched by the <port> production, if any.</dd> - - - <dt><dfn id="url-hostport" title="url-hostport"><hostport></dfn></dt> - - <dd><p>If there is a <scheme> component and a <port> - component and the port given by the <port> component is - different than the default port defined for the protocol given by - the <scheme> component, then <hostport> is the - substring that starts with the substring matched by the - <host> production and ends with the substring matched by the - <port> production, and includes the colon in between the - two. 
Otherwise, it is the same as the <host> component.</p> - - - <dt><dfn id="url-path" title="url-path"><path></dfn></dt> - - <dd> - - <p>The substring matched by one of the following productions, if - one of them was matched:</p> - - <ul class="brief"><li><path-abempty></li> - <li><path-absolute></li> - <li><path-noscheme></li> - <li><path-rootless></li> - <li><path-empty></li> - </ul></dd> - - - <dt><dfn id="url-query" title="url-query"><query></dfn></dt> - - <dd><p>The substring matched by the <query> production, if any.</dd> - - - <dt><dfn id="url-fragment" title="url-fragment"><fragment></dfn></dt> - - <dd><p>The substring matched by the <fragment> production, if any.</dd> - - - <dt><dfn id="url-host-specific" title="url-host-specific"><host-specific></dfn></dt> - - <dd><p>The substring that <em>follows</em> the substring matched - by the <authority> production, or the whole string if the - <authority> production wasn't matched.</dd> - - </dl></li> - - </ol><p class="note">These parsing rules are a <a href="#willful-violation">willful - violation</a> of RFC 3986 and RFC 3987 (which do not define error - handling), motivated by a desire to handle legacy content. <a href="#references">[RFC3986]</a> <a href="#references">[RFC3987]</a></p> - - </div><div class="impl"> - - <h4 id="resolving-urls"><span class="secno">2.5.3 </span>Resolving URLs</h4> - - <p>To <dfn id="resolve-a-url">resolve a URL</dfn> to an <a href="#absolute-url">absolute URL</a> - relative to either another <a href="#absolute-url">absolute URL</a> or an element, - the user agent must use the following steps. 
Resolving a URL can - result in an error, in which case the URL is not resolvable.</p> - - <ol><li><p>Let <var title="">url</var> be the <a href="#url">URL</a> being - resolved.</li> - - <li><p>Let <var title="">encoding</var> be the <a href="#url-character-encoding">URL character - encoding</a>.</li> - - <li><p>If <var title="">encoding</var> is a UTF-16 encoding, then - change the value of <var title="">encoding</var> to UTF-8.</li> - - <li> - - <p>If the algorithm was invoked with an <a href="#absolute-url">absolute URL</a> - to use as the base URL, let <var title="">base</var> be that - <a href="#absolute-url">absolute URL</a>.</p> - - <p>Otherwise, let <var title="">base</var> be the <i>base URI of - the element</i>, as defined by the XML Base specification, with - <i>the base URI of the document entity</i> being defined as the - <a href="#document-base-url">document base URL</a> of the <code>Document</code> that - owns the element. <a href="#references">[XMLBASE]</a></p> - - <p>For the purposes of the XML Base specification, user agents - must act as if all <code>Document</code> objects represented XML - documents.</p> - - <p class="note">It is possible for <code title="attr-xml-base"><a href="#the-xml:base-attribute-xml-only">xml:base</a></code> attributes to be present - even in HTML fragments, as such attributes can be added - dynamically using script. 
(Such scripts would not be conforming, - however, as <code title="attr-xml-base"><a href="#the-xml:base-attribute-xml-only">xml:base</a></code> attributes - are not allowed in <a href="#html-documents">HTML documents</a>.)</p> - - <p>The <dfn id="document-base-url">document base URL</dfn> of a <code>Document</code> is - the <a href="#absolute-url">absolute URL</a> obtained by running these - substeps:</p> - - <ol><li><p>Let <var title="">fallback base url</var> be <a href="#the-document-s-address">the - document's address</a>.</li> - - <li> - - <!-- http://www.hixie.ch/tests/adhoc/html/navigation/javascript-url/ --> - - <!-- XXX this should be tested in the case of a browsing context - that was navigated to about:blank after having been elsewhere, - as opposed to the about:blank used at the time of the browsing - context's creation. --> - - <p>If <var title="">fallback base url</var> is - <code><a href="#about:blank">about:blank</a></code>, and the <code>Document</code>'s - <a href="#browsing-context">browsing context</a> has a <a href="#creator-browsing-context">creator browsing - context</a>, then let <var title="">fallback base url</var> - be the <a href="#document-base-url">document base URL</a> of the <a href="#creator-document">creator - <code>Document</code></a> instead.</p> - - </li> - - <li><p>If there is no <code><a href="#the-base-element">base</a></code> element that is both a - child of <a href="#the-head-element-0">the <code>head</code> element</a> and has an - <code title="attr-base-href"><a href="#attr-base-href">href</a></code> attribute, then the - <a href="#document-base-url">document base URL</a> is <var title="">fallback base - url</var>.</li> - - <li><p>Otherwise, let <var title="">url</var> be the value of the - <code title="attr-base-href"><a href="#attr-base-href">href</a></code> attribute of the first - such element.</li> - - <li><p><a href="#resolve-a-url" title="resolve a URL">Resolve</a> <var title="">url</var> relative to <var 
title="">fallback base - url</var> (thus, the <code><a href="#the-base-element">base</a></code> <code title="attr-base-href"><a href="#attr-base-href">href</a></code> attribute isn't affected by - <code title="attr-xml-base"><a href="#the-xml:base-attribute-xml-only">xml:base</a></code> attributes).</li> - - <li><p>The <a href="#document-base-url">document base URL</a> is the result of the - previous step if it was successful; otherwise it is <var title="">fallback base url</var>.</li> - - </ol></li> - - <li><p><a href="#parse-a-url" title="parse a URL">Parse</a> <var title="">url</var> into its component parts.</li> - - <li> - - <p>If parsing <var title="">url</var> resulted in a <a href="#url-host" title="url-host"><host></a> component, then replace the - matching substring of <var title="">url</var> with the string that - results from expanding any sequences of percent-encoded octets in - that component that are valid UTF-8 sequences into Unicode - characters as defined by UTF-8.</p> - - <p>If any percent-encoded octets in that component are not valid - UTF-8 sequences, then return an error and abort these steps.</p> - - <p>Apply the IDNA ToASCII algorithm to the matching substring, - with both the AllowUnassigned and UseSTD3ASCIIRules flags - set. Replace the matching substring with the result of the ToASCII - algorithm.</p> - - <p>If ToASCII fails to convert one of the components of the - string, e.g. because it is too long or because it contains invalid - characters, then return an error and abort these steps. 
<a href="#references">[RFC3490]</a></p> - - </li> - - <li> - - <p>If parsing <var title="">url</var> resulted in a <a href="#url-path" title="url-path"><path></a> component, then replace the - matching substring of <var title="">url</var> with the string that - results from applying the following steps to each character other - than U+0025 PERCENT SIGN (%) that doesn't match the original - <path> production defined in RFC 3986:</p> - - <ol><li>Encode the character into a sequence of octets as defined by - UTF-8.</li> - - <li>Replace the character with the percent-encoded form of those - octets. <a href="#references">[RFC3986]</a></li> - - </ol><div class="example"> - - <p>For instance if <var title="">url</var> was "<code title="">//example.com/a^b☺c%FFd%z/?e</code>", then the - <a href="#url-path" title="url-path"><path></a> component's substring - would be "<code title="">/a^b☺c%FFd%z/</code>" and the two - characters that would have to be escaped would be "<code title="">^</code>" and "<code title="">☺</code>". 
The - result after this step was applied would therefore be that <var title="">url</var> now had the value "<code title="">//example.com/a%5Eb%E2%98%BAc%FFd%z/?e</code>".</p> - - </div> - - </li> - - <li> - - <p>If parsing <var title="">url</var> resulted in a <a href="#url-query" title="url-query"><query></a> component, then replace the - matching substring of <var title="">url</var> with the string that - results from applying the following steps to each character other - than U+0025 PERCENT SIGN (%) that doesn't match the original - <query> production defined in RFC 3986:</p> - - <ol><li>If the character in question cannot be expressed in the - encoding <var title="">encoding</var>, then replace it with a - single 0x3F octet (an ASCII question mark) and skip the remaining - substeps for this character.</li> - - <li>Encode the character into a sequence of octets as defined by - the encoding <var title="">encoding</var>.</li> - - <li>Replace the character with the percent-encoded form of those - octets. <a href="#references">[RFC3986]</a></li> - - </ol></li> - - <li><p>Apply the algorithm described in RFC 3986 section 5.2 - Relative Resolution, using <var title="">url</var> as the - potentially relative URI reference (<var title="">R</var>), and - <var title="">base</var> as the base URI (<var title="">Base</var>). <a href="#references">[RFC3986]</a></li> - - <li> - - <p>Apply any relevant conformance criteria of RFC 3986 and RFC - 3987, returning an error and aborting these steps if - appropriate. <a href="#references">[RFC3986]</a> <a href="#references">[RFC3987]</a></p> - - <p class="example">For instance, if an absolute URI that would be - returned by the above algorithm violates the restrictions specific - to its scheme, e.g. 
a <code title="">data:</code> URI using the - "<code title="">//</code>" server-based naming authority syntax, - then user agents are to treat this as an error instead.<!-- RFC - 3986, 3.1 Scheme --></p> - - </li> - - <li><p>Let <var title="">result</var> be the target URI (<var title="">T</var>) returned by the Relative Resolution - algorithm.</li> - - <li><p>If <var title="">result</var> uses a scheme with a - server-based naming authority, replace all U+005C REVERSE SOLIDUS - (\) characters in <var title="">result</var> with U+002F SOLIDUS - (/) characters.</li> - - <li><p>Return <var title="">result</var>.</li> - - </ol><p>A <a href="#url">URL</a> is an <dfn id="absolute-url">absolute URL</dfn> if <a href="#resolve-a-url" title="resolve a URL">resolving</a> it results in the same - URL without an error.</p> - - </div><div class="impl"> + <a href="#willful-violation">willful violation</a> of RFC 3986. <a href="#references">[RFC3986]</a><div class="impl"> - <h4 id="dynamic-changes-to-base-urls"><span class="secno">2.5.4 </span>Dynamic changes to base URLs</h4> + <h4 id="dynamic-changes-to-base-urls"><span class="secno">2.5.2 </span>Dynamic changes to base URLs</h4> <p>When an <code title="attr-xml-base"><a href="#the-xml:base-attribute-xml-only">xml:base</a></code> attribute changes, the attribute's element, and all descendant elements, are @@ -4333,7 +4009,7 @@ </dd> - </dl></div><h4 id="interfaces-for-url-manipulation"><span class="secno">2.5.5 </span>Interfaces for URL manipulation</h4><p>An interface that has a complement of <dfn id="url-decomposition-attributes">URL decomposition + </dl></div><h4 id="interfaces-for-url-manipulation"><span class="secno">2.5.3 </span>Interfaces for URL manipulation</h4><p>An interface that has a complement of <dfn id="url-decomposition-attributes">URL decomposition attributes</dfn> will have seven attributes with the following definitions:<pre class="idl"> attribute DOMString <a href="#dom-uda-protocol" 
title="dom-uda-protocol">protocol</a>;
Received on Sunday, 14 June 2009 00:22:05 UTC
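
Illustration only, not part of the commit: the path-escaping step that this diff removes from the old "Resolving URLs" section (every character other than U+0025 that does not match the RFC 3986 path production is encoded as UTF-8 and percent-encoded) can be sketched in Python roughly as follows. The names `escape_path` and `PATH_SAFE` are assumptions made for this sketch, and the safe set is a reading of the RFC 3986 pchar production plus "/"; the normative definition now lives in the Web addresses draft.

```python
import string

# Characters RFC 3986 allows literally in a path:
# unreserved / sub-delims / ":" / "@", plus the "/" segment separator.
# (PATH_SAFE is a name invented for this sketch.)
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
PATH_SAFE = set(UNRESERVED + SUB_DELIMS + ":@/")


def escape_path(path: str) -> str:
    """Percent-encode characters that do not match the RFC 3986 path rules.

    U+0025 PERCENT SIGN is always left alone, even when it does not start
    a valid percent-escape, mirroring the removed spec text.
    """
    out = []
    for ch in path:
        if ch == "%" or ch in PATH_SAFE:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then percent-encode each octet.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

Applied to the path component from the removed example, `escape_path("/a^b☺c%FFd%z/")` gives `/a%5Eb%E2%98%BAc%FFd%z/`, which matches the value quoted in the old spec text.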