- From: Maciej Stachowiak <mjs@apple.com>
- Date: Mon, 01 Mar 2010 13:13:26 -0800
- To: Larry Masinter <LMM@acm.org>
- Cc: public-html@w3.org, 'Ted Hardie' <ted.ietf@gmail.com>
Thanks for the update. Recorded at <http://dev.w3.org/html5/status/issue-status.html#ISSUE-056>.

 - Maciej

On Feb 25, 2010, at 6:45 PM, Larry Masinter wrote:

> With regard to ISSUE-56, ACTION-171:
>
> Rationale:
>
> The Issue this proposal is trying to address is:
> "Bring URLs section/definition and IRI specification in alignment."
>
> (1) The fundamental rationale is that URLs in HTML and similar
> identifiers in other Internet systems need to have the same syntax
> and semantics. The advantages of doing this in technical
> specifications include all of those articulated for modular
> specifications.
>
> (2) The IETF has approved an IRI working group whose charter
> specifically includes working with the W3C HTML working group:
> as noted in:
> http://lists.w3.org/Archives/Public/public-html/2010Feb/0476.html
> and
> http://tools.ietf.org/wg/iri/charters which includes:
>
> " The IRI specification(s) must (continue to) be suitable
> for normative reference with Web and XML standards from W3C
> specifications. The group should coordinate with the W3C working
> groups on HTML5, XML Core, and Internationalization, as well
> as with IETF HTTPBIS WG to ensure acceptability. "
>
> Evidence that there is interest outside of the W3C HTML
> working group current members to contribute to this work
> has been the extensive participation and time spent already
> in meetings, including:
>
> * meetings at the last W3C TPAC
> * Two working group development sessions at IETF meetings
>   with significant participation by non-HTML-WG members
>
> http://www.alvestrand.no/pipermail/idna-update/2009-October/005720.html
> http://lists.w3.org/Archives/Public/public-iri/2009Nov/0040.html
>
> http://www.alvestrand.no/pipermail/idna-update/2009-July/004598.html
> * Interest in, and discussions with, members of the Unicode
>   Consortium Technical Committee.
>
> In addition, there is evidence that this work can succeed:
> the discussion in the mailing list for the IRI working group
> http://lists.w3.org/Archives/Public/public-iri/ is active;
> most of the recent active contributions have been by
> W3C HTML Working Group members, with additional contributions
> from the broader community of Internet application
> development.
>
>
> The first F2F meeting of the IRI working group in IETF
> will be Friday, March 25, but of course, as with all IETF
> working groups, the primary work of the group is on the
> mailing list, and there is no cost or fee for participation
> there.
>
> (3) Recent public-iri discussion seems to raise the issue that the
> current definition of URLs in the existing HTML5 specification
> may not match implementations in any case. The analysis of
> how currently deployed systems work, and how they should work
> in the face of changes to the Internationalization of Domain
> Names, should be done in a context where the affected communities
> (IDN, Unicode Technical Committee, HTML WG, etc.) can come
> to agreement.
>
> (4) Additional information in the HTML5 bug report
> http://www.w3.org/Bugs/Public/show_bug.cgi?id=8207
> indicates that the reason for rejecting this as a "bug"
> is that the IRI document is 'vague' and does not contain
> sufficient normative language to satisfy some who believe
> that MUST language with normative algorithms is necessary.
> However, these requirements should be handled as updates
> to the IRI specification, so that the HTML5 specification
> does not contain implementation advice that diverges from that
> used by every other application that uses URLs/IRIs.
>
> (5) While there may be additional adjustments necessary
> to align the boundary between the HTML5 document
> and the IRIBIS document, this work should
> proceed as bugs on the drafts, as amended by this change
> proposal.
>
> ===============================================================
> Proposal:
>
> The actual proposal itself was available as an attachment to
> http://lists.w3.org/Archives/Public/public-html/2009Nov/0670.html
> http://lists.w3.org/Archives/Public/public-html/2009Nov/att-0670/iri-rewrite-draft.html
>
> A minor update of that proposal (edited to update the reference
> to point to the IETF document) is attached to this message
> and also made available in plain text here:
>
>
> ================================================================
>
>
> NOTE: This is a draft of one way of rewriting section 2.5.1 of The
> HTML 5 editor's draft of 25 August 2009, provided as an example.
>
>
> 2.5.1 Terminology
>
> Historically the term "URI" was used for "Universal Resource
> Identifier" [RFC1630]; with a Uniform Resource Locator (URL) being the
> form of URI which expresses an address which maps onto an access
> algorithm using network protocols. Further technical specifications
> [RFC 1738], [RFC 1808], [RFC 2396] and [RFC 3986], subsequently
> defined a "relative URL", elaborated the distinction between Uniform
> Resource Names (URN) and URLs, and led to the adoption of "URI" as
> Uniform Resource Identifier, and introduced the notion of an
> "Internationalized Resource Identifier" (IRI) [RFC 3987] as a
> syntactic form which allowed (unencoded) non-ASCII Unicode characters.
> [HTML 4.01] (from which this specification was evolved) used "URI" as
> specified by [RFC 2396], but contained recommended processing rules
> for HTML agents (in [HTML 4.01] appendix B.2) for handling invalid
> values containing non-ASCII characters, roughly corresponding to the
> guidance in [RFC 3987].
>
> Popular informal usage continues to use "URL" to refer to any of these
> variations, although, for the most part, the term "URL" alone
> indicates an "absolute" form including a scheme (see below).
>
> Definition: In this document, the term "URL" is used for any string
> used to identify a resource, including relative forms; distinctions
> between the various forms are made in context, with qualifiers, or
> by processing rules, as to whether the URL corresponds
> to a URI or a "relative reference" (as specified in [RFC 3986]) or the
> "internationalized" forms of those, IRI and relative IRI reference (as
> specified in [draft-ietf-iri-3987bis]), or to strings which (after
> being preprocessed by the rules defined in Section 7.2 of
> [draft-ietf-iri-3987bis]) result in one of those forms.
>
> Definition: a valid URL is a string that matches the production of
> "iri-reference" in [draft-ietf-iri-3987bis].
>
> Definition: a valid absolute URL is a string that matches the
> production of "IRI" in [draft-ietf-iri-3987bis].
>
> Definition: an absolute URL is a string which results in a valid
> absolute URL (defined above) after being processed by the rules of
> "Web Address Processing" in section 7.2 of [draft-ietf-iri-3987bis].
> Note that this basically means any string which, after preprocessing,
> starts with an initial string matching the "scheme" production of
> [draft-ietf-iri-3987bis], followed by a colon.
>
> Definition: A relative URL is a URL that is not an absolute URL;
> similarly, a valid relative URL is a valid URL that is not an absolute
> URL.
>
> Definition: To parse a URL into its component parts means to first
> preprocess the string according to section 7.2 of
> [draft-ietf-iri-3987bis] "Web Address Processing", and then to parse
> the results of preprocessing (as per section 3.2 of
> [draft-ietf-iri-3987bis]) against the "iri-reference" (if parsing a
> URL) or the "IRI" production (if parsing an absolute URL). Note that
> the preprocessing steps generally result in a valid URL or a valid
> relative URL.
> Matching BNF components results in the following parts:
>
> * <scheme>: substring that matched "scheme", if any
> * <host>: substring that matched "ireg-name", if any
> * <port>: substring that matches "port", if any
> * <hostport>: if there is a scheme component and a port component
>   and the port given by the port component is different from the
>   default port defined by the scheme component (if the default port
>   for the scheme is known), then <hostport> is the substring that
>   starts with the substring matched by the host production and ends
>   with the substring matched by the port production, and includes
>   the colon in between the two. Otherwise, it is the same as the
>   host component.
> * <path>: substring that matches "ipath", if any
> * <query>: substring that matches "iquery", if any
> * <fragment>: substring that matches "ifragment", if any
> * <host-specific>: the substring that follows the substring
>   matched by the "iauthority" production, or the whole string (that
>   is, the input to the matching algorithm which is the result of
>   preprocessing by section 7.2) if the "iauthority" production wasn't
>   matched.
>
> Definition: The phrasing "resolve ... relative to ..." (in the context
> of resolving a URL relative to another URL) is used to describe the
> process of combining two strings: an original URL and a base URL
> (usually an absolute URL) to obtain parsed components; these parsed
> components may then be recombined to construct a new URL. This is
> accomplished by parsing the original and base URLs (preprocessing by
> section 7.2 of [draft-ietf-iri-3987bis] first, then matching against
> the productions of section 3.2 of [draft-ietf-iri-3987bis]) and then
> combining the original and base components following the algorithms in
> section 5.2 of [RFC 3986], but applied to the Unicode characters which
> constitute the original and base.
>
> Definition: the document base URL of a Document object is the absolute
> URL defined by:
>
> 1. Let fallback base url be the document's address (an absolute
>    URL).
> 2. If fallback base url is the string about:blank and the
>    Document's browsing context has a creator browsing context, then let
>    fallback base url be the document base URL of the creator Document
>    instead.
> 3. If there is no base element that is both a child of the head
>    element and has a href attribute, then the document base URL is
>    fallback base url.
> 4. Otherwise, the document base URL is the result of resolving
>    the href attribute of the first such element relative to fallback
>    base url (note that the base href attribute isn't affected by
>    xml:base attributes).
>
>
> <iri-rewrite-draft.html>
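
The four-step "document base URL" definition quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the proposal: the Document class and its fields (address, base_href, creator) are hypothetical stand-ins for the real DOM objects, and Python's urllib.parse.urljoin is used as an approximation of the RFC 3986 section 5.2 resolution step. It does not perform the "Web Address Processing" preprocessing of section 7.2 of draft-ietf-iri-3987bis.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin


@dataclass
class Document:
    """Hypothetical stand-in for a Document object (not an API from the draft)."""
    address: str                          # the document's address (an absolute URL)
    base_href: Optional[str] = None       # href of the first <base href> child of <head>, if any
    creator: Optional["Document"] = None  # Document of the creator browsing context, if any


def document_base_url(doc: Document) -> str:
    # Step 1: the fallback base URL starts as the document's address.
    fallback = doc.address

    # Step 2: an about:blank document with a creator browsing context
    # uses the creator Document's base URL instead.
    if fallback == "about:blank" and doc.creator is not None:
        fallback = document_base_url(doc.creator)

    # Step 3: with no qualifying base element, the fallback is the answer.
    if doc.base_href is None:
        return fallback

    # Step 4: resolve the base element's href relative to the fallback.
    # Assumption: urljoin approximates RFC 3986 section 5.2 here; the
    # section 7.2 preprocessing from the draft is deliberately omitted.
    return urljoin(fallback, doc.base_href)


if __name__ == "__main__":
    parent = Document("http://example.com/a/b.html", base_href="../img/")
    child = Document("about:blank", creator=parent)
    print(document_base_url(parent))
    print(document_base_url(child))
```

In this sketch the about:blank child has no base element of its own, so it ends up with the same base URL as its creator.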
Received on Monday, 1 March 2010 21:14:00 UTC