Re: design team notes

On 21.04.2011 17:35, Peter Saint-Andre wrote:
> Folks, we had design team call just now. Here are the notes. More soon.
>
> ###
>
> IRI WG Design Team, April 21, 2010
>
> Participants...
>
> Philippe Le Hagaret (W3C / HTML)
> Julian Reschke (mostly interested as HTML WG member / HTTPbis spec editor)
> Thomas Roessler (W3C)
> Martin Dürst (IRI spec co-editor)
> Mike Smith (W3C / interaction domain / HTML)
> Adam Barth
> Shawn Steele
> Jason Duell
> Chris Weber
>
> Peter: what do we need to deliver to HTML5 folks?
>
> Mike: talked with Larry Masinter around the time of the BoF, created a
> wiki page, things haven't really changed since then
>
> Adam: a link would be helpful
>
> Mike: http://trac.tools.ietf.org/area/app/trac/wiki/IriWorkGoals
>
> Adam: Ian Hickson thinks we need two things:
> - parse document url and extract the host (for security purposes / same
> origin policy)
> - resolving relative URL (e.g. in script or form)

I keep hearing this.

This *is* defined in RFC 3986/3987.
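
To make the "resolving relative URL" case concrete, here is a small sketch using Python's standard `urllib.parse.urljoin`. The base and the references are taken from the examples in RFC 3986, Section 5.4. Using Python's library is my choice for illustration only; it is one implementation of reference resolution, not something any browser is claimed to use.

```python
from urllib.parse import urljoin

# Base URI and relative references from the examples in RFC 3986, Section 5.4.
BASE = "http://a/b/c/d;p?q"
EXAMPLES = ["g", "../g", "?y", "#s", "//g"]

# Resolve each reference against the base.
resolved = {ref: urljoin(BASE, ref) for ref in EXAMPLES}

for ref, target in resolved.items():
    print(f"{ref!r:8} -> {target}")
```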

The ABNFs cover only valid URIs/IRIs, but it's trivial to expand this by 
just relaxing the character repertoire constraints.

All that's needed is a simple parser that just acts on the well-defined 
delimiters. One way to implement such a parser is to just use the 
regular expression in

   http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.B
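
A minimal sketch of such a parser, using the regular expression from RFC 3986, Appendix B. The function name `split_reference` and the dictionary layout are made up for illustration. Note that the authority component may still contain userinfo and a port, so getting at the host takes one more step that is not shown here.

```python
import re

# Regular expression from RFC 3986, Appendix B. It only looks at the
# delimiters ":", "/", "?" and "#", not at the character repertoire.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_reference(ref):
    """Split a URI/IRI reference into its five components.

    Hypothetical helper; components that are absent come back as None
    (the path is never None, but may be empty).
    """
    m = URI_RE.match(ref)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

# The example from Appendix B.
valid = split_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related")

# Not a valid URI (space, non-ASCII), but the delimiters still work.
relaxed = split_reference("http://example.com/a b?q=é#x y")
```

The second call shows the point about relaxing the repertoire: the string is rejected by the ABNF, yet it splits into components without any special handling.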

Does this need a separate document? I don't think so, but it won't hurt 
*as long* as that document doesn't conflict with what these specs say 
(that is, as long as it doesn't treat RFC 3986/3987-valid identifiers 
differently than the specs do).

> Martin:
> what may also be of interest:
> 1) Syntax of what to put into the author spec (my personal opinion would
> be that this should be exactly an IRI)

Not sure what "author" spec means.

HTML uses URIs/IRIs in several places, and there are at least two 
different contexts in which they need to be parsed, one of which uses 
whitespace as a delimiter between identifiers.

So special treatment of whitespace will need to be context-dependent.
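
A rough sketch of what "context-dependent" could mean in practice. Both function names are hypothetical, and the set of whitespace characters (space, tab, LF, FF, CR) is my assumption about what HTML treats as whitespace. Whether whitespace *inside* a single reference is kept, escaped, or rejected is exactly the kind of thing that would need to be agreed on; the sketch simply leaves it alone.

```python
import re

# Assumed set of HTML whitespace characters: space, tab, LF, FF, CR.
HTML_SPACE = " \t\n\f\r"

def single_reference(value):
    """Context 1 (hypothetical): the value holds one reference.

    Only leading and trailing whitespace is stripped; inner whitespace
    is left untouched.
    """
    return value.strip(HTML_SPACE)

def reference_list(value):
    """Context 2 (hypothetical): the value holds several references,
    separated by whitespace."""
    return [tok for tok in re.split("[" + HTML_SPACE + "]+", value) if tok]

one = single_reference("  http://example.com/a b \n")
many = reference_list(" http://a.example/  http://b.example/\t")
```

The same input string gives different results depending on which function is applied, which is why a single context-free whitespace rule doesn't work.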

> 2) Syntax (or whatever other description that makes sense) of what's
> allowed/reasonable for backwards compatibility
>
> Peter: possible path is to put all the parsing/processing stuff into
> Adam's document, fast-track that document, and work on 3987bis in parallel

If this just replicates information from RFC 3986/7, it's harmless, but 
also not critical at all.

Otherwise, we'll have to understand what's supposed to be different.

> Peter: Adam's document is http://tools.ietf.org/html/draft-abarth-url-00
>
> Adam: another topic that's been raised is bidi
>
> Peter: we had discussion and a tracker issue about pulling bidi into a
> separate document, and at least one person has volunteered to work on
> that (Adil Allawi)

That would be great.

> Julian: We need to partition the work that needs to be done and figure
> out who is going to do that work. I see three major issues:
> - do we have a conflict between how browsers parse and what the specs say?
> - need to clarify handling of non-ASCII characters in query strings
> - hooks for HTML spec for referencing algorithm to partitioning URIs
> into components and resolving a reference against a base
>
> Martin: there *are* differences between different browsers w.r.t.
> parsing and processing

Yes. Let's collect information about what the differences are, and help 
the vendors resolve them, hopefully getting closer to compliance with 
RFC 3986/3987 for valid identifiers.

> Julian: supposedly browsers have special rules for parsing based on
> scheme (e.g., data: scheme and fragment processing), would like to see
> proof of that (and whether this is universal)
>
> Martin: some browsers are closer to the spec, some are farther away --
> but are the ones who are farther away able to move closer?

I would think so.

> Adam: might be helpful to clarify regular expressions that are used to
> parse, are they consistent with the ABNF?
>
> Julian: agreed, need to determine if we have a bug

Clarifying: the regexp is supposed to be consistent with the ABNF; 
otherwise it wouldn't be in the spec. If it's not, we'll have to look 
into it (note that it's known, and by design, that the regexp is more 
lenient than the ABNF).

> Julian: also, things like stripping of leading and trailing whitespace
>
> Martin: those seem like browser-specific or HTML-specific topics, there
> are other issues that might be more core to URI/IRI processing
>
> Adam: I think this is one of the smaller issues that needs to be addressed
>
> Julian: simplest solution is to put this in the HTML spec (but it might
> be needed elsewhere because these things leak into other contexts, such
> as JS code with XHR, or HTTP header fields (Location))
>
> Action items...
> - Adam to publish updated version of draft-barth-url early next week

Can we *please* first agree on the problem we want to solve?

> - Peter to work with Marc on next steps, scheduling of additional design
> team calls / WG interim meetings, etc.

Best regards, Julian

Received on Friday, 22 April 2011 12:34:22 UTC