Re: URIs in HTML5 and issues arising

On Mon, 30 Jun 2008, Sam Ruby wrote:
> 
> http://cyber.law.harvard.edu/rss/rss.html
> http://www.rssboard.org/rss-specification

This specification, as far as I can tell, is using the term "URL" in the 
sense of "a string used to identify a resource", not in the sense of "A 
URI that isn't a URN and that conforms precisely to the syntax 
restrictions in RFC3986 and specifically isn't an RFC3987 IRI".

So in fact I would say that the use of "URL" in that spec is far more in 
line with HTML5's definition than anything else. (And indeed, it is 
precisely because most people use "URL" in this sense that I used that 
term in the HTML5 spec. It's more consistent with what people say and do 
than the RFC3986 definition of the term.)


> > > If I understand correctly, HTML5 will allow the following in 
> > > content, and will expect that all conformant HTML5 consumers will be 
> > > able to process it interoperably:
> > >
> > >   <a href="http://www.?ը??.com/">James Holderness</a>
> > >
> > > It is not currently the case that RSS 2.0 allows the following in 
> > > content, and it most assuredly is not the case that conformant RSS 
> > > 2.0 consumers process it interoperably:
> > >
> > >   <enclosure url="http://www.?ը??.com/atomtests/iri/?.mp3"/>
> >
> > That's unfortunate. Why wouldn't that be allowed?
> 
> My read of RFC 3987 section 1.2 paragraph a would preclude it from being 
> allowed in the context of a pre-existing specification such as RSS 2.0.

So, as with HTML4, it's not the RSS 2.0 specification that disallows it, 
per se.


> And a good reason not to allow it would be that existing clients aren't 
> expecting it, and won't properly handle such IRIs.

Clients can be upgraded. It would be sad if we cut out all non-ASCII use 
of URLs in RSS just because legacy clients don't support IRIs.


> > If it's not allowed, how does RSS 2.0 say that it should be processed?
> 
> The RSS Profile requires that such IRIs must be converted to a URL using 
> the procedure specified in RFC 3987.
>
> http://www.rssboard.org/rss-profile#data-types-url

Ah, so URL is being used to mean IRI?
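
For concreteness, here is a rough Python sketch of the kind of IRI-to-URI 
mapping that RFC 3987 section 3.1 describes, as I understand it. This is 
only an illustration under stated assumptions, not text from the RSS 
Profile or from HTML5: the function name iri_to_uri and the example host 
are made up, the host is converted with Python's built-in "idna" codec 
(which implements the older IDNA 2003 rules), and IPv6 literals and other 
edge cases are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding. "%" is included so that
# escapes already present in the input are not encoded a second time.
_SAFE = "/:@!$&'()*+,;=%~-._"


def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: map an IRI to an ASCII-only URI (sketch only)."""
    parts = urlsplit(iri)

    # Keep any userinfo, percent-encoding non-ASCII characters in it.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = quote(userinfo, safe=_SAFE) + at

    # Host: convert each label to its ASCII (Punycode) form.
    host = parts.hostname or ""
    if host:
        netloc += host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += f":{parts.port}"

    # Path, query, fragment: percent-encode the UTF-8 bytes of any
    # character outside the safe set (quote() uses UTF-8 by default).
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE + "?")
    fragment = quote(parts.fragment, safe=_SAFE + "?")

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://www.bücher.example/atomtests/iri/é.mp3"))
    # http://www.xn--bcher-kva.example/atomtests/iri/%C3%A9.mp3
```

A string that is already a plain ASCII URI passes through unchanged, so a 
client could apply something like this to every enclosure url before 
fetching it.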


> > How is the term "URL" defined in RSS 2.0? Is the term used in its 
> > RFC3986 definition? (i.e. is the intention really to exclude URNs?)
> 
> Given the date the spec was originally published, the presumption is 
> that it refers to the term as used in RFC 2396.

That seems to not be the case, based on the actual usage of the term, and 
the way that it is then co-opted by the "profile" document to mean IRI.


> I'd suggest that you contact John Palfrey and/or Dave Winer concerning 
> the Harvard spec, and the RSS Advisory Board regarding the RSS Advisory 
> Board's specification.  The RSS Advisory Board can be reached using the 
> http://tech.groups.yahoo.com/group/rss-public/ mailing list.

I couldn't find the contact details for John or Dave, but I've subscribed 
to the rss-public list.

What exactly do we want to ask? It seems like the usage of the term "URL" 
in the RSS 2.0 spec right now is not in line with the RFCs, and is vaguely 
in line with HTML5, though since the RSS 2.0 spec doesn't define how to 
handle errors it isn't exactly clear what the expectation is and whether 
the algorithms in HTML5 would be useful or not. I'm not really sure what 
to ask. It doesn't seem like HTML5 makes RSS 2.0's usage of the term "URL" 
any more confusing or inaccurate than it already is.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Monday, 30 June 2008 20:39:16 UTC