
Re: URIs in HTML5 and issues arising

From: Sam Ruby <rubys@us.ibm.com>
Date: Mon, 30 Jun 2008 17:23:22 -0400
To: Ian Hickson <ian@hixie.ch>
Cc: public-html-request@w3.org, uri@w3.org
Message-ID: <OF37B98EE7.20368789-ON85257478.0073638A-85257478.00757F49@us.ibm.com>
public-html-request@w3.org wrote on 06/30/2008 04:38:36 PM:

> On Mon, 30 Jun 2008, Sam Ruby wrote:
> >
> > http://cyber.law.harvard.edu/rss/rss.html
> > http://www.rssboard.org/rss-specification
>
> This specification, as far as I can tell, is using the term "URL" in the
> sense of "a string used to identify a resource", not in the sense of "A
> URI that isn't a URN and that conforms precisely to the syntax
> restrictions in RFC3986 and specifically isn't a RFC3987 IRI".

It means a bit more than a string.  In cases such as the enclosure element,
it means something that you can use as is with HTTP to fetch.  There are
many existing http client libraries which do not handle IRIs natively.

> So in fact I would say that the use of "URL" in that spec is far more in
> line with HTML5's definition than anything else. (And indeed, it is
> precisely because most people use "URL" in this sense that I used that
> term in the HTML5 spec. It's more consistent with what people say and do
> than the RFC3986 definition of the term.)

Forgive me, but that seems to be a bit of an optimistic interpretation.  It
might be a good idea to verify it with the groups that I mentioned
previously.  My understanding is that they have taken a much more
conservative approach to defining RSS 2.0 than you are espousing here.

> > > > If I understand correctly, HTML5 will allow the following in
> > > > content, and will expect that all conformant HTML5 consumers will be
> > > > able to process it interoperably:
> > > >
> > > >   <a href="http://www.?ը??.com/">James Holderness</a>
> > > >
> > > > It is not currently the case that RSS 2.0 allows the following in
> > > > content, and it most assuredly is not the case that conformant RSS
> > > > 2.0 consumers process it interoperably:
> > > >
> > > >   <enclosure url="http://www.?ը??.com/atomtests/iri/?.mp3"/>
> > >
> > > That's unfortunate. Why wouldn't that be allowed?
> >
> > My read of RFC 3987 section 1.2 paragraph a would preclude it from being
> > allowed in the context of a pre-existing specification such as RSS 2.0.
>
> So, as with HTML4, it's not the RSS 2.0 specification that disallows it,
> per se.

Agreed.

> > And a good reason not to allow it would be that existing clients aren't
> > expecting it, and won't properly handle such IRIs.
>
> Clients can be upgraded. It would be sad if we cut out all non-ASCII use
> of URLs in RSS just because legacy clients don't support IRIs.

As I said above, the conservators of the RSS specs have so far interpreted
their mission rather conservatively.  Good ideas that would require existing
clients to be upgraded are studiously avoided, particularly as there is
the ability to IDNA encode such IRIs, and there are alternative
feed formats.
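
For illustration, here is a minimal sketch in Python of the kind of conversion a producer could apply before putting such an address in a feed. It is only a rough approximation of the RFC 3987 mapping, not anything the RSS specs define: the host is IDNA-encoded and non-ASCII characters elsewhere are percent-encoded as UTF-8. The function name iri_to_uri and the example host below are hypothetical, and userinfo and IPv6 literals are deliberately not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left as-is outside the host: URI delimiters plus "%",
# so escapes that are already present are not encoded a second time.
_SAFE = "/:@!$&'()*+,;=%?"


def iri_to_uri(iri: str) -> str:
    """Approximate the RFC 3987 IRI-to-URI mapping (hypothetical helper)."""
    parts = urlsplit(iri)
    if parts.username is not None:
        # Assumption of this sketch: userinfo is out of scope.
        raise ValueError("userinfo is not handled by this sketch")

    netloc = ""
    if parts.hostname:
        # Python's built-in "idna" codec turns each non-ASCII label
        # into its ASCII ("xn--") form.
        netloc = parts.hostname.encode("idna").decode("ascii")
        if parts.port is not None:
            netloc = f"{netloc}:{parts.port}"

    # Everything else: UTF-8 bytes of non-ASCII characters become %HH.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

With this sketch, iri_to_uri("http://www.bücher.example/ä.mp3") gives http://www.xn--bcher-kva.example/%C3%A4.mp3, which a client limited to plain ASCII URLs can fetch as is.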

> > > If it's not allowed, how does RSS 2.0 say that it should be processed?
> >
> > The RSS Profile requires that such IRIs must be converted to a URL using
> > the procedure specified in RFC 3987.
> >
> > http://www.rssboard.org/rss-profile#data-types-url
>
> Ah, so URL is being used to mean IRI?
>
>
> > > How is the term "URL" defined in RSS 2.0? Is the term used in its
> > > RFC3986 definition? (i.e. is the intention really to exclude URNs?)
> >
> > Given the date the spec was originally published, the presumption is
> > that it refers to the term as used in RFC 2396.
>
> That seems to not be the case, based on the actual usage of the term, and
> the way that it is then co-opted by the "profile" document to mean IRI.
>
>
> > I'd suggest that you contact John Palfrey and/or Dave Winer concerning
> > the Harvard spec, and the RSS Advisory Board regarding the RSS Advisory
> > Board's specification.  The RSS Advisory Board can be reached using the
> > http://tech.groups.yahoo.com/group/rss-public/ mailing list.
>
> I couldn't find the contact details for John or Dave, but I've subscribed
> to the rss-public list.

http://www.scripting.com/stories/2007/04/02/newMailAddress.html

http://blogs.law.harvard.edu/palfrey/top/contact/


> What exactly do we want to ask? It seems like the usage of the term "URL"
> in the RSS 2.0 spec right now is not in line with the RFCs, and is vaguely
> in line with HTML5, though since the RSS 2.0 spec doesn't define how to
> handle errors it isn't exactly clear what the expectation is and whether
> the algorithms in HTML5 would be useful or not. I'm not really sure what
> to ask. It doesn't seem like HTML5 makes RSS 2.0's usage of the term "URL"
> any more confusing or inaccurate than it already is.

You are phrasing this in terms of "errors".  It is not my understanding
that James Holderness' IRI would be considered an error when used in an
href attribute of an anchor tag in HTML5.

First we need to determine if the IRI which specifies James Holderness' web
site is an error or is valid.  If you would like to see such IRIs be
considered valid in the context of RSS 2.0, that would seem to me to be a
reasonable first question to ask.  If you are not seeking this, but would
like to see error recovery for such IRIs to be specified to be consistent
across all consumers of RSS, that would be a reasonable question to ask.

Making RSS 2.0 consistent with HTML5 would be one way to solve the concern
I raised.  Adding some form of clarification to HTML5 to say that it is
using the term "URL" in a way that may differ from its usages in other
contexts would also.

> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

- Sam Ruby
Received on Monday, 30 June 2008 21:24:28 GMT