Re: Intent to close ISSUE-205

On 01/15/2013 10:26 AM, Pat Hayes wrote:
> Revolutionary though it may seem, I would suggest, when writing a Web
> standard, to actually use the terminology defined by other normative
> Web standards. That is, if you mean IRI, say "IRI", and if you mean
> URL, say "URL". To do otherwise is at best confusing, and at worst so
> bloody stupid that it is impossible to even discuss it politely.

There is a deeper story here, Pat (and those that feel that continuing
to use the IRI terminology is perfectly okay).

Here are the numerous problems with the term 'IRI':

  * the vast majority of Web developers don't know what it is
  * many of them will never learn the difference because the
    difference doesn't matter to them in practice.
  * the HTML5 spec uses URL everywhere for these reasons, and that
    will be the norm going forward.
  * creating a new term to specify "an internationalized URL" was,
    in hindsight, the wrong thing to do because it has caused a great
    deal of confusion.
  * being pedantic when attempting to communicate a general concept
    to a general audience can do more harm than good.
  * this is coming down the pipeline: http://url.spec.whatwg.org/

A little more detail on the points above...

Web developers don't understand the IRI terminology
---------------------------------------------------

I've been very supportive of the use of IRI terminology in specs for a
number of years. For example, in the RDFa spec, we ended up including
this note because of the number of comments we got on the subject:

"""
RDFa is a way of expressing RDF-style relationships using simple
attributes in existing markup languages such as HTML. RDF is fully
internationalized, and permits the use of Internationalized Resource
Identifiers, or IRIs. You will see the term 'IRI' used throughout this
specification. Even if you are not familiar with the term IRI, you
probably have seen the term 'URI' or 'URL'. IRIs are an extension of
URIs that permits the use of characters outside those of plain ASCII.
RDF allows the use of these characters, and so does RDFa. This
specification has been careful to use the correct term, IRI, to make it
clear that this is the case.
"""

Web Keys spec... same thing. PaySwarm base spec, same thing... and now
we're getting the same comment, "IRIs are confusing to Web developers",
for the JSON-LD spec. These comments didn't come from the same people,
or the same group of people; they came from a myriad of Web developers
with different backgrounds. What was not clear to me two years ago is now
very obvious: Web developers don't understand the difference between URL
and IRI and, more importantly, they should not have to.

The Difference Doesn't Matter in Practice
-----------------------------------------

If URLs had been designed correctly in the beginning (which is
fantastically easy to say with hindsight), they would've included
internationalized characters and we wouldn't be in this mess. Web
developers call IRIs URLs in practice. It's everywhere: look at the
documentation on building websites and you will find little to no
use of the term IRI.

Google search index count for
   URL: 366M
   URI:  27M
   IRI:   4M

In fact, I had no idea what an IRI was until I hit RDF. Nobody I worked
with knew what an IRI was before we started working with RDF. It didn't
matter then and it still doesn't matter now (unless you want to be
extremely pedantic, which is a mistake when trying to convince new Web
developers to use this stuff). You are not penalized in any way when you
put an IRI instead of a URL in your web page (or vice versa). The
difference doesn't matter to 99.999% of the people building and using
the Web.
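
To make the gap concrete, here is a minimal Python sketch of roughly what
turning an IRI into a plain-ASCII URI amounts to, in the spirit of the
mapping in RFC 3987, section 3.1. The function name iri_to_uri and the
example address are made up for illustration, and the code takes
shortcuts (it drops any userinfo, does not handle IPv6 literals, and
relies on Python's built-in "idna" codec, which implements the older
IDNA 2003 rules), so treat it as an illustration rather than a conforming
implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched: URI reserved/unreserved punctuation, plus "%"
# so that escapes already present in the input are not encoded twice.
_SAFE = "/:@!$&'()*+,;=-._~%"


def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: map an IRI to an ASCII-only URI (simplified)."""
    parts = urlsplit(iri)

    # Host: convert non-ASCII labels to their "xn--" (Punycode) form.
    # Assumption: IDNA 2003 via the stdlib codec is good enough here.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Simplification: any userinfo ("user:pass@") is dropped.

    # Path, query, fragment: percent-encode non-ASCII characters as UTF-8.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE + "?")
    fragment = quote(parts.fragment, safe=_SAFE + "?")

    return urlunsplit((parts.scheme, host, path, query, fragment))


if __name__ == "__main__":
    # An internationalized address and a plain ASCII one go through the
    # same function; the ASCII one comes back unchanged.
    print(iri_to_uri("http://例え.jp/引き出し?q=値"))
    print(iri_to_uri("http://example.org/a%20b?x=1#frag"))
```

The point of the sketch is that the caller never has to know which kind
of address it is holding: an address that is already ASCII passes through
unchanged, and one with internationalized characters is encoded on the
way out.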

Future Work on Merging URL with IRI
-----------------------------------

Anne is working on this: http://url.spec.whatwg.org/. Two of the goals are:

* Align RFC 3986 and RFC 3987 with contemporary implementations and
  obsolete them in the process. (E.g. spaces, other "illegal" code
  points, query encoding, equality, canonicalization, are all concepts
  not entirely shared, or defined.) URL parsing needs to become as
  solid as HTML parsing. [URI] [IRI]

* Standardize on the term URL. URI and IRI are just confusing. In
  practice a single algorithm is used for both so keeping them distinct
  is not helping anyone. URL also easily wins the search result
  popularity contest.

The writing is on the wall. I suggest that the RDF WG move toward the
URL terminology. I was attempting to start the ball rolling with the
JSON-LD spec or, at the very least, to future-proof the spec a bit. That
failed. I hope there are others in both the JSON-LD CG and RDF WG that
share this view. IRIs and URIs are dead, they just don't know it yet...
long live the URL.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: The Problem with RDF and Nuclear Power
http://manu.sporny.org/2012/nuclear-rdf/

Received on Wednesday, 16 January 2013 01:28:09 UTC