W3C home > Mailing lists > Public > site-comments@w3.org > March 2004

On W3C (/TR/) URI policies...

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Mon, 22 Mar 2004 17:03:34 +0100
To: site-comments@w3.org
Message-ID: <4063d8dc.754879319@smtp.bjoern.hoehrmann.de>

Hi,

  I already did not understand why anyone would assign a "dated" URL to
unstable documents, even less so why W3C QA is at http://www.w3.org/QA/,
W3C WAI at http://www.w3.org/WAI/, W3C DOM at http://www.w3.org/DOM/,
but W3C TAG at http://www.w3.org/2001/tag/. However, I could live with
that with a shake of the head. I have also accepted that my memory is
too bad to use the W3C web site efficiently and use a search engine to
locate documents such as the W3C Manual of Style. I thought I could get
to it through trial and error because I know that it is at
<http://www.w3.org/dddd/dd/manual/>, but I think in order to remember
the second number I mnemonic'd that the numbers are even, hence I
typically tried

  http://www.w3.org/2000/06/manual/
  http://www.w3.org/2000/08/manual/
  http://www.w3.org/2002/08/manual/
  http://www.w3.org/2002/06/manual/
  http://www.w3.org/2000/04/manual/
  ...

first, but this does not work, hence I try uneven numbers... Sometimes I
succeeded after a couple of 404 errors. At some point this got too
frustrating. I tried to get to it through Google a couple of times, but
this is getting frustrating, too. And in fact, as a result I avoid
referencing it.

May it's just me and everyone else has no problem to properly remember
URI References like

  http://www.w3.org/1999/02/22-rdf-syntax-ns#

One of the great things about this URI is the rdf-syntax part. I can
remember this part and in order to use the namespace URI here I visited

  http://www.w3.org/TR/rdf-syntax

Ooops. Well, I got used to that. I mean, what can I do to change this? I
am certain the people responsible for this are aware that this violates
everything in <http://www.useit.com/alertbox/990321.html> except for the
domain name.

After all, usability is not part of http://www.w3.org/Consortium/#goals
and there is links and copy and paste for people with learning
disabilities, dammit.

But now you broke my last resort on http://www.w3.org/, the /TR/ space.
There is no longer a <http://www.w3.org/TR/REC-xml>, this is now a
"Temporary Redirect"... Some comments about this change:

  * It does not recover from speling errors, <http://www.w3.org/TR/css2>
    properly redirects to <http://www.w3.org/TR/CSS2/>,
    <http://www.w3.org/TR/css21> "Sorry, Not Found".

  * ,translations is broken, <http://www.w3.org/TR/REC-xml,translations>
    properly gives a list of translations,
    <http://www.w3.org/TR/2004/REC-xml-20040204/,translations> "No
    Translations are Available for This Document"

  * It breaks tools like Wget 1.8.1 (latest in Debian stable)

    % wget http://www.w3.org/TR/CSS2/
    --14:29:28--  http://www.w3.org/TR/CSS2/
               => `index.html'
    Resolving www.w3.org... done.
    Connecting to www.w3.org[18.29.1.35]:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 56,242 [text/html]
    
    100%[=========================>] 56,242        76.07K/s    ETA 00:00
    
    14:29:29 (76.07 KB/s) - `index.html' saved [56242/56242]
    
    % wget http://www.w3.org/TR/CSS21/
    --14:29:32--  http://www.w3.org/TR/CSS21/
               => `index.html.1'
    Resolving www.w3.org... done.
    Connecting to www.w3.org[18.29.1.34]:80... connected.
    HTTP request sent, awaiting response... 307 Temporary Redirect
    14:29:32 ERROR 307: Temporary Redirect.

    The same applies to some older browsers though some of them provide
    access to the returned HTML 2.0 page, they can follow the link

      The document has moved <A
      HREF="http://www.w3.org/TR/2004/REC-xml-20040204">here</A>.<P>

    which is btw very bad link text. However, this annoys users of my
    site if I link to W3C Technical Reports, and I do not want to annoy
    my users.

  * It breaks offline browsing, http://www.w3.org/TR/REC-xml is not
    cacheable due to lack of a Cache-Control or Expires header, hence
    it is not possible to follow links to that document in common
    offline browsing scenarios.

  * Latest version links are much slower to follow, redirects to
    <http://www.w3.org/TR/REC-xml> redirects to
    <http://www.w3.org/TR/2004/REC-xml-20040204> which then again
    redirects to <http://www.w3.org/TR/2004/REC-xml-20040204/>

  * It breaks my address bar, when I visit
    <http://www.w3.org/TR/REC-xml#syntax> I will see
    <http://www.w3.org/TR/2004/REC-xml-20040204/> after the redirect,
    not <http://www.w3.org/TR/2004/REC-xml-20040204/#syntax> as I
    would expect.

  * It reduces the utility of edited recommendations. Most
    specifications contain unpredictable fragment identifers like
    http://www.w3.org/TR/REC-xml#charsets which points to the section
    on Characters or #section-N64772-Out-of-line-Formatting-Objects
    which force me to use copy & paste from my address bar to get
    fragment references, hence I would either end up referencing
    always the nth edition of a specification or would have do to
    a lot of work to always hack the URI down to the latest version
    URI which means I would always refer to the nth edition.

    The same applies of course - even though to lesser extend - to all
    other TR maturity levels.

  * Partly due to the same reason as the former item, it becomes at
    least very difficult to include links in mails, usenet postings or
    even web sites because these insanely dated URIs are about 20
    characters longer than normal URIs, beasts like

      http://www.w3.org/TR/2003/WD-xquery-full-text-requirements-20030502/#div3-score-independent
      http://www.w3.org/TR/2004/WD-xhtml-modularization-20040218/schema_module_defs.html#a_module_XHTML_Common_Attribute_Definitions

    do *not* fit properly into the typical boundaries of editors, broken
    editors will likely wrap such URIs which make them inaccessible to
    many and in case of plain/text ugly to everyone.

  * It does *not* help with issues such as the /TR/SOAP/ case,
    /TR/ is the wrong place for lastest-foo-technology URIs as
    http://www.w3.org/TR/html/ among others kindly demonstrates.

  * It breaks my sites, I link to <http://www.w3.org/TR/REC-xml> and now
    my link checker reports warnings for potentially broken references,
    wading through the clutter of all these warnings is frustrating and
    reduces my willingness to keep my links up to date. Even if my link
    checker supported means to hide warnings for specific sites, it
    won't help much, since links to www.w3.org break all the time;
    ironically, the W3C Linkchecker at http://validator.w3.org/checklink
    links to W3C's page explaining URIs, specifically
    http://www.w3.org/Addressing/#terms which does no longer exist.

  * Once again, you want people to link technical reports, this change
    makes it way more difficult to due this properly, which is bad.

Of course there are more reasons why this change is ... *wrong*. There
is no benefit, but an overwhelming number of downsides. This is sooo
obvious. Please fix this. Not its symptoms. Switch it back to what
worked for so many years. Please. Please! Please! Please! What do I need
to do to convince you? Use the WBS system and let the community vote on
W3C URI policy if you like, if there is a good justification for this
change or your URI policy in general, I am certain you can convince your
community to support this. Unless this is all part of a practical joke,
of course.

regards.
Received on Monday, 22 March 2004 11:06:13 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 18 February 2014 13:20:07 UTC