Re: On W3C (/TR/) URI policies...

On Mon, 2004-03-22 at 11:03, Bjoern Hoehrmann wrote:
> Hi,
> 
>   I already did not understand why anyone would assign a "dated" URL to
> unstable documents, even less so why W3C QA is at http://www.w3.org/QA/,
> W3C WAI at http://www.w3.org/WAI/, W3C DOM at http://www.w3.org/DOM/,
> but W3C TAG at http://www.w3.org/2001/tag/. 

Hello Bjoern,

Thank you for your thoughtful comments. I made some comments below but
no definitive statement. I'll try to get some discussion on this in the
Team.

 Thank you,

 - Ian

Notes:

You shouldn't try to read meaning into a URI unless there's a published
policy for a particular convention. W3C has a published policy for
URIs for TR documents. W3C does not have a published policy for numbers
in other URIs such as /2001/tag. It turns out that the Director prefers
not to "fill up" top-level URI space after www.w3.org. In the case of
/2001/tag, we put the date when the TAG came into existence. This allows
us to reuse "/tag" later if we choose.

> However, I could live with
> that with a shake of the head. I have also accepted that my memory is
> too bad to use the W3C web site efficiently and use a search engine to
> locate documents such as the W3C Manual of Style. 

We have millions and millions of documents. You SHOULD use links or
a search engine to find them, not your memory.

> I thought I could get
> to it through trial and error because I know that it is at
> <http://www.w3.org/dddd/dd/manual/>, but I think in order to remember
> the second number I mnemonic'd that the numbers are even, hence I
> typically tried

If you have to resort to trial and error because our site does not
provide you with useful navigation, please let us know clearly what
resources you are looking for. If you prefer trial and error in spite of
available search and navigation mechanisms, then you might consider
(after your own experience, in fact), relying on better means.

> 
>   http://www.w3.org/2000/06/manual/
>   http://www.w3.org/2000/08/manual/
>   http://www.w3.org/2002/08/manual/
>   http://www.w3.org/2002/06/manual/
>   http://www.w3.org/2000/04/manual/
>   ...
> 
> first, but this does not work, hence I try uneven numbers... Sometimes I
> succeeded after a couple of 404 errors. At some point this got too
> frustrating. I tried to get to it through Google a couple of times, but
> this is getting frustrating, too. And in fact, as a result I avoid
> referencing it.
> 
> May it's just me and everyone else has no problem to properly remember
> URI References like
> 
>   http://www.w3.org/1999/02/22-rdf-syntax-ns#
> 
> One of the great things about this URI is the rdf-syntax part. I can
> remember this part and in order to use the namespace URI here I visited
> 
>   http://www.w3.org/TR/rdf-syntax
> 
> Ooops. Well, I got used to that. I mean, what can I do to change this? I
> am certain the people responsible for this are aware that this violates
> everything in <http://www.useit.com/alertbox/990321.html> except for the
> domain name.
> 
> After all, usability is not part of http://www.w3.org/Consortium/#goals
> and there is links and copy and paste for people with learning
> disabilities, dammit.

We do not want people to memorize URIs. We want them to create
bookmarks, to use our navigation tools, and to use our search tools.

> But now you broke my last resort on http://www.w3.org/, the /TR/ space.
> There is no longer a <http://www.w3.org/TR/REC-xml>, this is now a
> "Temporary Redirect"... 

I'm not sure when or why this change was made. I'd have to go dig up
the history.

> Some comments about this change:

>   * It does not recover from speling errors, <http://www.w3.org/TR/css2>
>     properly redirects to <http://www.w3.org/TR/CSS2/>,
>     <http://www.w3.org/TR/css21> "Sorry, Not Found".

Don't rely on mod_spelling to help you. See the TAG's architecture
document:
  http://www.w3.org/TR/2003/WD-webarch-20031209/#identifiers-comparison

   "If a URI has been assigned to a resource, agents SHOULD refer to the
resource using the same URI, character for character."

>   * ,translations is broken, <http://www.w3.org/TR/REC-xml,translations>
>     properly gives a list of translations,
>     <http://www.w3.org/TR/2004/REC-xml-20040204/,translations> "No
>     Translations are Available for This Document"

Ivan, can you have a look at this?

>   * It breaks tools like Wget 1.8.1 (latest in Debian stable)
> 
>     % wget http://www.w3.org/TR/CSS2/
>     --14:29:28--  http://www.w3.org/TR/CSS2/
>                => `index.html'
>     Resolving www.w3.org... done.
>     Connecting to www.w3.org[18.29.1.35]:80... connected.
>     HTTP request sent, awaiting response... 200 OK
>     Length: 56,242 [text/html]
>     
>     100%[=========================>] 56,242        76.07K/s    ETA 00:00
>     
>     14:29:29 (76.07 KB/s) - `index.html' saved [56242/56242]
>     
>     % wget http://www.w3.org/TR/CSS21/
>     --14:29:32--  http://www.w3.org/TR/CSS21/
>                => `index.html.1'
>     Resolving www.w3.org... done.
>     Connecting to www.w3.org[18.29.1.34]:80... connected.
>     HTTP request sent, awaiting response... 307 Temporary Redirect
>     14:29:32 ERROR 307: Temporary Redirect.
> 
>     The same applies to some older browsers though some of them provide
>     access to the returned HTML 2.0 page, they can follow the link
> 
>       The document has moved <A
>       HREF="http://www.w3.org/TR/2004/REC-xml-20040204">here</A>.<P>
> 
>     which is btw very bad link text. However, this annoys users of my
>     site if I link to W3C Technical Reports, and I do not want to annoy
>     my users.

Vivien, can you have a look at that?

>   * It breaks offline browsing, http://www.w3.org/TR/REC-xml is not
>     cacheable due to lack of a Cache-Control or Expires header, hence
>     it is not possible to follow links to that document in common
>     offline browsing scenarios.

Ok.

>   * Latest version links are much slower to follow, redirects to
>     <http://www.w3.org/TR/REC-xml> redirects to
>     <http://www.w3.org/TR/2004/REC-xml-20040204> which then again
>     redirects to <http://www.w3.org/TR/2004/REC-xml-20040204/>

"Much slower" seems like a bit of a stretch, but I agree that if you
can avoid a redirect, you should. 

>   * It breaks my address bar, when I visit
>     <http://www.w3.org/TR/REC-xml#syntax> I will see
>     <http://www.w3.org/TR/2004/REC-xml-20040204/> after the redirect,
>     not <http://www.w3.org/TR/2004/REC-xml-20040204/#syntax> as I
>     would expect.

That, I believe, is a browser problem. In particular, see CUAP
checkpoint 4.1.
  http://www.w3.org/TR/cuap#uri

>   * It reduces the utility of edited recommendations. Most
>     specifications contain unpredictable fragment identifers like
>     http://www.w3.org/TR/REC-xml#charsets which points to the section
>     on Characters or #section-N64772-Out-of-line-Formatting-Objects
>     which force me to use copy & paste from my address bar to get
>     fragment references, hence I would either end up referencing
>     always the nth edition of a specification or would have do to
>     a lot of work to always hack the URI down to the latest version
>     URI which means I would always refer to the nth edition.
> 
>     The same applies of course - even though to lesser extend - to all
>     other TR maturity levels.
> 
>   * Partly due to the same reason as the former item, it becomes at
>     least very difficult to include links in mails, usenet postings or
>     even web sites because these insanely dated URIs are about 20
>     characters longer than normal URIs, beasts like
> 
>       http://www.w3.org/TR/2003/WD-xquery-full-text-requirements-20030502/#div3-score-independent
>       http://www.w3.org/TR/2004/WD-xhtml-modularization-20040218/schema_module_defs.html#a_module_XHTML_Common_Attribute_Definitions
> 
>     do *not* fit properly into the typical boundaries of editors, broken
>     editors will likely wrap such URIs which make them inaccessible to
>     many and in case of plain/text ugly to everyone.
> 
>   * It does *not* help with issues such as the /TR/SOAP/ case,
>     /TR/ is the wrong place for lastest-foo-technology URIs as
>     http://www.w3.org/TR/html/ among others kindly demonstrates.
> 
>   * It breaks my sites, I link to <http://www.w3.org/TR/REC-xml> and now
>     my link checker reports warnings for potentially broken references,
>     wading through the clutter of all these warnings is frustrating and
>     reduces my willingness to keep my links up to date. Even if my link
>     checker supported means to hide warnings for specific sites, it
>     won't help much, since links to www.w3.org break all the time;
>     ironically, the W3C Linkchecker at http://validator.w3.org/checklink
>     links to W3C's page explaining URIs, specifically
>     http://www.w3.org/Addressing/#terms which does no longer exist.
> 
>   * Once again, you want people to link technical reports, this change
>     makes it way more difficult to due this properly, which is bad.
> 
> Of course there are more reasons why this change is ... *wrong*. There
> is no benefit, but an overwhelming number of downsides. This is sooo
> obvious. Please fix this. Not its symptoms. Switch it back to what
> worked for so many years. Please. Please! Please! Please! What do I need
> to do to convince you? Use the WBS system and let the community vote on
> W3C URI policy if you like, if there is a good justification for this
> change or your URI policy in general, I am certain you can convince your
> community to support this. Unless this is all part of a practical joke,
> of course.

-- 
Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
Tel:                     +1 718 260-9447

Received on Monday, 22 March 2004 12:07:38 UTC