- From: Paul Prescod <paul@prescod.net>
- Date: Thu, 29 Jun 2006 11:38:40 -0700
- To: "Ian Hickson" <ian@hixie.ch>
- Cc: "Tim Berners-Lee" <timbl@w3.org>, noah_mendelsohn@us.ibm.com, www-tag@w3.org, dino@w3.org
- Message-ID: <1cb725390606291138s65d1d5d2k31d809fca9ef3371@mail.gmail.com>
Without making the argument myself, I'll point you to some relevant information:

http://www.w3.org/Provider/Style/URI

We have so much material that we can't keep track of what is out of date and what is confidential and what is valid and so we thought we'd better just turn the whole lot off.

That I can sympathize with - the W3C went through a period like that, when we had to carefully sift archival material for confidentiality before making the archives public. The solution is forethought - make sure you capture with every document its acceptable distribution, its creation date and ideally its expiry date. Keep this metadata.

http://www.nsf.gov/pubs/1998/nsf9814/nsf9814.htm

Looking at this one, the "pubs/1998" header is going to give any future archive service a good clue that the old 1998 document classification scheme is in progress. Though in 2098 the document numbers might look different, I can imagine this URI still being valid, and the NSF or whatever carries on the archive not being at all embarrassed about it.

...

So what should I do? Designing URIs

It is the duty of a Webmaster to allocate URIs which you will be able to stand by in 2 years, in 20 years, in 200 years. This needs thought, and organization, and commitment.

URIs change when there is some information in them which changes. It is critical how you design them. (What, design a URI? I have to design URIs? Yes, you have to think about it.) Designing mostly means leaving information out.

The creation date of the document - the date the URI is issued - is one thing which will not change. It is very useful for separating requests which use a new system from those which use an old system. That is one thing with which it is good to start a URI. If a document is in any way dated, even though it will be of interest for generations, then the date is a good starter.

What to leave out

Everything! After the creation date, putting any information in the name is asking for trouble one way or another.
Topics and Classification by subject

I'll go into this danger in more detail as it is one of the more difficult things to avoid. Typically, topics end up in URIs when you classify your documents according to a breakdown of the work you are doing. That breakdown will change. Names for areas will change. At W3C we wanted to change "MarkUp" to "Markup" and then to "HTML" to reflect the actual content of the section. Also, beware that this is often a flat name space. In 100 years are you sure you won't want to reuse anything? We wanted to reuse "History" and "Stylesheets" for example in our short life.

...

Effectively, when you use a topic name in a URI you are binding yourself to some classification. You may in the future prefer a different one. Then, the URI will be liable to break.

A reason for using a topic area as part of the URI is that responsibility for sub-parts of a URI space is typically delegated, and then you need a name for the organizational body - the subdivision or group or whatever - which has responsibility for that sub-space. This is binding your URIs to the organizational structure. It is typically safe only when protected by a date further up the URI (to the left of it): 1998/pics can be taken to mean for your server "what we meant in 1998 by *pics*", rather than "what in 1998 we did with what we now refer to as *pics*."

...

Conclusion

Keeping URIs so that they will still be around in 2, 20 or 200 or even 2000 years is clearly not as simple as it sounds. However, all over the Web, webmasters are making decisions which will make it really difficult for themselves in the future. Often, this is because they are using tools whose task is seen as to present the best site in the moment, and no one has evaluated what will happen to the links when things change. The message here is, however, that many, many things can change and your URIs can and should stay the same. They only can if you think about how you design them.
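
To make the "Designing URIs" advice quoted above concrete (start with the creation date, leave everything else out), here is a minimal Python sketch. The function name mint_path and the local_id parameter are hypothetical and do not come from the quoted document; the sketch only assumes that a path of the form /YEAR/name, like the 1998/pics example, is what is wanted.

```python
from datetime import date


def mint_path(created: date, local_id: str) -> str:
    """Build a URI path that leads with the creation year.

    Hypothetical helper: the year is fixed when the URI is issued, and
    local_id is a single short segment, so no topic hierarchy or
    organizational structure ends up in the path.
    """
    if not local_id or "/" in local_id:
        raise ValueError("local_id must be one non-empty path segment")
    return f"/{created.year}/{local_id}"


# The "1998/pics" case from the quoted text: the name is read as
# "what we meant in 1998 by pics".
print(mint_path(date(1998, 6, 1), "pics"))
```

The choice here is that the only information the function accepts besides the name is the creation date, which is the one thing the quoted text says will not change.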
Received on Thursday, 29 June 2006 18:38:57 UTC