
RE: Does the user care about URLs? (another thread from Re: Censorship?)

From: David Pullinger <David.Pullinger@coi.gsi.gov.uk>
Date: Thu, 11 Nov 2010 12:05:06 +0000
To: "washingtona@acm.org" <washingtona@acm.org>, "chris@e-beer.net.au" <chris@e-beer.net.au>, "rachel.flagg@gsa.gov" <rachel.flagg@gsa.gov>, Hugh Barnes <Hugh.Barnes@nehta.gov.au>
CC: W3C e-Gov IG <public-egov-ig@w3.org>
Message-ID: <3C3AD408516E91439CA4846F407EFAA71B6C1960@LONWINVMBX02.coi.local>
Anne,

Nice to hear what you've done at the Library of Congress.  In UK Government I also looked at the use of handles, had a number of discussions and ran some tests.  In the end John Sheridan and I went with the idea of creating a UK Government Website Archive and a redirection tool that takes the reader to the URL in the Archive if the page is no longer active at its original location. This means (theoretically) that there should be no broken links.

The main reason for the decision was the distributed nature of government: we recognised we were more likely to get buy-in from lots of people tweaking their system setup (to include the redirection tool and enable spiders to crawl for the archive) than from trying to get them all to move to handles.
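
A minimal sketch of that fallback logic, in Python. The archive prefix, the function name and the `is_live` check are hypothetical placeholders for illustration, not the actual UK Government tool:

```python
from typing import Callable

# Hypothetical archive prefix, for illustration only.
ARCHIVE_PREFIX = "https://archive.example.gov.uk/"


def resolve(url: str, is_live: Callable[[str], bool]) -> str:
    """Return the original URL if it is still live, else its archived copy."""
    if is_live(url):
        return url
    # Fall back to the archived snapshot instead of serving a broken link.
    return ARCHIVE_PREFIX + url


# Stand-in for a real HTTP check: a set of pages assumed to be live.
live_pages = {"http://www.example.gov.uk/policy"}

print(resolve("http://www.example.gov.uk/policy", live_pages.__contains__))
print(resolve("http://www.example.gov.uk/old-report", live_pages.__contains__))
```

Passing the liveness check in as a function keeps the sketch free of network calls; a real deployment would hook the same decision into each site's 404 handling.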

Regards,

David


David Pullinger
Head of Digital Policy
david.pullinger@coi.gsi.gov.uk
020 7261 8513
07788 872321

-----Original Message-----
From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org] On Behalf Of Anne L. Washington
Sent: 10 November 2010 22:44
To: chris@e-beer.net.au; rachel.flagg@gsa.gov; Hugh Barnes; David Pullinger
Cc: W3C e-Gov IG
Subject: RE: Does the user care about URLs? (another thread from Re: Censorship?)

Rachel, Chris, Hugh,
Just had to toss in on this one.

First, Internet search works in large part because of page-rank... how 
many other pages point to this URL? If that continues to be part of search 
algorithms, government needs to pay attention to its URL structures.

Second, URLs are visible provenance for a web page.

The term provenance is used to describe the ownership, context and source 
of a physical document in an archive. Until there is some other obvious, 
simple way to determine that you've arrived at your destination, URLs will 
continue to be very important. Phishing wouldn't work if it weren't for 
the face validity given to URLs.

There is great backend significance for not only persistent but logical 
URLs. Building a system where every past and future object follows a logical 
location pattern makes a programmer's life much easier.

Now in full disclosure, I developed the persistent URLs for legislation 
when I was at the Library of Congress. The Library uses handles, the 
general technology that supports the DOI system. The handle can resolve to 
any URL. The handle URL is the front-facing URL, leaving the backend system 
free to change URLs and file structures at any time if necessary.

So instead of looking for the wonderful bill Rachel mentioned at
http://www.govtrack.us/congress/billtext.xpd?bill=h111-946

The same bill can be seen, persistently and with some provenance, at
http://hdl.loc.gov/loc.uscongress/legislation.111hr946

There is room for improvement in that syntax.
However, the institution that has long-term responsibility 
for the data is also supporting a long-term pointer to it.
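
The indirection a handle provides can be sketched roughly as follows, in Python. The handle and the hdl.loc.gov prefix are the ones quoted above; the backend locations, the function names and the in-memory table are hypothetical, since a real handle server is a separate resolution service rather than a dictionary:

```python
# Public prefix taken from the example above.
HANDLE_PROXY = "http://hdl.loc.gov/"

# Hypothetical table mapping a handle to wherever the object lives today.
handle_table = {
    "loc.uscongress/legislation.111hr946":
        "http://backend.example.gov/bills/111/hr946.html",
}


def public_url(handle: str) -> str:
    """The citable URL; it depends only on the handle, never on the backend."""
    return HANDLE_PROXY + handle


def resolve_handle(handle: str) -> str:
    """Look up where the handle currently points."""
    return handle_table[handle]


def move_resource(handle: str, new_location: str) -> None:
    """A backend reorganisation changes only the table entry."""
    handle_table[handle] = new_location


handle = "loc.uscongress/legislation.111hr946"
cited = public_url(handle)
move_resource(handle, "http://new-backend.example.gov/legislation/111/hr/946")
print(public_url(handle) == cited)  # the citation is unaffected by the move
print(resolve_handle(handle))
```

The point of the design is that citations and bookmarks depend only on `public_url`, so file structures behind it can change without breaking them.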


Anne L. Washington
Standards work - W3C egov - washingtona@acm.org 
Academic work - George Washington University
http://home.gwu.edu/~annew/

On Wed, 10 Nov 2010, Hugh Barnes wrote:

> [ Sent this a few hours back (0201 UTC), thought I had Mike's problem, 
> but seems I trimmed the recipient addresses too keenly. Take two! ]
>
> ........
>
> Great to see you still around, Rachel :)
>
>> 	Hi Chris - thanks for the thought-provoking questions below about "does the user care?"....
>
> I didn't have a clue about a very large chunk of this thread to this 
> point. Hopefully I've been the only one puzzled by its purpose. Now, 
> thanks to Rachel for pointing it out, I can comment on a small part I'd 
> missed.
>
>> 	URL/URI structure IS important on the back-end, to help us do a better
>> job managing our information.  A good Info Architecture helps you
>> organize, categorize & manage your information, but I don't think end
>> users care about it
>
> I think *most* users don't *think* they care about it, but they probably care about some of the benefits it brings them unnoticed.
>
> Let's talk about persistent, deterministic URIs:
>
> * visited links show as such (as long as they weren't styled out, grr..)
> * caching *should* work
> * users know their bookmarks/links/bibliographic citations will continue to work and point to the same resource
>
> But this isn't about persistence and probably very few here doubt its benefits.
>
> It *is* closely related to other good practices, which include the next list of benefits – namely systematic and descriptive URI patterns:
>
> * on the address bar, URIs provide another hook for discovery (though most browsers will search the history for page titles these days too, which is handy)
> * when shared, descriptive URIs provide strong cues about the content and permit trust (contrast with, e.g. bit.ly)
> * URLs appearing in print make that little bit more sense ("I should look that one up when I get back online")
> * search engines rightly weight them highly, along with important markup elements
> * they can be marketed or read out more easily over the phone, on radio, or on television (pronounceability)
> * they have more chance of being remembered (have you ever seen an interesting URL outside but couldn't make a note of it?)
> * users can, over a slow period of indoctrination :), start to "get" URIs as structured information paths, not as something "techy" beyond their reach
> * conventions like {department}.gov/about or /contact or /publications across government sites, when followed and known, can be hugely beneficial shortcuts (principle of least surprise)
>
> Advanced users (who are not "most" users, but actually an influential group):
>
> * tweak URLs, like version numbers, dates, or query keywords, or adding "/feed"
> * dislike system information in the URI, like "joomla" or "/default.aspx" because they know it'll change when the new CMS is implemented and also because they are less likely to remember those details (if they do it's a waste of space!). (That means they're less likely to link or reference a URI, which in turn affects most users.)
> * will truncate URLs when they get a 404 (etc.) for some reason – we love it when this comes together
>
>> 	They might care about the domain that comes up in search results -- to verify that the site they want to click on is "trusted" -- but as long as people can go to Google or Bing & find what they are looking for, I don't think they care about URI.
>
> Yes, good search engines certainly have changed the bookmark and URI 
> recall landscape, but not reduced them to insignificance.
>
> I sense a disturbing antagonism toward good practice URLs these days 
> based on the premise they don't really mean anything to most people 
> (which is conditionally true). It's often also driven (IMHO) in a very 
> narrowly focussed way by microblogging. There's some truth to it, but 
> it's just not that simple. I would be sad to see a long-held best 
> architectural practice discarded. I think government resources can and 
> should lead in this respect.
>
> Good discussion BTW, Chris, I look forward to seeing if there's more to say.
>
> Cheers
>
> Hugh Barnes
> Technical Interface Specialist
> nehta – National E-Health Transition Authority
> Address:      Level 2, 10 Browning St, West End, QLD, 4101
> Phone:        +61 7 3023 8537
> Mobile:       +61 417 469 552
> Email:        hugh.barnes@nehta.gov.au
> Web:          http://www.nehta.gov.au
>
> They might care about the domain that comes up in search results -- to
> verify that the site they want to click on is "trusted" -- but as long as
> people can go to Google or Bing & find what they are looking for, I don't
> think they care about URI.
>
> On a related note, did you see that the US Govt has passed the Plain
> Writing Act of 2010, requiring US Federal Govt agencies to use “writing
> that is clear, concise, well-organized and follows best practices
> appropriate to the subject or field and intended audience”
>
> Read the text of the Act here:
> http://www.govtrack.us/congress/billtext.xpd?bill=h111-946
>
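
Hugh's point about advanced users truncating a URL after a 404 relies on each parent path being meaningful in its own right. A minimal Python sketch of that truncation, where the function name and the example URL are hypothetical:

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url: str) -> List[str]:
    """List the URLs obtained by trimming one path segment at a time."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Drop the last segment repeatedly until only the site root is left.
    while segments:
        segments.pop()
        path = "/" + "/".join(segments)
        if segments:
            path += "/"
        # Query string and fragment are discarded along with the segment.
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for candidate in parent_urls(
    "http://department.example.gov/publications/2010/report.pdf"
):
    print(candidate)
```

On a site with systematic URIs each of these candidates would return a useful page, such as a year index, a publications index or the home page; on a site that exposes CMS internals they usually would not.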

Received on Thursday, 11 November 2010 12:03:14 GMT
