RE: Nature: A call for a public gene Wiki from Phillip Lord on 2006-02-09 (public-semweb-lifesci@w3.org from February 2006)

From: Phillip Lord <Phillip.Lord@newcastle.ac.uk>
Date: Thu, 9 Feb 2006 17:15:35 -0000
To: <matt@biomedcentral.com>, <public-semweb-lifesci@w3.org>
Message-ID: <6942EE35B530F84EAD432959F5E4DAB501876F25@largo.campus.ncl.ac.uk>

matt@biomedcentral.com wrote:
> In terms of preserving the bits:
>
> Just as, if you believe that the only way to preserve BioMed
> Central's content is to scratch it in a shrinking spiral onto 2"
> nickel disks (as recommended here
> http://www.longnow.org/projects/conferences/10klibrary/ ), then
> thanks to the Creative Commons license nothing stops you from doing
> so. Same for Wikipedia.     
> 

It's the identifiers that are the problem; Wikipedia only offers
URLs, and these don't change when the version changes. If Wikipedia
offered DOIs, for example, for its pages, then these could update
automatically.

As you know, versioning is one of the biggest problems in the life
sciences. Records change constantly, and things go out of date,
and you can't tell that it has happened. LSIDs made an attempt to
solve this problem; the version part of the LSID is the only part
of the identifier which carries semantics.

I really don't want to try and hack something on top of a source
which doesn't support version descriptions upfront. 

> 
> 
> In terms of preserving the links:
> 
>> Links can break, or refer to other things than they started out.
> 
> 
> Yes indeed - that is surely in the very nature of an evolving
> ontology. But a core aspect of any semantically enhanced version of
> wikipedia (or other semantic wiki project) would certainly be how to
> manage the various different possible forms of aliasing which would
> be part of managing this evolution. e.g. right now, within Wikipedia,
> the primary form of aliasing is that where there were, say, 3
> previous entries that have converted into a single entry, there is a
> redirection/rewrite in place       
> e.g.
> http://en.wikipedia.org/wiki/Einstein
> actually takes you to
> http://en.wikipedia.org/wiki/Albert_Einstein
> 
> This aliasing mechanism would need to get subtler and more structured
> to convey the relevant history of the concepts concerned. 

Agreed. This is very blunt at the moment. Actually, there is another
common form of link. Take this URL. 

http://en.wikipedia.org/wiki/Snare

I wrote this page about 5 years ago. When it was created, it referred
to the musical usage; I had several incoming links to it. They were
all fixed, as it happens, so the creation of a disambiguation page
didn't hurt. But this is only possible because Wikipedia is a closed
world, so it's possible to identify the internal links.

> And of course, at any point in time, any organisation may choose to
> snapshot wikipedia and comb through the relevant to create a
> trustworthy version, backed by their imprimatur.  

archive.org probably already does this for us. Again, though,
snapshotting by crawling just seems a bad way to go.

There are ways around this; if you created a wiki which released all
changes as reverse diffs, say using RSS (hey, I knew I could get this
back clearly to semantic web technologies!), then by storing the RSS
stream, you could recreate the contents of the website at any point in
time. If you then updated a version number any time anything changed,
the problem might be solved.
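
To make the reverse-diff idea concrete, here is a minimal Python
sketch. It is an illustration under assumptions, not a description of
how Wikipedia or any existing wiki works: the class VersionedPage, its
"feed" list (standing in for the stored RSS stream), and the per-page
version counter are all hypothetical names invented for this example.
Each edit stores only what is needed to undo it, and an old version is
rebuilt by walking back from the latest text.

```python
import difflib


def reverse_diff(old_lines, new_lines):
    """Return the edits that turn new_lines back into old_lines."""
    matcher = difflib.SequenceMatcher(a=new_lines, b=old_lines, autojunk=False)
    # Keep only the changed regions, plus the old text needed to restore them.
    return [
        (i1, i2, old_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_reverse_diff(new_lines, diff):
    """Apply a reverse diff to new_lines, yielding the previous version."""
    result = list(new_lines)
    # Work from the end of the page so earlier indices stay valid.
    for i1, i2, old_chunk in reversed(diff):
        result[i1:i2] = old_chunk
    return result


class VersionedPage:
    """Hypothetical page that keeps the latest text plus a reverse-diff feed."""

    def __init__(self, lines):
        self.current = list(lines)
        self.version = 1
        self.feed = []  # feed[k] undoes the change from version k+1 to k+2

    def edit(self, new_lines):
        # Publish the reverse diff, then bump the version number.
        self.feed.append(reverse_diff(self.current, new_lines))
        self.current = list(new_lines)
        self.version += 1

    def at_version(self, version):
        """Rebuild the page as it stood at the given version."""
        if not 1 <= version <= self.version:
            raise ValueError(f"no such version: {version}")
        lines = list(self.current)
        # Undo the most recent changes first until the target is reached.
        for diff in reversed(self.feed[version - 1:]):
            lines = apply_reverse_diff(lines, diff)
        return lines


page = VersionedPage(["Snare: a drum."])
page.edit(["Snare: a drum.", "Also a kind of trap."])
page.edit(["Snare may refer to:", "a drum", "a trap"])
first_version = page.at_version(1)
```

Anyone who has archived the feed and the latest text can recover any
earlier version without crawling the site.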

So....

http://en.wikipedia.org/wiki/Snare

would point to all (or any) versions of the page, while

http://en.wikipedia.org/wiki/Snare/7674

would point specifically to that page as it stood at version 7674.
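
A resolver for such URLs could be sketched as follows. This is only
an illustration of the proposed scheme: the function name
parse_versioned_url is hypothetical, and the sketch assumes a trailing
all-digit path segment is always a version number, which would clash
with any page whose own title ends in "/<digits>".

```python
from urllib.parse import unquote, urlsplit


def parse_versioned_url(url, prefix="/wiki/"):
    """Split a wiki URL into (page title, version or None).

    A trailing numeric path segment is read as the version; without
    one, the URL refers to all (or any) versions of the page.
    """
    path = urlsplit(url).path
    if not path.startswith(prefix):
        raise ValueError(f"not a wiki page URL: {url}")
    rest = path[len(prefix):]
    title, sep, tail = rest.rpartition("/")
    if sep and tail.isascii() and tail.isdigit():
        return unquote(title), int(tail)
    return unquote(rest), None


unversioned = parse_versioned_url("http://en.wikipedia.org/wiki/Snare")
versioned = parse_versioned_url("http://en.wikipedia.org/wiki/Snare/7674")
```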

All of this would be easy to do. But it would be much easier to do if
the original data source supported it.

Phil

Received on Thursday, 9 February 2006 17:15:58 UTC