- From: Peter Ansell <ansell.peter@gmail.com>
- Date: Mon, 22 Oct 2007 09:44:03 +1000
- To: public-semweb-lifesci@w3.org
- Cc: p.roe@qut.edu.au, j.hogan@qut.edu.au
Hi all,

I have been using the Bio2RDF markup system, and personally I do not see what all the fuss is about. There must be something to it, though, so here are my opinions, based solely on the requirements document:

http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/URI_Best_Practices/Recommendations/Requirements

# For our own resources, what URIs to mint and what contracts to adhere to regarding well-definedness and documentation

Publicly retrievable metadata for one's personally produced or published information (if not the data as well) should be available using URIs matched to one's institution or organisation, with relevant owl:sameAs and rdfs:seeAlso tags to specify their relationships to other known URIs.

Advantages: One does not need to negotiate with the original author in order to augment their definition, and people who actually want to know things have clear, unambiguous ways of getting to their goal. It follows the process by which knowledge is developed, i.e., someone comes up with an idea and develops it themselves, with citations to outside publications. In the case that their following published information

Disadvantages: SPARQL queries are not simple, but I use programmatic-level access and retrieve sameAs items through code, which then abstracts queries to use all known identifiers when querying. People don't actually want to write SPARQL queries themselves; they are biologists or doctors who just want to click a button and have it work for them. Whether the program does one query or three is basically inconsequential to them.

# What particular URIs to use for resources related to public databases (esp. database records) (>4 proposals on table)

Admittedly this is an issue, but so far I like being able to have the best of LSID and http: URIs with the Bio2RDF markup schemata. Simple text URIs not matching is inconsequential if one has metadata identifying two URIs as identical.
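The sameAs expansion described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used with Bio2RDF: the function names `expand_identifiers` and `build_query` and the example URIs are hypothetical, owl:sameAs is treated as symmetric and transitive, and the VALUES clause assumes an endpoint that supports SPARQL 1.1.

```python
from collections import defaultdict, deque


def expand_identifiers(start, same_as_pairs):
    """Return every URI linked to `start` through owl:sameAs statements.

    The pairs are walked as an undirected graph, so the result is the
    whole group of URIs declared equivalent to `start` (including itself).
    """
    neighbours = defaultdict(set)
    for left, right in same_as_pairs:
        neighbours[left].add(right)
        neighbours[right].add(left)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbours[current] - seen:
            seen.add(other)
            queue.append(other)
    return seen


def build_query(uris):
    """Build a single SPARQL query covering all equivalent URIs at once."""
    values = " ".join(f"<{uri}>" for uri in sorted(uris))
    return (
        "SELECT ?s ?p ?o WHERE { "
        f"VALUES ?s {{ {values} }} "
        "?s ?p ?o }"
    )


if __name__ == "__main__":
    # Hypothetical owl:sameAs statements; none of these URIs are real records.
    same_as = [
        ("http://bio2rdf.org/example:1", "urn:lsid:example.org:ns:1"),
        ("urn:lsid:example.org:ns:1", "http://example.org/record/1"),
    ]
    uris = expand_identifiers("http://bio2rdf.org/example:1", same_as)
    print(build_query(uris))
```

With something like this in place, the person clicking the button never sees whether one identifier or three were queried; the mismatch in URI text is absorbed by the metadata.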
* What entity is responsible for choosing and maintaining these URIs

What is wrong with a simple scheme like the one "bio2rdf.org" uses? With my local "myBio2Rdf" installation I populate my database from the original supplier. Why do the metadata records need to be preprocessed and maintained by another entity? What is the difference between their scheme and any other, apart from prejudice against a particular opening identifier, which people can translate and use without relying on the actual organisation to exist anyway?

# How to get stuff

Personally, I would stick with HTTP GET here.

* How to use a URI to get metadata (RDF) about an identified resource

I have no problem with getting metadata using the explicit URI object reference and then having to follow another URL to find the actual data. That is pretty much the way things work in society: you find the identifying information before you find the data, so when you find the data you know what you were looking for and that you actually wanted to expend resources to get it. I.e., I would never follow the following URLs until I had verified that http://bio2rdf.org/identifier described what I wanted to know:

http://bio2rdf.org/data/identifier
http://bio2rdf.org/html/identifier
http://bio2rdf.org/image/identifier

Here one knows what html and image mean for one's goal as basic information types.

* How to use a URI to retrieve the bits of an information resource

Not sure what the difficulties are here. I spent a week making up a perfectly good browser page for Bio2RDF information using my local database, which assumed that the browser already knew how to follow HTTP standards... and it works so far.

Essentially, given all of that, I have an adaptable system which utilises what I see as the best of the distributed semantic web (Web 3.0) with personal touches (Web 2.0).

What would change if people all decided, for instance, to use only LSID and to deprecate http:// URIs?
Essentially, I could continue with my personal methods, as LSID is already included in my RDF data.

What would change if people decided to access data by default with object references instead of metadata? Bio2RDF already allows for this within itself (i.e., http://bio2rdf.org#rdfdata), although it is designed with what I see as a more intuitive metadata-by-default approach.

Is there any other change that would break my way of doing things? And does everyone need to decide on one standard, as opposed to utilising common elements well enough to combine them?

Personally, I do not like the idea of anonymous elements, i.e. bnodes, in RDF describing realistic scientific or medical data, but that is a minor issue, I guess.

Peter

PhD student
Faculty of Information Technology
Queensland University of Technology
Brisbane, Australia
Received on Tuesday, 23 October 2007 02:36:04 UTC