Re: BioRDF Telcon

Here are some comments on
http://bio2rdf.org/JSPWiki/Wiki.jsp?page=BanffManifesto :

Rule #1 - normalized and dereferenceable. What we need is to agree on a
single URI for each resource, so that we have the best chance possible
of getting matches when joining one data source with another.  Which
URI do you propose that people use in the RDF that they write, the URL
or the URN?
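
To make the join concern concrete, here is a minimal Python sketch. The
two tiny "data sources", the predicate names, and every URI in it are
hypothetical, invented for illustration; they are not real identifiers
from your server or ours. It only assumes that a join compares subject
URIs as plain strings.

```python
# Hypothetical triples (subject, predicate, object) from two data sources.
# All URIs and values are invented for illustration.
source_a = [
    ("http://example.org/geneid:15275", "label", "gene A"),
    ("http://example.org/geneid:12345", "label", "gene B"),
]

# Second source written with the same URL form as source_a.
source_b_same_uri = [
    ("http://example.org/geneid:15275", "locatedOn", "some chromosome"),
]

# Same second source, but written with a URN form for the same resource.
source_b_urn = [
    ("urn:example:geneid:15275", "locatedOn", "some chromosome"),
]


def join_on_subject(left, right):
    """Pair up triples whose subject URIs are string-identical."""
    by_subject = {}
    for s, p, o in right:
        by_subject.setdefault(s, []).append((p, o))
    return [
        (s, p, o, p2, o2)
        for s, p, o in left
        for p2, o2 in by_subject.get(s, [])
    ]


matches_same = join_on_subject(source_a, source_b_same_uri)
matches_urn = join_on_subject(source_a, source_b_urn)

print(len(matches_same))  # the shared URL form produces a match
print(len(matches_urn))   # the URN form produces none
```

Both versions of the second source describe the same thing, yet only the
one that reuses the exact URI joins with the first.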

If all dereferencing always goes through your server, aren't you
concerned about server load and user expectations of availability?
And shouldn't users of your URIs be concerned about permanence - what
if your project folds or is bought and your domain name gets
repurposed?

Rule #2 - namespace names - I believe a list of standard namespace
abbreviations is being developed by another group.  For the
neurocommons PURLs we took abbreviations from a bootleg copy of this
list. The last thing we'd want to do is invent our own.

Alternate names may be useful for locating information, but they are
terrible for data integration (joins).  Stick to a single name for
each source so that joins are more likely to hit.
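
Here is a rough Python sketch of the difference. The alias table, the
namespace spellings, and the records are all hypothetical; the point is
only that alternate names must be mapped to one canonical name before a
join can hit.

```python
# Hypothetical alias table: every alternate namespace name maps to a
# single canonical name. These spellings are invented for illustration.
CANONICAL_NAMESPACE = {
    "geneid": "geneid",
    "entrezgene": "geneid",
    "ncbigene": "geneid",
}


def canonical_key(namespace, local_id):
    """Return a (namespace, id) key that uses the canonical namespace name."""
    ns = namespace.strip().lower()
    if ns not in CANONICAL_NAMESPACE:
        raise KeyError(f"unknown namespace: {namespace!r}")
    return (CANONICAL_NAMESPACE[ns], local_id)


# Two sources that name the same record under different namespace names.
records_a = {("EntrezGene", "15275"): "record from source A"}
records_b = {("geneid", "15275"): "record from source B"}

# Joining on the names as written finds nothing in common.
raw_hits = set(records_a) & set(records_b)

# Joining after mapping to the canonical name finds the shared record.
clean_a = {canonical_key(*key): value for key, value in records_a.items()}
clean_b = {canonical_key(*key): value for key, value in records_b.items()}
clean_hits = set(clean_a) & set(clean_b)

print(raw_hits)
print(clean_hits)
```

The alias table is fine as a lookup aid, but if each source had used the
one name to begin with, no cleanup step would be needed.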

How do you define "authoritative"? Who is the authority in this situation?

Rule #3, predicates - some resources are documents, some aren't. The
rdf:type applies to any resource, but dc:identifier and dc:title are
only intended for documents. So while rdf:type applies to the resource
(the chemical), I would expect dc:title would apply to the resource's
metadata document (the one you show as an example). (Yes, I know that
the DC spec places no restrictions on the domains of the properties,
but it seems odd to me to say that water has a title.)
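
A small Python sketch of the separation I have in mind, using plain
tuples rather than any RDF library. The URIs, the ex: class names, and
the ex:describes predicate are hypothetical, chosen only to illustrate
the idea.

```python
RDF_TYPE = "rdf:type"
DC_TITLE = "dc:title"

# Hypothetical URIs: one for the chemical, one for the document about it.
water = "http://example.org/chemical/water"
water_doc = "http://example.org/doc/water"

triples = [
    (water, RDF_TYPE, "ex:Chemical"),           # the resource itself
    (water_doc, RDF_TYPE, "ex:Document"),       # its metadata document
    (water_doc, DC_TITLE, "Record for water"),  # the title goes on the document
    (water_doc, "ex:describes", water),         # link from document to resource
]


def subjects_with(predicate):
    """Return the set of subjects that carry the given predicate."""
    return {s for s, p, o in triples if p == predicate}


titled = subjects_with(DC_TITLE)
typed = subjects_with(RDF_TYPE)

print(titled)  # only the document has a title
print(typed)   # both the chemical and the document have a type
```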

I don't see how dc:identifier, which is so underspecified, could be
used by anyone for any purpose.

Rule #4 - x prefix - I don't understand the reasoning here. And of
course you want to encourage use of predicates in existing ontologies
whenever possible, right? So these x's ought to be rare.

Rule #5 - no blank nodes - are you saying that RDF coming from your
server will not use blank node notation, or that you don't want anyone
else to write RDF that uses blank nodes? What is the rationale?

The abbreviation "BM" will cause most speakers of English to chuckle.
It is a term sometimes used in a particular way with small children.
You might want to pick something else.

I enjoyed the 'creeps' page (another word that has pungent
connotations in English), and am interested to hear your arguments in
favor of making these lists of redundant names even longer by adding
your names. That is, why not just pick a canonical name from among
those available, and provide a metadata server for resolving the names
and cleansing the metadata records? I know why I want to do this, but
I would be interested in hearing your reasons.

Best
Jonathan

Received on Tuesday, 10 July 2007 15:53:52 UTC