- From: <dlrubin@stanford.edu>
- Date: Mon, 16 Jul 2007 08:54:42 -0700
- To: "Balaji S. Srinivasan" <balajis@stanford.edu>
- Cc: public-semweb-lifesci <public-semweb-lifesci@w3.org>
Just to remind everyone--NCBO is planning on providing URIs for entities
in the breadth of biomedical ontologies it hosts at
http://bioportal.bioontology.org. This group has previously given us a
good set of functional requirements, and over the next few months we
will be implementing them.

Daniel

___
Daniel Rubin, MD, MS
Clinical Asst. Professor, Radiology
Research Scientist, Stanford Medical Informatics
Scientific Director, National Center of Biomedical Ontology
MSOB X-215
Stanford, CA 94305
650-725-5693

Quoting "Balaji S. Srinivasan" <balajis@stanford.edu>:

> Hi,
>
>> WSDL is a widely accepted W3C spec that is becoming increasingly
>> accepted worldwide (and is, generally, automatically generated based on
>> your interface, so requires little or no manual construction), and
>> which solves a problem that we *know without any doubt* URLs cannot
>> solve.
>
> I may be mistaken, but isn't WSDL just an XML format? I don't see how
> it solves a problem that URLs "cannot solve"...wouldn't the location of
> "foo.wsdl" be best specified as a URL?
>
>> in fact, they [WSDL] are currently MORE POPULAR than RDF itself,
>> according to Google Trends
>
> But the appropriate comparison is to URLs, not RDF...and the advantage
> of a URL is that there's tons of widely deployed, lightweight
> technology for requesting data from a given URL (e.g. w/ a browser as
> well as Perl/Python/etc. libraries) and for setting up web servers
> (e.g. Apache).
>
> I don't understand why it should be necessary to develop a parallel set
> of technologies (e.g. the Firefox LSID plugin, or HTTP proxies) for
> resolving LSIDs, particularly when most (all?) of these tools seem to
> be built on top of tools (such as Firefox) which can already do URL
> resolution without downloading anything.
>
> It would seem to me that the best way to get a reliable set of
> canonical URIs is to get NCBI involved. As soon as NCBI published a set
> of canonical URIs (e.g.
> for genes in Entrez Gene, compounds in Pubchem, etc.) then everyone
> could use them with confidence. Reasons:
>
> 1) NCBI identifiers (even more so than EBI) are the de facto standard
>    and can be mapped to anything.
> 2) NCBI is well funded, has serious bandwidth, etc.
> 3) NCBI can be trusted to stick around for a long time and to
>    maintain/redirect old URLs, unlike a research lab or most companies.
> 4) In terms of registering new URIs, NCBI is already a standard
>    location for data submissions (w/ NCBI GEO, GAIN, etc.).
> 5) People already use NCBI to get other kinds of data, so getting RDF
>    data from them is not a serious paradigm shift.
>
> Perhaps there's someone from NCBI on the list; if not, it would be
> worthwhile to contact them. If NCBI adopted the standard that
> beta.uniprot.org is using, with different suffixes for different
> formats (as per Eric Jain's email):
>
>> http://beta.uniprot.org/uniprot/P12345
>> http://beta.uniprot.org/uniprot/P12345.xml
>> http://beta.uniprot.org/uniprot/P12345.rdf
>> http://beta.uniprot.org/uniprot/P12345.fasta
>
> ...then I think people would adopt it immediately, especially if they
> kept it on their front page for a month (like they do with other new
> services). Regarding the way UniProt is doing things, I think it was a
> particularly good design decision to have the de-facto suffix be HTML,
> so that you can get a sense of what the URI represents by looking at it
> in a browser.
>
> Also, from Matthias' recent email:
>
>> You should not try to pack ANY information about the 'resolution' of
>> a Semantic Web resource into its URI, quite to the contrary. Make it as
>> meaningless and generic as possible, in the best case it should just be
>> a large random alphanumeric string, e.g. tag:uri:a938fjhsdcHSDu39. If
>> all URIs look like this, nobody will be deterred from re-using a URI
>> just because of how it looks.
>
> I don't know if this is such a good idea -- when debugging, you want to
> have some information about what the URIs represent (e.g. the
> "http://beta.uniprot.org/uniprot/" prefix tells you that you're looking
> at a UniProt protein with the given ID number). If URIs are just
> alphanumeric strings, you need to constantly be doing lookups to remind
> yourself of what a particular object means.
>
> --B
>
> --
> Balaji S. Srinivasan, Ph.D.
> Stanford University
> Lecturer, Depts. of Statistics and Computer Science
> 318 Campus Drive, Clark Center S251
> (650) 380-0695
> balajis@stanford.edu
> http://jinome.stanford.edu
>
> On Jul 14, 2007, at 10:30 PM, Mark Wilkinson wrote:
>
>> Well... I apologize in advance, but I'm going to be *insultingly*
>> blunt because I'm quite honestly losing interest in this seemingly
>> pre-destined discussion...
>>
>> "blinkers, are a piece of equipment used on a horse's face that
>> restrict the horse's vision. They usually compose of leather or
>> plastic cups that are places on either side of the eye, so that the
>> horse can not see to his sides. Many racehorse trainers believe
>> this keeps the horse focused on what is in front of him,
>> encouraging him to pay attention to the race rather than other
>> distractions, such as crowds" (http://en.wikipedia.org/wiki/Blinders)
>>
>> WSDL is a widely accepted W3C spec that is becoming increasingly
>> accepted worldwide (and is, generally, automatically generated
>> based on your interface, so requires little or no manual
>> construction), and which solves a problem that we *know without any
>> doubt* URLs cannot solve.
>> I really don't see an advantage in
>> trying to ignore them, circumvent them, or otherwise relegate them
>> to a secondary lookup, in the base spec for the Semantic Web, when
>> we know that we are going to have to deal with them at some point
>> (and in fact, they are currently MORE POPULAR than RDF itself,
>> according to Google Trends:
>> http://www.google.com/trends?q=WSDL%2C+RDF&ctab=0&geo=all&date=all&sort=0).
>>
>> I really don't see the point in trying to build the Semantic Web by
>> specifically avoiding acknowledgement of one of the most popular
>> trends on the Web, when we already know that the vast majority of
>> information we need to access as bioinformaticians is available
>> through web forms or web services!
>>
>> I'm sorry for being rude and disrespectful - I'm honestly quite
>> embarrassed to be saying these things so harshly - but I think
>> this discussion has started to become a singularity around a
>> pre-contrived end-point, rather than a discussion of what the Web
>> (and the Semantic Web) really is/can be!
>>
>> WSDL -1 if you wish, but that puts you in opposition to the
>> majority of the world, where WSDL (thanks to Ajax) is finally
>> starting to make its mark!
>>
>> Again, I apologize for being disrespectful and rude... it really
>> isn't personal and I feel truly awful about writing this so
>> harshly! I'm just losing patience with a discussion that doesn't
>> seem to be a discussion, but rather a shoe-horn into a pre-destined
>> end point.
>>
>> You are all free to crucify me the next time one of my grants comes
>> to you for review ;-)
>>
>> M
>>
>> On Fri, 13 Jul 2007 20:19:41 -0700, Alan Ruttenberg
>> <alanruttenberg@gmail.com> wrote:
>>
>>> On Jul 13, 2007, at 12:20 AM, Mark Wilkinson wrote:
>>>
>>>>>> What worries me about the 303 solution (other than that we are
>>>>>> not using it for its primary purpose [1]) is that the
>>>>>> redirection can only be to a *single* resource, specified in
>>>>>> the Location header.
>>>>
>>>>> On Thu, 12 Jul 2007 03:57:34 -0700, Jonathan Rees
>>>>> <jonathan.rees@gmail.com> wrote:
>>>>> If this is an important functionality then it can be provided in a
>>>>> variety of ways - a mere matter of programming. LSID resolver happens
>>>>> to be the only way that comes ready made. But the functionality
>>>>> doesn't need to be tied to the use of LSIDs.
>>>>
>>>> If there is an alternative solution that provides the same
>>>> functionality, and that can be applied universally to all
>>>> existing URIs (URLs), then I'm all for it! To be honest, this is
>>>> my *primary* objection to moving to a URL solution vs an LSID
>>>> solution... if you can solve that problem, then I am *almost* in
>>>> the URL camp.
>>>
>>> Here is an alternative:
>>>
>>> Problem statement:
>>>
>>> Enable third parties to register the fact that they have
>>> additional statements to provide about something that a URI
>>> denotes, in such a way as to make it easy for anyone to discover
>>> this fact. Do this in a way which requires minimal coordination
>>> (ideally none) between the minter of the original URI, the
>>> provider of the additional statements, and the consumer of all the
>>> statements.
>>>
>>> Solution:
>>>
>>> For a given URI http://a.b/c/d/e, construct a new URI
>>> http://purl.org/about/a.b/c/d/e
>>>
>>> Configure the purl server so that
>>> http://purl.org/provide-about/a.b/c/d/e redirects to something
>>> akin to a structured wiki page or a REST service (let us assume
>>> for the moment that whoever currently provides the LSID WSDL that
>>> contains this information is the provider of this service).
>>>
>>> This page may be edited (manually or programmatically) to include
>>> a description (suitable for a machine to understand) of how to
>>> access the resource and what sort of resource it is, and perhaps
>>> some additional useful information (what predicates does the
>>> resource provide). This information is rendered as RDF using a
>>> standard vocabulary and saved.
>>>
>>> Configure the purl server so that http://purl.org/about/a.b/c/d/e
>>> retrieves the RDF that was constructed (or a 404 if there is
>>> none). Semantic web agents then interpret this RDF and go fetch
>>> what they want or need.
>>>
>>> We all agree that 303s redirect to a human readable html document,
>>> that this document uses a REL link to an RDF document that says
>>> what the provider wishes to say and that the RDF also states that
>>> http://purl.org/about/a.b/c/d/e may have more information.
>>> (suitable shortcuts are provided to make bulk retrievals more
>>> efficient - we've already discussed such mechanisms)
>>>
>>> This can be done now, with effort analogous to what is being done
>>> with LSIDs. Let me point out some obvious advantages:
>>> 1) No requirement to use web services (though web services *could*
>>>    be described as ways of accessing further statements using this
>>>    scheme)
>>> 2) Requires *less* manual intervention than is currently
>>>    required to maintain the WSDL.
>>> 3) Re-uses purl, which is based on HTTP, which everyone knows how
>>>    to use already
>>> 4) Makes clear that the description of these additional resources
>>>    for statements are to be in RDF, and requires that one
>>>    advertises what to expect if you go to the resource (will you
>>>    get an RDF document, a SPARQL endpoint, a Web service set of
>>>    methods?)
>>>
>>> ---
>>>
>>> With a bit more effort expended on extending the purl server code
>>> we can get some more leverage - we enhance it so that retrieving
>>> http://purl.org/about/a.b/c/d/e actually merges the RDF result of
>>> retrieving each of
>>> http://purl.org/about*/a.b/
>>> http://purl.org/about*/a.b/c
>>> http://purl.org/about*/a.b/c/d
>>> http://purl.org/about/a.b/c/d/e
>>>
>>> Where the about* top level domain indicates that the information
>>> about covers all URIs that start with the indicated path.
>>>
>>> In this way different providers can note that they have additional
>>> statements about URIs located in varying amounts of namespace.
>>>
>>> With some coordination among us, we could even decide to dedicate
>>> a server to hosting the whole mess of this information (I don't
>>> expect that it needs too large a resource) so as to make the
>>> service more efficient in answering queries, and making it easy to
>>> provide, to whoever wishes, a snapshot that they can host
>>> themselves.
>>>
>>> ---
>>>
>>> May I now count you among those *almost* in the URL camp? ;-)
>>>
>>> -Alan
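
As a rough illustration of the URI mapping in Alan's quoted proposal, the sketch below builds the http://purl.org/about/... URI for a given URI and lists the about*/ prefix URIs whose RDF the extended purl server would merge. The function names are hypothetical, nothing here contacts a real purl server, and dropping query strings and fragments is an assumption that the proposal itself does not address.

```python
from typing import List
from urllib.parse import urlsplit

# Host named in the quoted proposal; no such service is assumed to exist.
PURL_BASE = "http://purl.org"


def about_uri(uri: str) -> str:
    """Map http://a.b/c/d/e to http://purl.org/about/a.b/c/d/e."""
    parts = urlsplit(uri)
    # Assumption: query string and fragment are ignored.
    return f"{PURL_BASE}/about/{parts.netloc}{parts.path}"


def about_lookup_uris(uri: str) -> List[str]:
    """Return the about*/ prefix URIs followed by the exact about/ URI."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    uris = []
    # One about*/ URI per proper prefix of the path, shortest first.
    for i in range(len(segments)):
        if i == 0:
            tail = parts.netloc + "/"  # host-only prefix, as in the email
        else:
            tail = parts.netloc + "/" + "/".join(segments[:i])
        uris.append(f"{PURL_BASE}/about*/{tail}")
    # The exact match lives under about/ rather than about*/.
    uris.append(about_uri(uri))
    return uris


if __name__ == "__main__":
    for u in about_lookup_uris("http://a.b/c/d/e"):
        print(u)
```

A client following the proposal would fetch each URI in that list, skip any that return 404, and merge the RDF it gets back; that retrieval and merge step is left out here because the proposal does not fix a vocabulary or a server implementation.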
Received on Monday, 16 July 2007 15:54:52 UTC