- From: Michel Dumontier <michel.dumontier@gmail.com>
- Date: Fri, 20 Mar 2009 09:55:20 -0400
- To: bio2rdf@googlegroups.com
- Cc: w3c semweb hcls <public-semweb-lifesci@w3.org>, "public-lod@w3.org" <public-lod@w3.org>, Paul Roe <p.roe@qut.edu.au>, James Hogan <j.hogan@qut.edu.au>, Lawrence Buckingham <l.buckingham@qut.edu.au>
- Message-ID: <c8edab680903200655v7acc369co4763d44cf062009a@mail.gmail.com>
Hi Peter - Great work!

I have a question - why are there so many namespaces for these resources:

>
> * DBpedia - dbpedia, dbpedia_property, dbpedia_class
> * LinkedCT - linkedct_ontology, linkedct_intervention,
> linkedct_trials, linkedct_collabagency, linkedct_condition,
> linkedct_link, linkedct_location, linkedct_overall_official,
> linkedct_oversight, linkedct_primary_outcomes, linkedct_reference,
> linkedct_results_reference, linkedct_secondary_outcomes,
> linkedct_arm_group
> * Dailymed - dailymed_ontology, dailymed_drugs,
> dailymed_inactiveingredient, dailymed_routeofadministration,
> dailymed_organization
> * DrugBank - drugbank_ontology, drugbank_druginteractions,
> drugbank_drugs, drugbank_enzymes, drugbank_drugtype,
> drugbank_drugcategory, drugbank_dosageforms, drugbank_targets
> * Diseasome - diseasome_ontology, diseasome_diseases, diseasome_genes,
> diseasome_chromosomallocation, diseasome_diseaseclass
> * Neurocommons - Uses the equivalent Bio2RDF namespaces, with live
> owl:sameAs links back to the relevant Neurocommons namespaces. Used
> for pubmed, geneid, taxonomy, mesh, prosite and go so far
> * Flyted/Flybase etc not converted yet, only direct access provided
>
>
> Provide live owl:sameAs references which match those used in SPARQL
> queries to keep linkages to the original databases without leaving the
> database:identifier paradigm, so if people know the DBpedia, etc.,
> URIs, the link to their current knowledge is given
>
> * Some http://database.bio2rdf.org/database:identifier URIs are given
> by this, but these aren't standard, and are only shown where there is
> still at least one SPARQL endpoint available which uses them. People
> should utilise the http://bio2rdf.org/database:identifier versions
> when linking to Bio2RDF.
>
> Integrated Semantic Web Pipes (pipes.deri.org) (version 0.7) so the
> pipes runtime engine can be utilised on the same server as bio2rdf.
> The main servers have a limited number of pipes available so far, but
> more can be included by people wishing to contribute their pipes. The
> URL syntax is /pipes/PIPEID/parameter1=value1/parameter2=value2 . This
> provides a method for people wanting to utilise complex mashup
> scenarios and provide them back to the community, as by default the
> bio2rdf engine only knows how to do simple integration of RDF sources
> into a single output document
>
> The two currently available pipes are:
> * /pipes/bio2rdf_basic/database=DATABASE/identifier=IDENTIFIER Mirrors
> /database:identifier functionality
> *
> /pipes/bio2rdf_subject_object_slicing/database=DATABASE/identifier=IDENTIFIER
> Combines /database:identifier and /links/database:identifier
> functionality into one operation
>

I didn't know about DERI pipes - looks fantastic! Thanks!

>
> Namespace synonyms can be implemented, with the first example that of
> taxon and taxonomy for NCBI taxonomy as so far there hasn't been a
> clear bias towards one or the other, and together with interlinked
> owl:sameAs statements the synonyms will provide resolution to a
> standard URI no matter which one is provided in the URI.
>
> * http://bio2rdf.org/taxon:identifier will return information in the
> form http://bio2rdf.org/taxonomy:identifier currently, with an
> owl:sameAs link back to the taxon version. This can be switched if
> people in general prefer the taxon version as the default, although in
> general this is an issue still as it is difficult to make up SPARQL
> queries outside of the Bio2RDF server for these heterogeneous sources
>

ok, which other sources are providing NCBI taxonomy info? and what namespace prefix do they use?

>
> Provide live statistics to diagnose some network issues without having
> to look at log files.
> The URL is /admin/stats
>
> * Shows the last time the internal blacklist reset, indicating how
> much activity is being displayed as the statistics are reset every time
> the blacklist is reset.
> * By default shows the IPs accessing the server, with an indication
> of the number and duration of their queries. Can be configured in low
> use and private situations to also show the queries being performed
> * Shows the servers which have been unresponsive since the last
> blacklist reset including a basic reason, such as an HTTP 503 or 400
> error
>
> Implement true RDF handling in the background to provide consistency
> of output and the potential to support multiple output formats such as
> NTriples and Turtle, although the only output currently supported is
> RDF/XML. The Sesame library is being used to provide this
> functionality.
>
> Provide more RDFiser scripts as part of the source distribution,
> including Chebi, GO, Homologene, NCBI Geneid, HGNC, OBO and Ecocyc
>
> Provide more links to HTML provider URLs for given databases to
> provide the link between the Bio2RDF RDF interface and currently
> available HTML interfaces. The URL syntax for this is
> /html/database:identifier
>
> Provide links to license providers, so the applicable license for a
> database may be available by following a URL. The URL syntax for this
> is /license/database:identifier . It was easier to require the
> identifier to be present than to not have it. So far, the identifier
> portion is not being used, so it merely has to be present for the URL
> resolution to occur, but in future there is the allowance to have
> different licenses being given based on the identifier, which is
> useful for databases which are not completely released under a single
> license.
>
> Provide countlinks and countlinksns which count the number of reverse
> links to a particular item globally, or from within a given
> database.
> Currently these only function on Virtuoso endpoints due to
> their use of aggregation extensions to SPARQL. The URL syntax is
> /countlinks/database:identifier and
> /countlinksns/targetdatabase/database:identifier
>
> Provide search and searchns, which attempt to search globally using
> SPARQL (aren't currently linked to the rdfiser search pages which may
> be accessed using searchns), or search within a particular database
> for text searches. The searches are all performed using the Virtuoso
> fulltext search paradigm, i.e. bif:contains, and other SPARQL endpoints
> haven't yet been implemented even with regex because it is reasonably
> slow but it would be simple to construct a query if people thought it
> was necessary. The URL syntax is /search/searchTerm and
> /searchns/targetdatabase:searchTerm
>
> If anyone has any SPARQL queries on biology related databases that
> they regularly execute that can either be parameterised or turned into
> Pipes then it would be great to include them in future distributions
> for others to use.
>

absolutely!

-=Michel=-

>
> Cheers,
>
> Peter Ansell
>
> [1] https://sourceforge.net/project/platformdownload.php?group_id=142631

--
Michel Dumontier
Assistant Professor of Bioinformatics
http://dumontierlab.com
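
For anyone following the thread, the URL patterns quoted above can be collected into a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical and not part of Bio2RDF, the base URI is the http://bio2rdf.org form recommended in the quoted message, and no URL escaping is applied because the message does not say how special characters should be handled. The identifier 9606 (the NCBI taxonomy ID for human) is used purely as an example value.

```python
# Hypothetical helpers that assemble the URL patterns described in the
# quoted message. Nothing here is fetched; the functions only build strings.

BASE = "http://bio2rdf.org"  # base URI form recommended in the message


def resource(database, identifier, service=None):
    """Build /database:identifier, optionally under a service prefix
    such as 'links', 'html', 'license' or 'countlinks'."""
    path = f"{database}:{identifier}"
    if service:
        path = f"{service}/{path}"
    return f"{BASE}/{path}"


def countlinksns(target_database, database, identifier):
    """Build /countlinksns/targetdatabase/database:identifier."""
    return f"{BASE}/countlinksns/{target_database}/{database}:{identifier}"


def pipe(pipe_id, **params):
    """Build /pipes/PIPEID/parameter1=value1/parameter2=value2.
    Parameters become path segments in the order they are passed."""
    segments = "/".join(f"{name}={value}" for name, value in params.items())
    return f"{BASE}/pipes/{pipe_id}/{segments}"


def search(term, target_database=None):
    """Build /search/searchTerm, or /searchns/targetdatabase:searchTerm
    when a target database is given."""
    if target_database:
        return f"{BASE}/searchns/{target_database}:{term}"
    return f"{BASE}/search/{term}"


if __name__ == "__main__":
    print(resource("taxonomy", "9606"))
    print(resource("taxonomy", "9606", service="license"))
    print(pipe("bio2rdf_basic", database="taxonomy", identifier="9606"))
    print(search("kinase", target_database="geneid"))
```

Passing the service as an optional prefix reflects the fact that /links, /html, /license and /countlinks all wrap the same database:identifier form, whereas countlinksns and the pipes use a different path layout and so get their own helpers.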
Received on Friday, 20 March 2009 13:55:58 UTC