- From: Kei Cheung <kei.cheung@yale.edu>
- Date: Mon, 14 Jun 2010 14:41:21 -0400
- To: HCLS <public-semweb-lifesci@w3.org>
The minutes for today's BioRDF call are available at:
http://esw.w3.org/HCLSIG_BioRDF_Subgroup/Meetings/2010/06-14_Conference_Call
Thanks to Matthias for scribing. Below are some excerpts of the minutes.
I'll be away for the next 5 weeks. Jun has agreed to convene the BioRDF
calls on June 21 and July 19.
Cheers,
-Kei
*****Excerpts Begin*******
kei: i want to give a bit of context. part of the agenda is to have jeff
and stephen give a description of the new NIF sparql endpoints.
... this is related to our broader query federation use-case.
... more recently we also looked at a more specific use case, microarray
data.
... we have looked at some examples of microarray results in the area of
neurological diseases.
... from gene expression data we could also link to other kinds of data,
including imaging.
... let us start with the description of NIF endpoints.
jeff: we can divide it into two types of content: the entities in NIF,
and the properties that are entered by the community.
<slars0n> NIF SPARQL Endpoint:
https://confluence.crbs.ucsd.edu/display/NIF/Sparql+endpoint
jeff: this is available in several ways. first, a SPARQL endpoint.
... second, extracted data from literature, and making it query-able.
this data will also be available through the SPARQL endpoint.
... this content will be available in September.
kei: for the microarray use-case, we have looked at some examples,
such as Alzheimer's disease. Information about different types of
neurons, brain regions etc. would be very helpful for annotation.
kei: you also mentioned the literature aspect. one of the challenges we
encountered was extracting gene lists from papers.
stephen: to get a sense of the basic structure of what we are doing
here: we are going through a loop between an OWL file which contains the
NIF content, and a Semantic MediaWiki, which has every entity in that
ontology rendered as a page.
... an ontology engineer can track the changes in the wiki and update
the OWL ontology.
<slars0n> http://neurolex.org/wiki/Main_Page
stephen: as the OWL file changes, the engineers will update the wiki.
... neurolex is easily accessible through the web browser.
... our goal with the ontology was to be very comprehensive. instead of
linking out, we brought everything in.
... now that semantic web is growing, we are evaluating ways of linking out.
... the SPARQL endpoint i sent before contains a lot of OWL statements
(restrictions etc.)
<slars0n> http://neurolex.org/wiki/SparqlEndPoint
stephen: the SPARQL endpoint i sent just now comes from the Semantic
MediaWiki export.
... this version has less OWL (restrictions etc.) in it.
... the two endpoints are on different servers.
... the ontology endpoint is on a virtuoso server. advantage: can do
transitive queries.
... the performance of transitive queries is good.
scott: did you run rules / pre-inferencing?
stephen: the transitive operation does not require rulesets as far as i
know, you just add it to the query.
... don't know about internals.
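A minimal sketch of what "adding transitivity to the query" means, for
readers of these minutes. It is not the NIF setup: the minutes do not give
the Virtuoso syntax, so the query builder below uses the SPARQL 1.1
property path operator `+` as a stand-in, and the in-memory function shows
the result such a query is expected to compute. The function names and the
example.org URIs are hypothetical.

```python
def transitive_subclass_query(class_uri: str) -> str:
    """Build a SELECT query for all direct and indirect subclasses.

    Assumption: the endpoint understands SPARQL 1.1 property paths.
    The '+' after rdfs:subClassOf means "one or more steps".
    """
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?sub WHERE {\n"
        f"  ?sub rdfs:subClassOf+ <{class_uri}> .\n"
        "}"
    )


def transitive_subclasses(edges, root):
    """Compute the same answer locally from (child, parent) pairs.

    No rule set or pre-inferencing is involved: the closure is
    computed at query time by walking the subclass edges.
    """
    children = {}
    for child, parent in edges:
        children.setdefault(parent, set()).add(child)

    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:  # guards against cycles
                seen.add(child)
                stack.append(child)
    return seen


if __name__ == "__main__":
    # Hypothetical class hierarchy, for illustration only.
    edges = [("ex:B", "ex:A"), ("ex:C", "ex:B"), ("ex:D", "ex:A")]
    print(sorted(transitive_subclasses(edges, "ex:A")))
    print(transitive_subclass_query("http://example.org/A"))
```

The local function is only there to make the semantics concrete; on a real
endpoint the store does this walk itself when it evaluates the path.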
stephen: we used a cloud-based service that lets you do SPARQL
... has well-documented update facilities.
... you can even have a 'history' of updates.
<slars0n> http://n2.talis.com/wiki/Main_Page
stephen: (N2 by Talis)
kei: in HCLS we have two instances of Knowledge Bases: the one at DERI
(based on Virtuoso), one at University of Berlin (based on AllegroGraph).
... we have the endpoints, but users still need to know the detailed graph
structure. it would be helpful to have some high-level metadata that
would help users know what information is contained in an endpoint, what
information can be interrelated between endpoints...
... at the moment we have to develop federated queries at a very low level.
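To illustrate what "federating at a very low level" can look like, here is
a rough sketch, not the group's actual code. It assumes each endpoint
speaks the standard SPARQL protocol over HTTP GET and returns the SPARQL
JSON results format; the function names are hypothetical, and the caller
has to hand-write one query per endpoint and the join between them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def bindings_to_rows(payload):
    """Flatten a SPARQL JSON result document into plain dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_sparql(endpoint_url, query, timeout=30):
    """Send one SELECT query to one endpoint (SPARQL protocol, GET)."""
    url = endpoint_url + "?" + urlencode({"query": query})
    request = Request(url, headers={"Accept": "application/sparql-results+json"})
    with urlopen(request, timeout=timeout) as response:
        return bindings_to_rows(json.load(response))


def federate(queries_by_endpoint, run=run_sparql):
    """Run a hand-written query against each endpoint.

    The caller must already know each endpoint's graph structure;
    nothing here discovers it.
    """
    return {url: run(url, query) for url, query in queries_by_endpoint.items()}


def join_on(left, right, key):
    """Join two result sets on a shared variable name."""
    index = {}
    for row in right:
        if key in row:
            index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row.get(key), []):
            joined.append({**row, **match})
    return joined
```

The `run` parameter is there so the fan-out can be exercised with a stub
instead of a live endpoint. The manual `join_on` step is exactly the part
that higher-level metadata about endpoints would help to automate.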
scott: at the moment we have a few, nice, useful SPARQL endpoints, but
in the future there could be thousands of endpoints to choose from
... the ultimate form of federation would be asking the question at one
place and having it automatically distributed to the right places.
... OWL, SKOS? is it exposed via D2R or SWObjects? Licensing information?
... you also need to know the contents. having very condensed
information about what is contained in the named graph.
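One way to picture the condensed endpoint descriptions discussed here is
a small catalogue record per endpoint, plus a selection step that routes
a question to the endpoints that can answer it. This is a sketch under
stated assumptions: the record fields mirror the questions raised on the
call (vocabulary, how it is exposed, licensing, contents), but the class,
field names, and URLs are hypothetical and do not come from any agreed
schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointDescription:
    """Hypothetical high-level metadata about one SPARQL endpoint."""
    url: str
    vocabularies: frozenset  # e.g. {"OWL", "SKOS"}
    exposed_via: str         # e.g. "D2R", "SWObjects", or a native store
    license: str             # licensing information
    topics: frozenset        # condensed summary of the contents


def select_endpoints(catalogue, required_topics, vocabulary=None):
    """Return the URLs of endpoints covering all required topics.

    If a vocabulary is given, only endpoints that use it are kept.
    """
    required = frozenset(required_topics)
    return [
        description.url
        for description in catalogue
        if required <= description.topics
        and (vocabulary is None or vocabulary in description.vocabularies)
    ]


if __name__ == "__main__":
    # Placeholder entries, for illustration only.
    catalogue = [
        EndpointDescription(
            url="http://example.org/endpoint-a/sparql",
            vocabularies=frozenset({"OWL"}),
            exposed_via="native",
            license="unspecified",
            topics=frozenset({"neurons", "brain regions"}),
        ),
        EndpointDescription(
            url="http://example.org/endpoint-b/sparql",
            vocabularies=frozenset({"SKOS"}),
            exposed_via="D2R",
            license="unspecified",
            topics=frozenset({"gene expression"}),
        ),
    ]
    print(select_endpoints(catalogue, {"neurons"}))
```

With thousands of endpoints, a selection step of this kind would run
before any query is sent, so that the question only goes to the places
that can plausibly answer it.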
jeff: we are extracting data from tables, we have a curator working on that.
... e.g., how up- and down-regulation is represented. we use a mixture
of automated tools and manual curation.
... the tables usually come from HTML/PDF version of papers. sometimes
also from supplemental material.
scott: another aspect (having spoken to chris stoeckert)... if we take
this not only to MGED, but also to the publishers, and try to get them to
have researchers submit gene lists, that would solve this problem in the
future.
kei: the NIF ontology will also be deposited in NCBO BioPortal
... BioPortal has its own SPARQL endpoint, too
... will there be redundancy? which endpoints / URIs will I use?
jeff: Neurolex is the 'working draft', before it goes through the
rigours of ontology engineering.
... NCBO is a community place.
scott: i suppose that some of the data released in september will also
contain the data that was annotated
jeff: yes
scott: you could also make that data available from NCBO
kei: another topic: gene lists. a number of us have been working on how
to represent gene lists.
... we could look at Neurolex to see which neuroscience terms we can
extract from these endpoints that would be relevant for annotation.
... matthias has also been working with aTags, used NCBO resources.
... we need an iterated process of debugging, based on use-cases
... i will be away, jun will convene some of the calls
stephen: we would be happy to receive feedback, suggestions for links.
scott: one potential use-case would be EHRs, helping clinicians with
certain tasks through integrated information.
*****Excerpts End*******
Kei Cheung wrote:
> This is a reminder that the next BioRDF telcon call will be held at
> 11 am EDT (4 pm CET) on Monday, June 14 (see details below).
>
> Jeff Grethe and Stephen Larson will join the call to talk to us about
> NIF SPARQL endpoints.
>
> Cheers,
>
> -Kei
>
>
> == Conference Details ==
> * Date of Call: Monday, June 14, 2010
> * Time of Call: 11:00 am Eastern Time (4 pm CET)
> * Dial-In #: +1.617.761.6200 (Cambridge, MA)
> * Dial-In #: +33.4.89.06.34.99 (Nice, France)
> * Dial-In #: +44.117.370.6152 (Bristol, UK)
> * Participant Access Code: 4257 ("HCLS")
> * IRC Channel: irc.w3.org port 6665 channel #HCLS (see W3C IRC page for
> details, or see Web IRC), Quick Start: Use
> http://www.mibbit.com/chat/?server=irc.w3.org:6665&channel=%23hcls for
> IRC access.
> * Duration: ~1 hour
> * Frequency: bi-weekly
> * Convener: Kei
> * Scribe: to-be-determined
>
> ==Agenda==
> * Introduction (Kei)
> * NIF SPARQL endpoints (Jeff, Stephen)
> * Gene list RDF representation (Lena, Satya, Jun, Scott)
>
Received on Monday, 14 June 2010 18:42:07 UTC