- From: Chris Baker <denguehost@gmail.com>
- Date: Fri, 2 Jul 2010 17:31:16 -0300
- To: Christopher Baker <bakerc@unb.ca>
- Message-ID: <AANLkTikaIBnG9g3hOMrhOscGISvNPqIYz93njr4AwST_@mail.gmail.com>
ECCB 2010 Workshop: Annotation, interpretation and management of Mutations (AIMM2010) - Call for papers.
====================================================================================

This year's workshop will showcase the state of the art in extraction and reuse of genotype-phenotype information. Annotation of mutations with their impact on phenotypic expression is crucial to understanding the genetic mechanisms involved in phenotypic processes and ultimately in complex diseases. Managing this knowledge is key to generating novel hypotheses. Despite the existence of literature and databases describing the impacts of mutations, association studies fail to deliver linkage to phenotypes, which is the most important contemporary research interest. Extraction of such information from the scientific literature is a promising research field, and existing solutions are ready to be deployed as services and as semantic web services.

Keynote Speakers:
Michael Schroeder - Professor, BIOTEC, Technical University Dresden, DE.
Joost Schymkowitz - Professor, VIB Switch Laboratory, Vrije Universiteit Brussel.

Submissions:
We invite short papers (3000 words / 8 pages) and demonstrations on the following topics:
* Issues related to storage and representation of mutation information, including traditional databases, RDF triple stores, semantic knowledgebases and mutation ontologies.
* NLP tools and systems for recognition and grounding of entities related to mutations and their annotations, including mutation impacts and mutation grounding, as well as evaluations of these NLP tools and systems.
* Systems for mutation impact prediction, reusing existing mutation databases and text-extracted data.
* Bioinformatics data integration, discoverable semantic web services and workflows, and semantic assistants for mutation annotation integration.

Submissions can be made through the EasyChair submission page: https://www.easychair.org/account/signin.cgi?conf=aimm2010
Submission guidelines can be found on our website: http://www.unbsj.ca/sase/csas/data/aimm2010/

Organizers:
Christopher J.O. Baker Ph.D., Associate Professor / Innovatia Research Chair, Department of Computer Science and Applied Statistics, University of New Brunswick, Saint John, Canada. Email: bakerc at unb.ca
Dietrich Rebholz-Schuhmann MD, Ph.D., Research Group Leader, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom. Email: Rebholz at ebi.ac.uk
René Witte Dr.-Ing., Assistant Professor, Group Leader, Semantic Software Lab, Concordia University, Department of Computer Science and Software Engineering, Montreal, Canada. Email: rwitte at cse.concordia.ca

Important Deadlines:
Abstract submission: July 20
Full paper submission: July 26
Acceptance notification: August 12
Final manuscript submission: September 2
AIMM2010 workshop: September 26

Venue: ECCB2010 @ Ghent, Belgium. http://www.eccb2010.org/

Session Topics: http://www.unbsj.ca/sase/csas/data/aimm2010/

=====================================================
Mutation Databases and Metadata: Design, Content, Accuracy
-----------------------------------------------------
Over 400 mutation databases have been produced in the past (as determined via a Google search). Many are no longer maintained, and many cover very specific data sets. Taken together, these repositories have been designed to support a wide range of features, including listings of SNPs, point mutations, insertions, deletions, and observed phenotypes. Furthermore, they incorporate a wide range of modified protein features and metrics in the annotations accompanying the mutation descriptions. In the main these databases are manually curated; however, mutation annotations are frequently inaccurate, e.g. in the PDB, where they are inaccurate in 40 % of all records.
In addition to assessing content and coverage issues, this session will explore issues related to the storage and representation of mutation information, showcasing a spectrum of mutation repository types, from traditional databases to RDF triple stores, semantic knowledgebases and mutation ontologies.

=======================================================
Extraction of mutations and annotations from literature
-------------------------------------------------------
AI techniques such as text mining and natural language processing have been used in BioNLP to enable the extraction and grounding of named entities (mutations, proteins, organisms) and impact annotations (protein properties, direction and scale of impact) from the mutation literature, with high levels of precision and recall, albeit at prototype scale. To facilitate their adoption it is necessary to measure their accuracy in the recreation and update of existing mutation databases, as well as in their incorporation into semi-manual annotation pipelines - the next milestone. In addition, there is continuing discussion over the appropriate metrics for individual tasks within these systems, which requires community involvement. This emergent technology now needs standardization. For the workshop we solicit presentations, posters and demos of NLP tools, evaluations of mutation pipelines and mutation ontology population, and we invite suggestions for a database reconstruction challenge to illustrate state-of-the-art performance.

=====================================================
Impacts of Mutations: Prediction and Bootstrapping
-----------------------------------------------------
The ability to predict the impact of a mutation or the consequence of a sequence variant is central to the diagnosis of genetic diseases. Non-synonymous mutations may impact translational regulation, mRNA stability, mRNA splicing and rates of translation.
Proteins affected by nsSNPs may have altered catalytic sites, stability, ability to aggregate, and/or post-translational modifications. Moving from SNP to sequence to structure and function has been addressed with varying degrees of accuracy by sequence-based and structure-based (molecular mechanism, empirical energy function or machine learning) methods. Applying such techniques at genome scale requires that robust approaches be identified and benchmarked with standard metrics in order to assign valid significance to ns mutations. Reuse of existing mutation databases and text-extracted data for training prediction algorithms and checking the quality of predictions is pivotal.

=====================================================
Mutation Data Integration and Reuse
-----------------------------------------------------
Our current infrastructures and techniques for knowledge translation are insufficient for scientists to make rapid advances in our understanding of living systems. Hypothesis generation based on the reuse of extracted information and in-silico predictions remains a distant capability for most scientists. Furthermore, building the insights derived from mutational studies into robust models of a specific biological domain also seems far off. A multi-level approach to biology must be accompanied by integrated infrastructures built from a diverse toolset. Integration with information from different systems will require the adoption of rich metadata for semantic knowledge integration, such as that provided by existing phenotype ontologies and by ontologies specific to impacts, sequence rearrangements and in vitro methodologies to construct mutants. For the integration of bioinformatics data, discoverable semantic web services and workflows for mutation integration are emerging paradigms, and this session will host examples of reusable mutation extraction and data integration workflows.
Semantic assistant clients that facilitate real-time integration of mutation annotations into desktop applications, e.g. when browsing PubMed abstracts, will also be showcased.

--
Christopher J. O. Baker Ph. D.
Associate Professor
Dept. Computer Science and Applied Statistics
University of New Brunswick, Canada
http://ca.linkedin.com/in/christopherjobaker
Received on Friday, 2 July 2010 20:31:53 UTC