- From: Reinhard Schneider <reinhard.schneider@embl.de>
- Date: Thu, 18 Jun 2009 16:52:27 +0200
- To: public-widgets-pag@w3.org
- Message-Id: <7842D80E-FAFE-417B-AFC0-767CD0AB6DAF@embl.de>
Dear PAG-Team,

Please find attached some documents that describe an automated sequence analysis system for biological DNA and protein sequences. The system had an automatic update procedure (db_update) for the underlying databases. It performed the updates automatically and triggered a range of follow-up actions such as reformatting, indexing and updating dependent system tools.

The first publication of the system appeared in 1994 (see reference below). I also attach a more detailed description of the system (genequiz.html) and a kind of help file for the update script itself (update.rtf). This software later became part of a commercial package, and a further development of it is still in use.

I hope this is of some help; feel free to request more information.

Best regards,
Reinhard Schneider

___________________________________________
Dr. Reinhard Schneider
Data Integration and Knowledge Management Group
European Molecular Biology Laboratory (EMBL)
69117 Heidelberg
Meyerhofstr. 1
Germany
http://schneider-www.embl.de

GeneQuiz: a workbench for sequence analysis.
Scharf, M., Schneider, R., Casari, G., Bork, P., Valencia, A., Ouzounis, C., Sander, C.
Protein Design Group, European Molecular Biology Laboratory, Heidelberg, Germany.
Proceedings. International Conference on Intelligent Systems for Molecular Biology (ISMB), Volume 2, 1994, Pages 348-353

Abstract

We present the prototype of a software system, called GeneQuiz, for large-scale biological sequence analysis. The system was designed to meet the needs that arise in computational sequence analysis and our past experience with the analysis of 171 protein sequences of yeast chromosome III. We explain the cognitive challenges associated with this particular research activity and present our model of the sequence analysis process.
The prototype system consists of two parts: (i) the database update and search system (driven by perl programs and rdb, a simple relational database engine also written in perl) and (ii) the visualization and browsing system (developed under C++/ET++). The principal design requirement for the first part was the complete automation of all repetitive actions: database updates, efficient sequence similarity searches and sampling of results in a uniform fashion. The user is then presented with "hit-lists" that summarize the results from heterogeneous database searches. The expert's primary task now simply becomes the further analysis of the candidate entries, where the problem is to extract adequate information about functional characteristics of the query protein rapidly. This second task is tremendously accelerated by a simple combination of the heterogeneous output into uniform relational tables and the provision of browsing mechanisms that give access to database records, sequence entries and alignment views. Indexing of molecular sequence databases provides fast retrieval of individual entries with the use of unique identifiers as well as browsing through databases using pre-existing cross-references. The presentation here covers an overview of the architecture of the system prototype and our experiences on its applicability in sequence analysis. (ABSTRACT TRUNCATED AT 250 WORDS)
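For readers unfamiliar with the idea of a uniform "hit-list", the following is a minimal Python sketch of the general technique the abstract describes: normalizing the output of different search tools into one record type and indexing the result by entry identifier. It is not GeneQuiz code (the original was written in perl on top of rdb). The two input formats, the tool names `tool_a` and `tool_b`, and all function and field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One row of the uniform hit-list table (field names are assumptions)."""
    query: str
    source: str
    entry_id: str
    score: float


def from_tool_a(query, rows):
    # Hypothetical tool A reports (entry_id, score) tuples.
    return [Hit(query, "tool_a", entry_id, float(score)) for entry_id, score in rows]


def from_tool_b(query, lines):
    # Hypothetical tool B reports "score<TAB>entry_id" text lines.
    hits = []
    for line in lines:
        score, entry_id = line.strip().split("\t")
        hits.append(Hit(query, "tool_b", entry_id, float(score)))
    return hits


def build_hit_list(*hit_groups):
    # Merge all groups into one table, best score first.
    # Simplification: real tools produce scores on different scales,
    # so a shared ranking would need normalization first.
    merged = [hit for group in hit_groups for hit in group]
    return sorted(merged, key=lambda hit: hit.score, reverse=True)


def index_by_id(hits):
    # Map each unique entry identifier to all hits that mention it,
    # so an individual entry can be looked up directly.
    index = {}
    for hit in hits:
        index.setdefault(hit.entry_id, []).append(hit)
    return index
```

Once the rows share one shape, a browser only has to understand a single table layout, and the identifier index gives direct access to every hit for a given database entry regardless of which search produced it.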
Attachments
- text/html attachment: stored
- text/html attachment: genequiz.html
- text/html attachment: stored
- text/rtf attachment: update.rtf
- text/html attachment: stored
Received on Monday, 22 June 2009 08:56:07 UTC