Bioinformatics for Undergraduates: Development of a virtual laboratory

G. Reid Bishop
Millsaps College
Department of Chemistry


I. Background: Rationale for Overall Project

The biochemical sciences have recently undergone a revolutionary change from being focused on understanding individual reactions and systems to the exploration of many reactions and systems at once. This "high-throughput" approach to science has resulted in the establishment of databases containing enormous amounts of information in the form of gene and protein sequences and structures. Unfortunately, traditional chemistry and biology curricula do not suitably address the statistical and computational tools and techniques used for the analysis of biochemical databases such as those based at the National Institutes of Health (NIH) and available on-line at www.ncbi.nlm.nih.gov. In short, there is a critical and immediate need for students trained in the newly emerging sciences of chemical and biological informatics.

I propose to complete the development of an upper-level chemistry/biochemistry laboratory course in Chemical and Biological Informatics to be taught at Millsaps College. The lecture component of the course has already been designed, but the course is not complete without a hands-on laboratory component. Thus, the final stage of the course's development is the creation of a virtual computer laboratory that addresses the molecular modeling, structure analysis, and sequence analysis techniques conveyed during the lecture portion of the course. I intend to use funding from ACS to attend training sessions on the various bioinformatic tools available on-line at the NIH and on the chemical and biological informatics software package, Molecular Operating Environment (MOE).

The MOE software is distributed by Chemical Computing Group (CCG), Inc. (www.chemcomp.com). MOE is a user-friendly yet robust chemical and bioinformatics software package that is controllable and expandable through its simple scripting and applications programming language, SVL. I have one year of experience with the software and am currently supervising undergraduate research students in its use. MOE readily performs advanced molecular modeling routines. The real power of the program, and what makes it unique as a teaching tool, is the simple-to-use SVL programming environment, which allows students to interact directly with the mathematical routines involved in the molecular and bioinformatics calculations. By working with the MOE software through SVL, students will learn how the program actually performs calculations and, hence, will gain an increased understanding of the theories introduced during the lecture portion of the course.

II. Description: Part of the Project to Be Done Under ACS Funding

My specific intention is to develop 10 laboratory modules and several teaching modules using MOE during the summer of 2002. To accomplish this, I will need to attend a workshop on programming in MOE using the SVL programming environment. I will have the opportunity to bring my curriculum to the workshop and work directly with a computational engineer to develop the specific laboratory modules. I also plan to attend a workshop sponsored by NIH to learn more about the various tools available for accessing and analyzing their databases. The ACS award will be used to pay for travel, lodging, and workshop expenses to attend the MOE and NIH bioinformatics events.

III. Timeline: Deliverables/Milestones for ACS Funded Part of Project

My timeline for the project is to have the modules completed during the summer of 2002 for implementation of the course during the Fall or Spring semester of the 2002-2003 academic year. My intention is to utilize the funds during the early part of the summer for workshops and then to complete the modules during the latter half of the summer at Millsaps College.

I have already designed one module in which students are given the sequence of an "unknown" protein. Students will then compare the sequence to other sequences at www.ncbi.nlm.nih.gov using the BLAST and PSI-BLAST programs available at the same location. This enables the students to search for and identify other proteins of similar sequence and structure. Next, they will use MOE to perform a multiple sequence alignment and a protein structure prediction using the homology modeling routines built into MOE. The SVL programming language will then be used to show how changing parameters in the mathematical routines results in different computed protein structures. This will demonstrate how computational results can vary with human intervention and will remove the mystery associated with molecular computations.
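The point that computed results depend on user-chosen parameters can be illustrated outside MOE with a minimal sketch. The Python code below is not SVL and is not MOE's alignment routine; it is a textbook Needleman-Wunsch global alignment written only for illustration. The function name, the scoring values, and the two short toy sequences are all assumptions introduced here, not part of the planned module.

```python
# Illustrative stand-in (assumption): a textbook Needleman-Wunsch global
# alignment, used to show that the gap penalty changes the computed result.
# This is NOT MOE's algorithm and NOT SVL code.

def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(a), len(b)

    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Substitution score for residues a[i-1] and b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical toy sequences; real laboratory work would use BLAST hits.
seq_a, seq_b = "HEAGAWGHEE", "PAWHEAE"
for gap_penalty in (-1, -8):
    s, top, bottom = global_align(seq_a, seq_b, gap=gap_penalty)
    print(f"gap={gap_penalty}: score={s}")
    print(top)
    print(bottom)
```

Running the loop with a mild and a severe gap penalty prints two alignments of the same pair of sequences, so students can compare directly how one parameter choice alters the outcome.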

IV. Technology: Technical Requirements for the Project

All of the software needed to build the modules, including an educational network license for MOE from CCG, Inc., is currently running on existing computers that are readily available to students. MOE runs quite well on Pentium II class IBM-compatible PCs, which are plentiful at Millsaps College. Students will be able to access the software from our network via a high-speed T3 line. Also, a version of the MOE software for Unix-based computers, such as the Silicon Graphics computers in the Chemistry Department, will be available to introduce the students to working with Unix-based operating systems. Many computers are available for accessing the NIH databases via the World Wide Web.

V. Other Support: Institutional and/or Outside Support for Project

One benefit of using MOE in the classroom is that it is available free of charge to educational institutions such as Millsaps College when used for educational purposes. Currently the software is in use by several prominent drug discovery companies, research universities, and government agencies. Thus, the students will have an exceptional opportunity to interact with a high-level software package already in use by the biotechnology industry. I expect that this experience will make students who take the course and graduate from Millsaps College attractive prospects for biotech companies and graduate programs.

Over the past year, the scientists at CCG have provided me with a great amount of support in the use of MOE, which I utilize in my own research. The people at CCG have even been gracious enough to grant me permission to publish results obtained by students in educational and research grade journals, provided the work is performed in a classroom setting by undergraduate students. Thus, the course currently being designed will be approached as a research-based exploration of chemical and biological informatics in which students explore both charted and uncharted territory in the laboratory using the skills they learn during lecture and demonstrations. Millsaps College has provided several powerful computers, including two Silicon Graphics computers, for the establishment of a bioinformatics computer laboratory. The College has also given computer and network support and has provided me with ample space on the computer servers for housing MOE and other bioinformatic tools, as well as a chemical and bioinformatics web-site to publicly present the students' work. This web-site is currently under construction.

Students will also utilize the publicly available bioinformatic tools, sequences and structures available from the National Institutes of Health at www.ncbi.nlm.nih.gov. To facilitate the portions of the laboratory requiring extensive use of the databases and informatics tools at the NIH, I also plan to attend a Bioinformatics Workshop sponsored by the National Center for Biotechnology Information (NCBI) located at the NIH.

VI. Learning Outcomes: How the Project Will Enhance Teaching/Learning

This project will provide a unique learning opportunity for students since it exposes them to a state-of-the-art science. The laboratory component will greatly enhance the lecture component since chemical and biological informatics are best understood by actually performing the calculations instead of merely discussing them as theories in the classroom. Students will also be required to present their results to the class and to other ACS institutions via a web-page to which each participant must contribute.

VII. Curriculum: How the Project Will Be Integrated into the Curriculum

The laboratory and teaching modules will be integrated into an upper-level chemistry/biochemistry course currently under development. I am planning to implement the course, Chemical and Biological Informatics, during the Fall or Spring semester of the 2002-2003 academic year. The implementation of this lab-based course will help to complete the Chemistry Department's requirements for an American Chemical Society certified major in Biochemistry.

VIII. Assessment: How the Project Will Be Evaluated

As with all courses taught at Millsaps College, students will be given an opportunity to formally evaluate and critique both the instructor and the course itself. I also plan to administer regular surveys to determine which aspects of the course need modification. The content of the course is currently being evaluated by Dr. Anton Hopfinger, a colleague and an expert in chemical informatics at the University of Illinois at Chicago.

IX. Dissemination: How the Project Will Be Shared with ACS Colleagues

The completed laboratory modules will be shared via a web-site describing the use of MOE in the bioinformatics classroom. The web-site will be maintained by students taking the course. I also plan to write a laboratory manual covering the MOE routines used in conjunction with the NIH databases and tools. I fully intend to make this manual and all laboratory modules available to other ACS colleagues should they decide to create a similar course. I will also volunteer to give presentations in the science departments of other ACS schools once the course has been taught at Millsaps.