Bioinformatics for Undergraduates: Development
of a Virtual Laboratory
G. Reid Bishop
Millsaps College
Department of Chemistry
I. Background: Rationale for Overall Project
The biochemical sciences have recently undergone a revolutionary change from
being focused on understanding individual reactions and systems to the exploration
of many reactions and systems at once. This "high-throughput" approach
to science has resulted in the establishment of databases containing enormous
amounts of information in the form of gene and protein sequences and structures.
Unfortunately, traditional chemistry and biology curricula do not suitably address
the statistical and computational tools and techniques used for the analysis
of biochemical databases such as those based at the National Institutes of Health
(NIH) and available on-line at www.ncbi.nlm.nih.gov.
In short, there is a critical and immediate need for students trained in the
newly emerging sciences of chemical and biological informatics.
I propose to complete the development of an upper-level chemistry/biochemistry
laboratory course in Chemical and Biological Informatics to be taught at Millsaps
College. The lecture component of the course has already been designed but the
course is not complete without a hands-on laboratory component. Thus, the final
stage of the course's development is the creation of a virtual computer laboratory
designed to address the concepts of molecular modeling and structure and sequence
analysis techniques conveyed during the lecture portion of the course. I intend
to utilize funding from ACS to attend training sessions on the use of the various
bioinformatic tools available on-line at the NIH and on the use of the chemical
and biological informatics software package, Molecular Operating Environment
(MOE).
The MOE software is distributed by Chemical Computing Group (CCG), Inc.
(www.chemcomp.com).
MOE is a user-friendly and yet robust chemical and bioinformatics software package
that is controllable and expandable through the simple scripting and applications
programming language, SVL. I have one year of experience with the software and
currently am supervising undergraduate research students in its use. MOE readily
performs advanced molecular modeling routines. The real power of the program,
and what makes it unique as a teaching tool, is the simple-to-use SVL programming
environment, which allows students to interact directly with the mathematical
routines involved in the molecular and bioinformatics calculations. By working
with the MOE software through SVL, students will learn how the program actually
performs calculations and, hence, will gain an increased understanding of the
theories introduced during the lecture portion of the course.
II. Description: Part of the Project to Be Done Under ACS Funding
My specific intention is to develop 10 laboratory modules and several teaching
modules using MOE during the summer of 2002. To accomplish this, I will need
to attend a workshop on programming in MOE using the SVL programming environment.
I will have the opportunity to bring my curriculum to the workshop and work
directly with a computational engineer towards developing the specific laboratory
modules. I also plan to attend a workshop sponsored by NIH to learn more about
the various tools available for accessing and analyzing their databases. The
award from ACS will be used to pay for travel, lodging, and workshop expenses
to attend the MOE and NIH bioinformatics events.
III. Timeline: Deliverables/Milestones for ACS Funded Part of Project
My timeline for the project is to have the modules completed during the summer
of 2002 for implementation of the course during the Fall or Spring semester
of the 2002-2003 academic year. My intention is to utilize the funds during
the early part of the summer for workshops and then to complete the modules
during the latter half of the summer at Millsaps College.
I have already designed one module in which students are given a protein sequence
of an "unknown" protein. Students will then compare the sequence to other
sequences at www.ncbi.nlm.nih.gov using the BLAST and PSI-BLAST programs
available at the same location. This
enables the students to search for and identify other proteins of similar sequence
and structure. Next, they will utilize MOE to perform a multiple sequence alignment
and a protein structure prediction using the homology modeling routines built
into MOE. The SVL programming language will then be used to show how
changing parameters in the mathematical routines results in different computed
structures of the proteins. This will demonstrate how computational results can
vary with human intervention and will remove the mystery associated with
molecular computations.
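The effect of a single parameter on a computed result can be illustrated with a small sketch. The example below is written in Python rather than SVL and does not use MOE or any of its routines; the function name, the toy sequences, and the scoring values are all assumptions chosen for illustration. It performs a simple global pairwise alignment (the Needleman-Wunsch method with a linear gap penalty) and runs it twice, changing only the gap penalty.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment (Needleman-Wunsch) with a linear gap penalty.

    Illustrative only: scoring values are arbitrary, not MOE defaults.
    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    # Same two toy sequences, two different gap penalties.
    for gap_penalty in (-1, -5):
        s, top, bottom = global_align("ACGTT", "CGTTA", gap=gap_penalty)
        print(f"gap penalty {gap_penalty}: score {s}")
        print("  " + top)
        print("  " + bottom)
```

With the mild gap penalty the routine shifts one sequence against the other and inserts two gaps; with the severe penalty it leaves the sequences unshifted and accepts the mismatches. The input data are identical in both runs, so the difference in the reported alignment comes entirely from the parameter chosen by the user.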
IV. Technology: Technical Requirements for the Project
All of the software needed to build the modules, which includes an educational
network license for MOE from CCG, Inc., is currently running on existing computers
that are readily available to students. MOE runs quite well on Pentium II
class IBM-compatible PCs, which are plentiful at Millsaps College. Students will
have the opportunity to access the software from our network via a high-speed
T3 line. Also, a version of the MOE software for Unix-based computers, such
as the Silicon Graphics computers in the Chemistry Department, will be available
to introduce the students to working with Unix-based operating systems. Many
computers are available for accessing the NIH databases via the World Wide Web.
V. Other Support: Institutional and/or Outside Support for Project
One benefit of using MOE in the classroom is that it is available free
of charge for educational institutions such as Millsaps College when used for
educational purposes. Currently the software is in use by several prominent
drug discovery companies, research universities, and government agencies. Thus,
the students will have an exceptional opportunity to interact with a high-level
software package already in use by biotechnological industries. I expect that
this experience will make students who take the course and graduate from Millsaps
College attractive prospects for biotech companies and graduate programs.
Over the past year, the scientists at CCG have already provided me with a great
amount of support in the use of MOE, which I utilize in my own research. The
people at CCG have even been gracious enough to grant me permission to publish
results obtained by students in educational and research-grade journals, provided
the work is performed in a classroom setting by undergraduate students. Thus,
the course currently being designed will be approached as a
research-based exploration of chemical and biological informatics where students
will explore both charted and uncharted territory in the laboratory using the
skills they learn during lecture and demonstrations. Millsaps College has provided
several powerful computers including two Silicon Graphics Computers for the
establishment of a bioinformatics computer laboratory. The College has also given
computer and network support and has provided me with ample space on the computer
servers for housing MOE and other bioinformatic tools as well as a chemical
and bioinformatics web-site to publicly present the students' work. This web-site
is currently under construction.
Students will also utilize the publicly available bioinformatic tools, sequences
and structures available from the National Institutes of Health at
www.ncbi.nlm.nih.gov. To facilitate the portions of the laboratory requiring
extensive use of the databases and informatics tools at the NIH, I also plan
to attend a Bioinformatics Workshop sponsored by the National Center for Biotechnology
Information (NCBI) located at the NIH.
VI. Learning Outcomes: How the Project Will Enhance Teaching/Learning
This project will provide a unique learning opportunity for students since it
exposes them to a state-of-the-art science. The laboratory component will greatly
enhance the lecture component since chemical and biological informatics are
best understood by actually performing the calculations instead of merely discussing
them as theories in the classroom. Students will also be required to present
their results to the class and other ACS institutions via a web-page that each
participant will be required to contribute to.
VII. Curriculum: How the Project Will Be Integrated into the Curriculum
The laboratory and teaching modules will be integrated into an upper-level chemistry/biochemistry
course currently under development. I am planning to implement the course, Chemical
and Biological Informatics, during the Fall or Spring semester of the 2002-2003
academic year. The implementation of this lab-based course will help
to complete the Chemistry Department's requirements for an American Chemical
Society certified major in Biochemistry.
VIII. Assessment: How the Project Will Be Evaluated
As in all courses taught at Millsaps College, students are given an opportunity
to officially evaluate and critique not only the instructor of a course but
also the course itself. I also plan to take regular surveys to determine what
aspects of the course need modification. The content of the course is currently
being evaluated by Dr. Anton Hopfinger, who is a colleague and an expert in chemical
informatics at the University of Illinois at Chicago.
IX. Dissemination: How the Project Will Be Shared with ACS Colleagues
The developed laboratory modules will be shared via a website created for describing
the use of MOE in the bioinformatics classroom. The web-site will be maintained
by students taking the course. I also plan to write a laboratory manual using
the MOE routines used in conjunction with the NIH databases and tools. I fully
intend to make this manual and all laboratory modules available to other ACS
colleagues should they decide to create a similar course. I will also volunteer
to give presentations in the science departments of other ACS schools once the
course has been taught at Millsaps.