Report on the
NIST Workshop on a Database for Noncovalent Binding
Held at
The National Institute of Standards and Technology
Gaithersburg, Maryland
August 21-22, 1997
Michael K. Gilson
9/28/97
Introduction
Covalent bonds, bonds in which atoms
share one or more electrons, account for the formation of molecules from atoms.
Molecules, in turn, interact with other molecules via noncovalent interactions,
such as electrostatic attractions, dispersion forces, and solvent-mediated
forces. These noncovalent interactions can cause molecules to bind to each
other in solution with binding energies that can be quite small, or as large as
100 kJ/mol. These binding reactions can be specific; that is, two molecules
that bind each other tightly often do not bind other molecules. Such molecules
will "recognize" and bind each other, even when present in a mixture
of many different chemicals. Specific noncovalent binding, or "molecular
recognition", is fundamental to biology: examples include the recognition
of substrates by enzymes, the action of biological transmitters at their
receptors, the binding and inactivation of foreign proteins by antibodies, and
the inhibition of enzymes by drugs. Specific noncovalent binding is also
important in chemistry. For example, the noncovalent binding of an analyte with
a chromatography column can facilitate its separation from other chemicals that
do not bind the column. More generally, the entire field of supramolecular
chemistry is founded on specific noncovalent binding reactions that allow host
molecules to bind specific guest molecules, and that can lead to the
self-assembly of organized aggregates of molecules with new properties.
It is not surprising, then, that the
discovery or development of molecules that bind specific target molecules is
central to a number of commercially important activities. Examples include
drug design and the design of molecules for chemical separations. Researchers
in these areas benefit from the large number of measurements of binding
affinities that have been published in the scientific literature. However,
these data are scattered among many volumes of many journals, and finding data
of value to a particular project can be difficult and time-consuming.
Collecting existing data and linking it with other data on the same molecules
to create an organized database would make it easier for researchers to find
information on specific molecules, or to identify molecules that meet certain
criteria. Researchers who seek molecules that bind specific targets also use
computational prediction methods. However, the physical chemistry of
noncovalent binding is not completely understood, and there is a need for
improvements to current computational methods. A database of binding
measurements would facilitate the elucidation of the physical basis for
noncovalent binding, and thus the development of predictive models that would
be useful to industry. To date, however, there exists no comprehensive database
that provides easy access to binding affinities and associated data. NIST
recently held a workshop that convened about 25 experts from industry and
academia to discuss the establishment of such a database. The participants are
listed in Appendix A. This document describes the agenda and the conclusions of
that workshop.
Purpose of the Workshop
Although it seemed reasonable that a
public database for noncovalent association would be a useful resource,
establishing and maintaining such a resource is a nontrivial task. There is a
risk of creating a database that would not meet the needs of users, or that
could not be maintained for the long term. In order to maximize the likelihood
of establishing a vigorous, useful project in this area, a workshop was held to
foster discussions among scientists who generate data on noncovalent
association, users of the data, experts in the relevant database and other
technologies, and people interested in developing the database. The workshop
sought to answer the following general questions:
The issues to be discussed were
separated into four categories: Specifications and Content, Technical
Implementation, Management and Maintenance, and Pros and Cons. The chief
questions for each category are listed below; Appendix B provides a more
detailed outline of these topics.
Conclusions of the Workshop
The discussions at the workshop led to
general agreement on a number of conclusions, including the following:
1) The database project is worthwhile
and technically feasible, and a number of the workshop participants are
interested in contributing to the project.
2) The scope of the data should be
restricted (e.g., exclude crystallographic coordinates and bioavailability
data) so that progress can be made quickly. However, the database should be
designed so that new types of data can be added in the future, and so that
links can be made to other databases containing relevant data.
3) The management of the database
should not assume full responsibility for the quality of the data. However,
several measures can be taken to promote the deposition of high-quality data
and to allow for publication of evaluations of the data.
4) A variety of search and query
mechanisms should be supported, and it should be possible for users to download
the database and carry out custom searches on their own computers.
5) A detailed set of specifications
should be developed in advance of implementation. Users must be involved in
the design of the user interface.
6) Progress will be fastest if the
database is built with commercial software.
7) It is hoped that experimentalists
will be willing to deposit new data directly into the database. Once the database
is established, it may be appropriate to carry out literature reviews in order
to gather data from older publications.
Appendix C provides a more
comprehensive list of conclusions, and a few outstanding questions. This list
of conclusions has been reviewed and approved by the participants of the
workshop.
Plans for Development of the Database
The most important technical step now
is to determine precisely what information an entry in the database should
contain. All entries will identify the molecules involved, their binding
affinity, and the experimental conditions. However, further details of the
entry will depend upon the type of measurement that was used. For example, the
data associated with a calorimetry measurement will differ from those associated
with a spectroscopic measurement. Therefore, it is planned to hold a series of
focused workshops that convene experts in the important measurement techniques.
These workshops will seek to generate detailed specifications of the database
entries for each type of measurement.
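As an illustration only, the entry structure described above might be sketched as follows. This is not a specification produced by the workshop: every class name, field name, and unit below is an assumption made for the example, and the sample values are placeholders rather than measured data. The sketch shows a common core (molecules, binding affinity, experimental conditions) that each measurement type extends with its own fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Conditions:
    """Experimental conditions common to all entries (hypothetical fields)."""
    temperature_k: float
    ph: Optional[float] = None
    solvent: str = "water"


@dataclass(frozen=True)
class BindingEntry:
    """Core record: the molecules involved, their affinity, and the conditions."""
    host: str
    guest: str
    delta_g_kj_per_mol: float  # binding free energy; more negative = tighter
    conditions: Conditions


@dataclass(frozen=True)
class CalorimetryEntry(BindingEntry):
    """Adds a field specific to a calorimetric measurement (hypothetical)."""
    delta_h_kj_per_mol: Optional[float] = None


@dataclass(frozen=True)
class SpectroscopyEntry(BindingEntry):
    """Adds a field specific to a spectroscopic measurement (hypothetical)."""
    wavelength_nm: Optional[float] = None


def tighter_than(entries: List[BindingEntry], threshold: float) -> List[BindingEntry]:
    """Return entries whose binding free energy is at or below the threshold."""
    return [e for e in entries if e.delta_g_kj_per_mol <= threshold]


# Placeholder values for illustration only; these are not real measurements.
entries: List[BindingEntry] = [
    CalorimetryEntry("host-A", "guest-B", -25.0,
                     Conditions(298.15, ph=7.0), delta_h_kj_per_mol=-40.0),
    SpectroscopyEntry("host-A", "guest-C", -12.0,
                      Conditions(298.15), wavelength_nm=280.0),
]
```

Because every measurement-specific record extends the same core, a query such as `tighter_than(entries, -20.0)` can run across all entry types without knowing which technique produced each value. The focused workshops would decide which technique-specific fields are actually required.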
Once the data specifications exist,
implementation can begin. This will require the efforts of a database
programmer who will collaborate with the domain experts. A grant proposal will
be prepared for the Biological Databases program of the National Science
Foundation. It is envisioned that this will yield the funds needed to hire the
programmer.
Appendix A: Workshop Participants
(* Member of workshop steering
committee)
Helen M. Berman, Ph.D.
(Bioinformatics; Structural Biology; Nucleic Acid Database)
Dept. of Chemistry
Rutgers U.
P.O. Box 939
Piscataway, NJ 08855-0939
Voice: 908 445-4667
Fax: 908 445-5958
Philip E. Bourne, Ph.D.*
(Bioinformatics; Structural Biology)
San Diego Supercomputer Center
PO Box 85608
San Diego CA 92186-9784
Voice: 619 534-8301
Fax: 619 534-5113
Patrick Brady, Ph.D.
(Computer-Aided Drug-Design; Computational Chemistry)
The DuPont Merck Pharmaceutical Co.
Experimental Station E500-3602A
Wilmington DE 19880-0500
Voice: 302 695-4003
Fax: 302 695-9090
Kenneth J. Breslauer, Ph.D.
(Thermodynamics of Molecular Recognition; DNA-Ligand Interactions)
Dept. of Chemistry
Rutgers U.
P.O. Box 939
Piscataway, NJ 08855-0939
Voice: 908 445-3956
Fax: 908 445-3409
Laurent David, Ph.D.
(Molecular Modeling; Noncovalent Binding)
Center for Advanced Research in Biotechnology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6215
Fax: 301 738-6255
Malcolm E. Davis, Ph.D.
(Computer-Aided Drug-Design; Biophysical Chemistry; Structural Biology)
Dept. of Macromolecular Structure
H23-07
Bristol-Myers Squibb
Pharmaceutical Research Institute
P.O.Box 4000
Princeton, NJ 08543-4000
Voice: 609 252-4324
Fax: 609 252-6030
Robert S. DeWitte, Ph.D.
(Computer-Aided Drug-Design; Chemi-informatics)
Dept. of Chemistry and Chemical Biology
Harvard U.
12 Oxford St.
Cambridge MA 02138
Voice: 617 496-4368
Fax: 617 496-5948
Gary L. Gilliland, Ph.D.
(Biotechnology; Structural Biology; Standards; Databases)
Biotechnology Division
National Institute of Standards and Technology
Chemistry Building, Rm. A345
Gaithersburg, MD 20899-0001
Voice: 301 975-2629
Fax: 301 330-3447
www.carb.nist.gov/carb/gilliland.html
Michael K. Gilson, Ph.D., M.D.*
(Molecular Modeling; Noncovalent Binding; Bioinformatics)
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6217
Fax: 301 738-6255
www.carb.nist.gov/carb/gilson.html
Martha Head, Ph.D.
(Computational Chemistry; Noncovalent Association; Software Design)
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6104
Fax: 301 738-6255
C. Nicholas Hodge, Ph.D.*
(Computer-Aided Drug-Design; Organic Chemistry)
The DuPont Merck Pharmaceutical Co.
P.O. Box 80500
Wilmington DE 19880-0500
Voice: 302 695-3698
Fax: 302 695-9090
Rui Luo
(Molecular Modeling; Noncovalent Binding)
Center for Advanced Research in Biotechnology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6108
Fax: 301 738-6255
Brock A. Luty, Ph.D.
(Computer-Aided Molecular Design; Computational Chemistry)
Agouron Pharmaceuticals, Inc.
3301 N. Torrey Pines Court
La Jolla, CA 92037
Voice: 619 535-0853
Fax: 619 678-8244
Irwin D. Kuntz, Ph.D.
(Computer-Aided Drug-Design; DOCK)
Dept. Pharmaceutical Chemistry
U. California San Francisco
513 Parnassus Ave. Box 0446
San Francisco, CA 94143-0446
Voice: 415 476-1937
Fax: 415 476-0688 (preferred over email)
Otto Ritter, Ph.D.
(Bioinformatics; Protein Data Bank)
Protein Data Bank
Brookhaven National Laboratory
Biology Department, Building 463
P.O. Box 5000
Upton, NY 11973-5000
Voice: 516 344-6353
Fax: 516 344-5751
Peter Rose, Ph.D.
(Computer-Aided Drug-Design; Databases; Empirical Binding Free Energy
Calculations)
Agouron Pharmaceuticals, Inc.
3301 N. Torrey Pines Court
La Jolla, CA 92037
Voice: 619 622-3095
Fax: 619 678-8244
John R. Rumble, Jr., Ph.D.
(Databases; Data Evaluation; Standards)
Standard Reference Data Program
Mail Stop NN113
National Institute of Standards and Technology
Gaithersburg, MD 20899
Voice: 301 975-2200
Andrew Rusinko, Ph.D.
(Computer-Aided Drug-Design; Chemoinformatics; Data Mining)
Project Manager -- Chemoinformatics
Glaxo Wellcome Inc.
Five Moore Drive, P.O. Box 13398
Research Triangle Park, N.C. 27709
Voice: 919 483-8404
Fax: 919 315-0034
Carol Salata, Ph.D.
(Intellectual Property; Chemistry)
Industrial Partnerships Program
Mail Stop NN 213
National Institute of Standards and Technology
Gaithersburg, MD 20899
Voice: 301 975-5108
Fax: 301 869-2751
Frederick P. Schwartz, Ph.D.
(Molecular Recognition; Calorimetry)
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6217
Fax: 301 738-6255
Alexander Tropsha, Ph.D.
(Computer-Aided Drug-Design; QSAR; Combinatorial Libraries; Free Energy
Simulations)
CB # 7360, Beard Hall
School of Pharmacy
U. North Carolina
Chapel Hill, NC 27599-7360
Voice: 919 966-2955
Fax: 919 966-6919
John Westbrook, Ph.D.
(Bioinformatics; Nucleic Acid Database, Computational Chemistry)
Dept. of Chemistry
Rutgers U.
P.O. Box 939
Piscataway, NJ 08855-0930
Voice: 732 445-4290
Fax: 732 445-4320
Craig S. Wilcox, Ph.D.*
(Noncovalent Interactions; Bioorganic Chemistry; Host-Guest Chemistry)
Dept. of Chemistry
U. Pittsburgh
Pittsburgh, PA 15260
Voice: 412 624-8270
Fax: 412 624-8552
www.chem.pitt.edu/faculty/wilcox.html
Appendix B: Outline of Discussion Topics
(See PDF version.)
Appendix C: Detailed Conclusions
(See PDF version.)