Report on the

NIST Workshop on a Database for Noncovalent Binding

Held at

The National Institute of Standards and Technology

Gaithersburg, Maryland

August 21-22, 1997

 

Michael K. Gilson

9/28/97

Introduction

Covalent bonds, bonds in which atoms share one or more electrons, account for the formation of molecules from atoms. Molecules, in turn, interact with other molecules via noncovalent interactions, such as electrostatic attractions, dispersion forces, and solvent-mediated forces. These noncovalent interactions can cause molecules to bind to each other in solution with binding energies that can be quite small, or as large as 100 kJ/mol. These binding reactions can be specific; that is, two molecules that bind each other tightly often do not bind other molecules. Such molecules will "recognize" and bind each other, even when present in a mixture of many different chemicals. Specific noncovalent binding, or "molecular recognition", is fundamental to biology: examples include the recognition of substrates by enzymes, the action of biological transmitters at their receptors, the binding and inactivation of foreign proteins by antibodies, and the inhibition of enzymes by drugs. Specific noncovalent binding is also important in chemistry. For example, the noncovalent binding of an analyte to a chromatography column can facilitate its separation from other chemicals that do not bind the column. More generally, the entire field of supramolecular chemistry is founded on specific noncovalent binding reactions that allow host molecules to bind specific guest molecules, and that can lead to the self-assembly of organized aggregates of molecules with new properties.

It is not surprising, then, that the discovery or development of molecules that bind specific target molecules is central to a number of commercially important activities. Examples include drug design and the design of molecules for chemical separations. Researchers in these areas benefit from the large number of measurements of binding affinities that have been published in the scientific literature. However, these data are scattered among many volumes of many journals, and finding data of value to a particular project can be difficult and time-consuming. Collecting existing data and linking them with other data on the same molecules to create an organized database would make it easier for researchers to find information on specific molecules, or to identify molecules that meet certain criteria. Researchers who seek molecules that bind specific targets also use computational prediction methods. However, the physical chemistry of noncovalent binding is not completely understood, and there is a need for improvements to current computational methods. A database of binding measurements would facilitate the elucidation of the physical basis for noncovalent binding, and thus the development of predictive models that would be useful to industry. To date, however, there exists no comprehensive database that provides easy access to binding affinities and associated data. NIST recently held a workshop that convened about 25 experts from industry and academia to discuss the establishment of such a database. The participants are listed in Appendix A. This document describes the agenda and the conclusions of that workshop.

Purpose of the Workshop

Although it seemed reasonable that a public database for noncovalent association would be a useful resource, establishing and maintaining such a resource is a nontrivial task. There is a risk of creating a database that would not meet the needs of users, or that could not be maintained for the long term. To maximize the likelihood of establishing a vigorous, useful project in this area, a workshop was held to foster discussions among scientists who generate data on noncovalent association, users of the data, experts in the relevant database and other technologies, and people interested in developing the database. The workshop sought to answer the following general questions:

The issues to be discussed were separated into four categories: Specifications and Content, Technical Implementation, Management and Maintenance, and Pros and Cons. The chief questions for each category are listed below; Appendix B provides a more detailed outline of these topics.

Conclusions of the Workshop

The discussions at the workshop led to general agreement on a number of conclusions, including the following:

1) The database project is worthwhile and technically feasible, and a number of the workshop participants are interested in contributing to the project.

2) The scope of the data should be restricted (e.g., exclude crystallographic coordinates and bioavailability data) so that progress can be made quickly. However, the database should be designed so that new types of data can be added in the future, and so that links can be made to other databases containing relevant data.

3) The management of the database should not assume full responsibility for the quality of the data. However, several measures can be taken to promote the deposition of high quality data and to allow for publication of evaluations of the data.

4) A variety of search and query mechanisms should be supported, and it should be possible for users to download the database and carry out custom searches on their own computers.

5) A detailed set of specifications should be developed in advance of implementation. Users must be involved in design of the user-interface.

6) Progress will be fastest if the database is built with commercial software.

7) It is hoped that experimentalists will be willing to deposit new data directly into the database. Once the database is established, it may be appropriate to carry out literature reviews in order to gather data from older publications.

Appendix C provides a more comprehensive list of conclusions, and a few outstanding questions. This list of conclusions has been reviewed and approved by the participants of the workshop.

Plans for Development of the Database

The most important technical step now is to determine precisely what information an entry in the database should contain. All entries will identify the molecules involved, their binding affinity, and the experimental conditions. However, further details of the entry will depend upon the type of measurement that was used. For example, the data associated with a calorimetry measurement will differ from those associated with a spectroscopic measurement. Therefore, it is planned to hold a series of focused workshops that convene experts in the important measurement techniques. These workshops will seek to generate detailed specifications of the database entries for each type of measurement.
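The entry structure described above, a common core plus details that depend on the measurement type, can be illustrated with a minimal sketch. It is not a proposed specification: the class names, the field names, the choice of a dissociation constant to represent affinity, and the example values are all assumptions made for illustration, and the actual entry contents are to be defined by the focused workshops.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Conditions:
    """Experimental conditions; the specific fields are illustrative guesses."""
    temperature_kelvin: float
    ph: Optional[float] = None
    solvent: str = "water"


@dataclass
class BindingEntry:
    """Common core named in the report: molecules, affinity, conditions."""
    molecules: Tuple[str, str]           # names of the two binding partners
    dissociation_constant_molar: float   # assumed representation of affinity
    conditions: Conditions


@dataclass
class CalorimetryEntry(BindingEntry):
    """Adds a field that only a calorimetry measurement would report."""
    enthalpy_kj_per_mol: Optional[float] = None


@dataclass
class SpectroscopyEntry(BindingEntry):
    """Adds a field that only a spectroscopic measurement would report."""
    wavelength_nm: Optional[float] = None


def common_fields(entry: BindingEntry) -> dict:
    """Return the fields shared by every entry, whatever the measurement type."""
    return {
        "molecules": entry.molecules,
        "dissociation_constant_molar": entry.dissociation_constant_molar,
        "conditions": entry.conditions,
    }


if __name__ == "__main__":
    # Placeholder values only; these are not measured data.
    example = CalorimetryEntry(
        molecules=("host-X", "guest-Y"),
        dissociation_constant_molar=1e-6,
        conditions=Conditions(temperature_kelvin=298.15, ph=7.0),
        enthalpy_kj_per_mol=-20.0,
    )
    print(common_fields(example))
```

Keeping the shared fields in one base class means that a search over molecules, affinity, or conditions can treat every entry alike, while each measurement type carries its own extra fields.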

Once the data specifications exist, implementation can begin. This will require the efforts of a database programmer who will collaborate with the domain experts. A grant proposal will be prepared for the Biological Databases program of the National Science Foundation. It is envisioned that this will yield the funds needed to hire the programmer.


Appendix A: Workshop Participants

(* Member of workshop steering committee)

Helen M. Berman, Ph.D.
(Bioinformatics; Structural Biology; Nucleic Acid Database)
Dept. of Chemistry
Rutgers U.
P.O. Box 939
Piscataway, NJ 08855-0939
Voice: 908 445-4667
Fax: 908 445-5958
 
Philip E. Bourne, Ph.D.*
(Bioinformatics; Structural Biology)
San Diego Supercomputer Center
PO Box 85608
San Diego CA 92186-9784
Voice: 619 534-8301
Fax: 619 534-5113
 
Patrick Brady, Ph.D.
(Computer-Aided Drug-Design; Computational Chemistry)
The DuPont Merck Pharmaceutical Co.
Experimental Station E500-3602A
Wilmington DE 19880-0500
Voice: 302 695-4003
Fax: 302 695-9090
 
Kenneth J. Breslauer, Ph.D.
(Thermodynamics of Molecular Recognition; DNA-Ligand Interactions)
Dept. of Chemistry
Rutgers U.
P.O. Box 939
Piscataway, NJ 08855-0939
Voice: 908 445-3956
Fax: 908 445-3409
 
Laurent David, Ph.D.
(Molecular Modeling; Noncovalent Binding)
Center for Advanced Research in Biotechnology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6215
Fax: 301 738-6255
 
Malcolm E. Davis, Ph.D.
(Computer-Aided Drug-Design; Biophysical Chemistry; Structural Biology)
Dept. of Macromolecular Structure
H23-07
Bristol-Myers Squibb
Pharmaceutical Research Institute
P.O. Box 4000
Princeton, NJ 08543-4000
Voice: 609 252-4324
Fax: 609 252-6030
 
Robert S. DeWitte, Ph.D.
(Computer-Aided Drug-Design; Cheminformatics)
Dept. of Chemistry and Chemical Biology
Harvard U.
12 Oxford St.
Cambridge MA 02138
Voice: 617 496-4368
Fax: 617 496-5948
 
Gary L. Gilliland, Ph.D.
(Biotechnology; Structural Biology; Standards; Databases)
Biotechnology Division
National Institute of Standards and Technology
Chemistry Building, Rm. A345
Gaithersburg, MD 20899-0001
Voice: 301 975-2629
Fax: 301 330-3447
www.carb.nist.gov/carb/gilliland.html
 
Michael K. Gilson, Ph.D., M.D.*
(Molecular Modeling; Noncovalent Binding; Bioinformatics)
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6217
Fax: 301 738-6255
www.carb.nist.gov/carb/gilson.html
 
Martha Head, Ph.D.
(Computational Chemistry; Noncovalent Association; Software Design)
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6104
Fax: 301 738-6255
 
C. Nicholas Hodge, Ph.D.*
(Computer-Aided Drug-Design; Organic Chemistry)
The DuPont Merck Pharmaceutical Co.
P.O. Box 80500
Wilmington DE 19880-0500
Voice: 302 695-3698
Fax: 302 695-9090
 
Rui Luo
(Molecular Modeling; Noncovalent Binding)
Center for Advanced Research in Biotechnology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6108
Fax: 301 738-6255
 
Brock A. Luty, Ph.D.
(Computer-Aided Molecular Design; Computational Chemistry)
Agouron Pharmaceuticals, Inc.
3301 N. Torrey Pines Court
La Jolla, CA 92037
Voice: 619 535-0853
Fax: 619 678-8244
 
Irwin D. Kuntz, Ph.D.
(Computer-Aided Drug-Design; DOCK)
Dept. Pharmaceutical Chemistry
U. California San Francisco
513 Parnassus Ave. Box 0446
San Francisco, CA 94143-0446
Voice: 415 476-1937
Fax: 415 476-0688 (preferred over email)
 
Otto Ritter, Ph.D.
(Bioinformatics; Protein Data Bank)
Protein Data Bank
Brookhaven National Laboratory
Biology Department, Building 463
P.O. Box 5000
Upton, NY 11973-5000
Voice: 516 344-6353
Fax: 516 344-5751
 
Peter Rose, Ph.D.
(Computer-Aided Drug-Design; Databases; Empirical Binding Free Energy Calculations)
Agouron Pharmaceuticals, Inc.
3301 N. Torrey Pines Court
La Jolla, CA 92037
Voice: 619 622-3095
Fax: 619 678-8244
 
John R. Rumble, Jr., Ph.D.
(Databases; Data Evaluation; Standards)
Standard Reference Data Program
Mail Stop NN113
National Institute of Standards and Technology
Gaithersburg, MD 20899
Voice: 301 975-2200
 
Andrew Rusinko, Ph.D.
(Computer-Aided Drug-Design; Chemoinformatics; Data Mining)
Project Manager -- Chemoinformatics
Glaxo Wellcome Inc.
Five Moore Drive, P.O. Box 13398
Research Triangle Park, N.C. 27709
Voice: 919 483-8404
Fax: 919 315-0034
 
Carol Salata, Ph.D.
(Intellectual Property; Chemistry)
Industrial Partnerships Program
Mail Stop NN 213
National Institute of Standards and Technology
Gaithersburg, MD 20899
Voice: 301 975-5108
Fax: 301 869-2751
 
Frederick P. Schwartz, Ph.D.
(Molecular Recognition; Calorimetry)
Center for Advanced Research in Biotechnology
National Institute of Standards and Technology
9600 Gudelsky Drive
Rockville, MD 20850-3479
Voice: 301 738-6217
Fax: 301 738-6255
 
Alexander Tropsha, Ph.D.
(Computer-Aided Drug-Design; QSAR; Combinatorial Libraries; Free Energy Simulations)
CB # 7360, Beard Hall
School of Pharmacy
U. North Carolina
Chapel Hill, NC 27599-7360
Voice: 919 966-2955
Fax: 919 966-6919
 
John Westbrook, Ph.D.
(Bioinformatics; Nucleic Acid Database; Computational Chemistry)
Dept. of Chemistry
Rutgers U.
P.O. Box 939
Piscataway, NJ 08855-0939
Voice: 732 445-4290
Fax: 732 445-4320
 
Craig S. Wilcox, Ph.D.*
(Noncovalent Interactions; Bioorganic Chemistry; Host-Guest Chemistry)
Dept. of Chemistry
U. Pittsburgh
Pittsburgh, PA 15260
Voice: 412 624-8270
Fax: 412 624-8552
www.chem.pitt.edu/faculty/wilcox.html


Appendix B: Outline of Discussion Topics

(See PDF version.)


Appendix C: Detailed Conclusions

(See PDF version.)