Virtual screening is widely used to discover new chemical tools and leads for drug discovery. Unfortunately, the technique remains difficult to use and has thus been restricted to a few expert laboratories. Here, we create databases and tools to bring virtual screening to a wide biological audience, greatly expanding its impact and usefulness, and develop a chemoinformatics method to identify the "on" and "off" targets for drugs and reagents.
Two overarching goals in chemical biology are finding ligands for every protein and identifying the targets underlying phenotypically active compounds. For the last decade, these goals have been pursued empirically. We believe that there is a strong call for computational discovery in both enterprises. The long-term goal of this project is to bring chemistry to a large community of biologists, by enabling docking screens against all structurally addressable targets, and by developing tools that identify the targets mediating phenotypic biological activity. The first aim is met by developing compound libraries, benchmarking sets, and web-based tools that radically reduce barriers to entry. The second aim, target identification for ligands, is met by developing new chemoinformatic methods and testing them experimentally.
1. To elaborate ZINC with activity predictions using cheminformatics and docking, and link targets to disease. We will develop and deploy public access tools that enable biologists to interrogate chemistry for biology.
   1. Tools in the ZINC platform will link commercially available compounds to their known and likely targets and, correspondingly, link targets to their known or likely ligands.
   2. A new tool, DxTRx, will connect targets to the phenotypes and diseases that they modulate.
   3. We will use docking to precalculate high-scoring ligand lists for 10,000 relevant targets for which a structure exists. These hit-lists will be made available to the community, and will be substrates for our own target-target linkage studies.
   In short, we will develop an integrated tool set that allows an investigator to proceed from target → compound → phenotype → target in many areas of biology of active interest.
2. Predicting targets from ligands (SEA). We will further exploit SEA to interrogate pharmacology, and to improve the core method. We will:
   A. Use SEA to reorganize target-family trees, such as those for kinases, GPCRs, and ion channels, by ligand rather than sequence similarity. Early work portends a dramatic re-arborization, leading to testable hypotheses about new target associations.
   B. Investigate a protein structure context for the ligand similarities. SEA now compares ligands by topology, with a statistical engine for significance. For many targets, structures exist, and it may be possible to add a receptor context to these calculations. Bringing back the receptor may also address a weakness of SEA, its dependence on known ligands.
   C. Exploiting work in aim 1, compare the proteome-wide docking hit lists, seeking new target-target associations.
   D. In a new application, use SEA to predict the targets of compounds active in whole organism phenotypic screens, expanding on existing collaborations.
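The idea in aim 2 of comparing targets through their ligand sets, with a statistical test for significance, can be illustrated with a toy sketch. This is not the SEA implementation: the fingerprint representation (sets of "on" bit indices), the 0.5 threshold, the function names, and the resampling-based null model are all assumptions chosen for illustration, and the resampling step merely stands in for whatever statistical engine the method actually uses.

```python
import random
import statistics
from itertools import product


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def set_similarity(ligands_a, ligands_b, threshold=0.5):
    """Raw score: sum of pairwise Tanimoto coefficients at or above `threshold`.

    The threshold value is a hypothetical choice, not taken from the source.
    """
    total = 0.0
    for x, y in product(ligands_a, ligands_b):
        t = tanimoto(x, y)
        if t >= threshold:
            total += t
    return total


def z_score(ligands_a, ligands_b, background, threshold=0.5, trials=200, seed=0):
    """Compare the observed raw score with scores of random ligand sets.

    Random sets of the same sizes are drawn from `background` (a list of
    fingerprints). Returns 0.0 when the null scores have zero spread.
    """
    rng = random.Random(seed)
    observed = set_similarity(ligands_a, ligands_b, threshold)
    null = [
        set_similarity(
            rng.sample(background, len(ligands_a)),
            rng.sample(background, len(ligands_b)),
            threshold,
        )
        for _ in range(trials)
    ]
    sigma = statistics.pstdev(null)
    if sigma == 0:
        return 0.0
    return (observed - statistics.mean(null)) / sigma


# Toy fingerprints (made-up bit indices, purely illustrative).
target_a = [frozenset({1, 2, 3}), frozenset({1, 2, 4})]
target_b = [frozenset({2, 3, 4}), frozenset({7, 8})]
background = [frozenset(random.Random(i).sample(range(20), 4)) for i in range(50)]

raw = set_similarity(target_a, target_b)
z = z_score(target_a, target_b, background)
```

A high score relative to the random background would suggest that two targets share ligand chemistry even when their sequences are unrelated, which is the kind of association aim 2 proposes to test experimentally.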