Lead optimization via high-throughput molecular docking

Diane Joseph-McCarthy*, J Christian Baber, Eric Feyfant, David C Thompson & Christine Humblet

Addresses

Wyeth Research

Chemical and Screening Sciences

200 CambridgePark Drive

Cambridge

MA 02140

USA

Email: DJoseph@wyeth.com

Wyeth Research

CN 8000

Princeton

NJ 08543

USA

*To whom correspondence should be addressed

Current Opinion in Drug Discovery & Development 2007 10(3):264-274

Structure-based lead optimization approaches are increasingly

playing a role in the drug-discovery process. Recent advances

in 'high-throughput' molecular docking methods and examples

of their successful use in lead optimization are reviewed.

Measures of docking accuracy, scoring function comparisons,

and consensus approaches are discussed. Differences in docking

protocols typically used for lead optimization versus lead

generation are highlighted; this section includes a discussion of

the latest methods for the incorporation of protein flexibility.

New approaches developed specifically for the design of

combinatorial libraries as well as those designed or used for

'fragment' versus lead optimization are presented. Finally,

potential future improvements to the technology are outlined.

Keywords Combinatorial library design, computer-aided

drug design, fragment docking, scoring functions, structure-

based design, virtual screening

Abbreviations

ADME absorption, distribution, metabolism and excretion;

ANOVA analysis of variance; ASP Astex statistical potential;

FEP free energy perturbation; GB generalized Born; HTS

high-throughput screening; IFD induced fit docking; LIE

linear interaction energy; MM molecular mechanics; MW

molecular weight; NMR nuclear magnetic resonance; PB

Poisson-Boltzmann; PDE phosphodiesterase; PLP piecewise

linear potential; PMF potentials of mean force; PTP1B

phosphotyrosine phosphatase 1B; RMSD root-mean-squared

deviation; ROC receiver operating characteristic; SA surface

area; SAR structure-activity relationships; vdw van der Waals

Introduction

Structure-based lead optimization

Structure-based lead optimization is a powerful approach

employed by all large pharmaceutical and biotechnology

companies whenever structural information is available for

a project. The effectiveness of the approach and the exact

methods applied are determined by a number of factors: the

source of the initial lead, the quality of the lead, and the

accuracy and extent of the structural information available.

Computational chemistry techniques are utilized to

transform existing structural information into data that are

accessible and can be used to design improved compounds

or other leads by exploratory and medicinal chemistry

(Figure 1).

An initial lead can be generated via a variety of approaches:

by high-throughput screening (HTS), by more focused

screening of known inhibitors for a given target family, by

virtual screening of large databases of lead-like small

molecules, by modification of compounds that have been

previously published in the literature, by experimental

fragment screening, or by virtual fragment screening. The

quality of a lead in terms of its physical properties, potency

and target selectivity is often related to its source [1]. For

example, a validated HTS hit from a corporate collection

may be deemed too similar to related internal program lead

compounds, and any further optimization may require

scaffold modification to improve target selectivity against

other known therapeutic targets. Such modifications can

result in a reduction in potency. Virtual screening, in

particular, often generates weak hits that require further

optimization to improve potency. If vendor databases such

as ChemNavigator (www.chemnavigator.com) are used for

screening, novel functionality often has to be engineered

into the lead compound because the screened compounds

are in the public domain and available to many different

companies working on similar projects. Fragment screening

is often achieved via nuclear magnetic resonance (NMR) or

other biophysical methods [2,3]. Such screening typically

yields hits with weak activity in the high micromolar to

millimolar range. Although weak, fragment hits have high

'ligand efficiency' and, therefore, can be optimized by

'growing' the compound to bind into neighboring pockets of

the target protein or by connecting it to other fragment hits –

an approach that is discussed in a later section [4•].

Furthermore, fragment screening hits can be used to

optimize leads found via more traditional methods, for

example, by connecting the fragments onto existing

scaffolds.
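
The notion of ligand efficiency used above can be made concrete with a small calculation. The sketch below assumes the commonly used definition (binding free energy per heavy atom, with the free energy estimated as RT ln Kd). The function name, affinities and heavy-atom counts are hypothetical illustrative values, not data from this review.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_molar: float, heavy_atoms: int, temp_k: float = 298.15) -> float:
    """Return -dG / N_heavy (kcal/mol per heavy atom), with dG = RT ln(Kd)."""
    delta_g = R_KCAL * temp_k * math.log(kd_molar)
    return -delta_g / heavy_atoms


# Hypothetical compounds: a weak fragment hit and a more potent, larger HTS hit.
fragment_le = ligand_efficiency(kd_molar=500e-6, heavy_atoms=12)
hts_hit_le = ligand_efficiency(kd_molar=1e-6, heavy_atoms=35)
print(f"fragment LE = {fragment_le:.2f}, HTS hit LE = {hts_hit_le:.2f}")
```

In this made-up comparison the fragment binds 500-fold more weakly yet contributes more binding energy per heavy atom, which is the sense in which a weak fragment hit can still be an efficient starting point for growing or connecting.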

The accuracy of structure-based computational methods is

affected by the quality and quantity of the available target

and ligand structural data. Structure-based virtual

screening, or molecular docking, is highly dependent on the

biological relevance of the existing 3-D structures of the

target. Solving the X-ray crystal structure of the target

co-crystallized with a number of different ligands is often

critical to the success of a modeling and structure-based

design effort. Consideration of target flexibility or subtle

changes in protein conformation upon ligand binding can be

crucial to the prediction of the docking mode of a ligand and

is reviewed below.

[Figure 1. The structure-based lead optimization process. Box labels: Fragment screening; Grow; Connect; Lead identification; Virtual screening; High-throughput screening of corporate collections; Synthesis of new compounds and combinatorial libraries; Biochemical assays to confirm hits; Computational design of improved lead compounds; Structure of lead compound bound to target; Structure of the macromolecular target; Biophysical techniques: protein biochemistry, X-ray crystallography, nuclear magnetic resonance, mass spectrometry. Steps that may involve the use of high-throughput docking techniques are highlighted with shadowed boxes.]

In addition, determining new protein-

ligand complex structures periodically throughout the lead

optimization process allows the protein-binding site model

used for docking to be refined (if necessary) and provides a

check on the initial docking mode predictions. If the modeler

is confident that the original models can explain the majority

of the structure-activity relationships (SAR) observed via

biological testing, then solving these additional complex

structures may not be necessary. Furthermore, the

determination of protein-ligand structures can sometimes

prove difficult, even if the apo-structure is crystallized

readily.

A target structure that has been determined by 3-D NMR

techniques can be utilized for lead optimization; however,

this is less common than using an X-ray crystal structure,

because solving an NMR-determined protein structure can

be more time consuming and requires larger samples of

protein. NMR techniques are used for target structure determination

if the target (i) is sufficiently small (< 30 kDa), (ii) has

eluded structural determination via X-ray crystallographic

techniques, and (iii) is considered an important target.

However, it is more common that a homology model will be

generated based on a high resolution X-ray crystal structure

of a related protein, if one exists. For such homology models,

generally 20 to 30% sequence identity is required between

the desired target and the template protein sequence,

although there are no hard rules. For example, given low

overall sequence identity, higher identity in the active site

can be critical as can be the availability of other types of low

resolution structural data that provide constraints for the

target structure [5]. Gilson and co-workers carried out a

systematic study comparing the docking of a database of

'drug-like' molecules to an X-ray crystal structure and a set

of homology models for five drug targets [6•]. The study

demonstrated that docking to target homology models can

result in significant enrichment of known actives in a ranked

hit list; however, the researchers often found similar

enrichments when docking directly to the templates for the

homology models themselves.

Annotated databases of 3-D structures of druggable binding

sites suitable for docking studies (eg, http://bioinfo-pharma.

u-strasbg.fr/scPDB [7•]) and comparative models for protein

sequences that are homologous to at least one existing 3-D

protein structure (eg, http://salilab.org/modbase [8]) are

available on the Internet. Therefore, given the fact that the

number of 3-D macromolecular structures determined and

protein models generated continues to increase every year

[9], successful docking to surrogate protein structures and

homology models is also expected to increase in the future.

The success of a structure-based lead optimization effort also

depends, to a large extent, on close collaboration between

the modeler(s) and medicinal chemists developing the

ligand. Synthetic accessibility clearly needs to be considered

when designing modified leads, which is best accomplished

by the modeler and medicinal chemists together viewing

models of proposed compounds in 3-D. Tools such as

Benchware3D Explorer allow the modeler to prepare

annotated, labeled views of modeling results that can be

shared with medicinal chemists via email, and allow the

chemists to re-evaluate the models and structures at their

desktops during the design of new synthetic targets. Existing

X-ray crystal structures of protein-ligand complexes are

viewed superimposed with docked poses for proposed

ligands.

The structure-based lead optimization process can involve

simple energy minimization of a few molecules in the

protein-binding site or the docking of relatively large

combinatorial libraries of hundreds or even thousands of

molecules. The same docking methods that are used for the

virtual screening of large combinatorial libraries of millions

of compounds for lead generation are also often applied in

lead optimization. For lead optimization, however, false

negatives are generally considered to be a greater problem

because fewer molecules are considered; somewhat

lower-throughput methods (which are expected to be more

accurate) are therefore also employed. Because of the current

pressures to develop drugs more efficiently, lead

optimization often takes place under tight time constraints;

the rapid optimization of two series with a pre-development

decision in two years is typical for most projects and

therefore computational approaches that can aid in the

process are highly valued.

Molecular docking tools

Structure-based virtual screening involves docking a library

of proposed or existing small molecules into a target-

binding site to identify which molecules have a

complementary fit to the target binding site and are

therefore likely to bind to the target. When optimizing lead

compounds, the resulting binding modes and scores for

proposed compounds determined by docking experiments

are compared with those for the current best leads for the

target. Often the modeler will look through several highly

ranked poses (eg, 3 to 10) for each compound in the target

binding site and select the best scoring pose that maintains

the known position of the scaffold (or core of the molecule)

in the site. For kinase targets, for example, Perola reported

that requiring docked ligand poses to form two hydrogen

bonds to the hinge region, and requiring those hydrogen bonds to

fall into one of three preferred motifs, dramatically reduced

the number of false positives in hit lists [10]. As such, both

sampling and scoring are important components of any

molecular docking approach (Figure 2). The objective

evaluation and comparison of different docking methods is,

however, difficult because each researcher may prepare their

binding sites and ligands differently, define the box

encompassing the binding site differently, or select 'expert

user', non-default parameters for a docking protocol.

Furthermore, different docking methods fail for different

types of systems and may enrich for different types of hits.
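
The pose-selection step described above (inspect a handful of top-ranked poses and keep the best-scoring one that preserves the known scaffold position) can be sketched as follows. This is a minimal illustration, not the procedure of any particular docking package: the `Pose` class, the 2.0 Å cutoff and the convention that lower scores are better are all assumptions, and the poses are assumed to share the receptor coordinate frame and scaffold atom ordering of the reference.

```python
import math
from dataclasses import dataclass

Coord = tuple[float, float, float]


@dataclass
class Pose:
    score: float              # docking score; lower is better (assumed convention)
    core_coords: list[Coord]  # scaffold atoms, ordered as in the reference


def core_rmsd(a: list[Coord], b: list[Coord]) -> float:
    """RMSD over matched scaffold atoms, without any superposition."""
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))


def select_pose(poses, reference_core, top_n=10, max_core_rmsd=2.0):
    """Best-scoring pose among the top_n whose scaffold stays near the reference."""
    for pose in sorted(poses, key=lambda p: p.score)[:top_n]:
        if core_rmsd(pose.core_coords, reference_core) <= max_core_rmsd:
            return pose
    return None  # no acceptable pose; flag the compound for manual inspection


# Hypothetical two-atom scaffold and three candidate poses.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
poses = [
    Pose(-9.0, [(5.0, 0.0, 0.0), (6.5, 0.0, 0.0)]),  # best score, scaffold displaced
    Pose(-8.2, [(0.2, 0.0, 0.0), (1.6, 0.1, 0.0)]),  # scaffold retained
    Pose(-7.0, [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]),
]
chosen = select_pose(poses, reference)
print(chosen.score)
```

The same pattern extends to interaction-based filters, such as the hinge hydrogen-bond requirement reported for kinases, by swapping the RMSD test for a predicate on the pose.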

Despite the limitations to the docking experiments discussed

above, many research groups have tried to evaluate

commonly used docking methods. The overall conclusion of

a recent comparison of the DOCK, DOCKVISION, Glide and

GOLD methods for five protein targets was that all four

methods enriched hit rates compared with random

screening, but that the prioritization of known ligands is

both method- and target-dependent [11]. As a general-

purpose docking tool, Glide performed the best across the

set of five targets; however, DOCK, DOCKVISION and

GOLD each outperformed Glide on certain targets.

[Figure 2. Molecular docking schematic demonstrating possible hierarchical levels for scoring virtual ligands. Panel labels: Virtual library docked; Molecules to synthesize and test. (A) Virtual library is docked into the target. (B) The virtual ligands are scored hierarchically according to, for example, (i) shape only, (ii) electrostatics and (iii) solvation. The orientation and conformation of each ligand is sampled in the binding site and scored to produce ...]

More recently, Chen et al compared the FlexX, GOLD, Glide and

ICM docking methods for their ability to reproduce X-ray

protein-ligand complex structures and to enrich known

actives in hit lists [12•]. This study went on to compare the

docking results to enrichments from some ligand-based

virtual screening experiments. For enriching known actives

in a hit list over a set of 12 diverse targets, the average

enrichment factors for ICM, Glide and ROCS (using a bound

conformation for the ligand from an X-ray protein-ligand

complex structure as the query) were greater than

4.6, while FlexX and GOLD gave enrichment factors of less

than the 3.5 obtained through ISIS 2-D similarity searching.

Warren et al tested 10 docking programs and 37 scoring

functions on eight protein targets, and also concluded that

no single program performed well for all targets [13]. These

results underline the need to have multiple docking tools

available, and, if possible, to test the ability of each method

to reproduce any existing protein-ligand X-ray crystal

structures prior to beginning a lead optimization or lead

generation effort.

In addition to comparing existing methods, there continue to

be reports of new or significantly improved docking

methods. For example, Thomsen et al have developed a new

docking approach that uses a guided differential evolution

algorithm with a modified piecewise linear potential (PLP),

and a re-ranking procedure that re-scores poses from

independent docking runs with a torsional and van der

Waals term added to the PLP [14]. Furthermore, Park et al

described a significantly improved automated version of

AutoDock [15].
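
For readers unfamiliar with the term, a piecewise linear potential scores each protein-ligand atom pair with a function built from straight-line segments: a steep repulsive ramp at short range, a flat attractive well, and a linear decay to zero. The sketch below shows the generic shape only. The breakpoints and energies are placeholder values chosen for illustration; they are not the parameters of the modified PLP in [14], and the function names are hypothetical.

```python
def piecewise_linear_potential(r, a=3.3, b=3.6, c=4.5, d=6.0, well=-0.4, wall=20.0):
    """Generic PLP pair term for an interatomic distance r (in Angstrom).

    a < b <= c < d are breakpoints; `well` is the (negative) plateau energy and
    `wall` the repulsion at r = 0. All defaults are illustrative placeholders.
    """
    if r < 0:
        raise ValueError("distance must be non-negative")
    if r < a:   # repulsive ramp: `wall` at r = 0, falling to 0 at r = a
        return wall * (a - r) / a
    if r < b:   # descent from 0 into the well
        return well * (r - a) / (b - a)
    if r <= c:  # flat bottom of the well
        return well
    if r < d:   # linear return to zero
        return well * (d - r) / (d - c)
    return 0.0  # beyond the cutoff the pair does not interact


def pose_score(pair_distances):
    """Sum the pair term over all protein-ligand atom-pair distances."""
    return sum(piecewise_linear_potential(r) for r in pair_distances)


print(pose_score([3.0, 4.0, 5.25, 8.0]))
```

Because each segment is linear, the function is cheap to evaluate and easy to tabulate on a grid, which is one reason potentials of this shape are attractive for high-throughput docking.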

Measures of accuracy

To an extent, the measures used to determine the accuracy

of docking methods depend on the overall goal of the

docking exercise. If high-throughput docking is used to

screen libraries and identify compounds for experimental

testing, then the ability to separate actives from inactives or

to rank more actives higher relative to inactives is important.

In such cases, it is appropriate to use measures of

enrichment or ROC (receiver operating characteristic) curves

to evaluate different docking methods [16•]. If the results of

docking experiments are being used for lead optimization,

the determination of the true binding mode is important

[17]. The ability of a docking method and scoring scheme to

predict binding modes is usually evaluated by calculating

the symmetry-corrected RMSD between predicted and

actual binding modes for compounds with known crystal

structures. As such, the regeneration of a large set of

protein-ligand X-ray crystal structures has been used to

compare the performance of a number of scoring functions

[18]. Other methods of assessing the accuracy of docking

poses have been suggested, including a method developed

by Kroemer et al that concentrates on the interactions

between the ligand and the protein rather than just

deviations in the atomic positions [19].
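
The symmetry correction mentioned above matters whenever a ligand contains chemically equivalent atoms (for example, the two oxygens of a carboxylate): a naive atom-by-atom RMSD penalizes a pose that is physically identical to the crystal structure. A minimal sketch follows, assuming the predicted and reference poses are in the same receptor frame and that the list of equivalent atom mappings is supplied by the user (in practice it would be derived from the molecular graph). The function names are hypothetical.

```python
import math


def rmsd(pred, ref, mapping):
    """RMSD comparing reference atom i with predicted atom mapping[i]."""
    sq = 0.0
    for i, j in enumerate(mapping):
        sq += sum((p - q) ** 2 for p, q in zip(pred[j], ref[i]))
    return math.sqrt(sq / len(ref))


def symmetry_corrected_rmsd(pred, ref, equivalent_mappings):
    """Smallest RMSD over all symmetry-equivalent atom assignments."""
    return min(rmsd(pred, ref, m) for m in equivalent_mappings)


# Toy carboxylate: atom 0 is carbon, atoms 1 and 2 are the equivalent oxygens.
ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]  # oxygens swapped
mappings = [(0, 1, 2), (0, 2, 1)]

naive = rmsd(pred, ref, mappings[0])
corrected = symmetry_corrected_rmsd(pred, ref, mappings)
print(naive, corrected)
```

Here the naive value is well above 1 Å even though the two poses coincide, whereas the corrected value is zero.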

As discussed, Chen et al examined the ability of docking

methods to enrich known actives in hit lists and to

accurately predict binding modes. They found that for both

tasks ICM and Glide performed particularly well, at least for

the test sets considered [12•]. Previous studies by, for

example, Verdonk et al also highlighted the fact that to

achieve good enrichment in virtual screening experiments, it

is necessary to produce reliable binding modes [20]. Clearly,

when measuring the enrichment of actives in a screening

exercise the selection of test sets (containing known actives

and decoy compounds) is vitally important and has been the

focus of research by a number of groups [20,21••]. When

assessing the performance of docking methods for

compound optimization, the use of 'property-matched'

decoy sets is key because the goal is often to differentiate

between similar compounds, and to pick those most likely to

be active.
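
The two enrichment measures referred to in this section can be computed directly from a ranked hit list. The sketch below uses the usual definitions (enrichment factor as the hit rate in the top fraction divided by the hit rate in the whole list, and ROC area as the probability that an active outranks a decoy). The ranked list is invented for illustration, and ties in score are assumed to have been broken already.

```python
def enrichment_factor(ranked_labels, fraction):
    """ranked_labels: 1 for an active, 0 for a decoy, best-scored first."""
    n_total = len(ranked_labels)
    n_top = max(1, round(n_total * fraction))
    hit_rate_top = sum(ranked_labels[:n_top]) / n_top
    hit_rate_all = sum(ranked_labels) / n_total
    return hit_rate_top / hit_rate_all


def roc_auc(ranked_labels):
    """Area under the ROC curve for a fully ordered list (no tied scores)."""
    n_actives = sum(ranked_labels)
    n_decoys = len(ranked_labels) - n_actives
    decoys_seen = 0
    misordered_pairs = 0  # (active, decoy) pairs where the decoy ranks higher
    for label in ranked_labels:
        if label:
            misordered_pairs += decoys_seen
        else:
            decoys_seen += 1
    return 1.0 - misordered_pairs / (n_actives * n_decoys)


# Hypothetical screen: 3 actives among 10 compounds, listed best score first.
ranked = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef_20 = enrichment_factor(ranked, 0.20)
auc = roc_auc(ranked)
print(f"EF(20%) = {ef_20:.2f}, ROC AUC = {auc:.3f}")
```

Both numbers depend on the decoy set: decoys that differ from the actives in size or charge are easy to reject and inflate the apparent enrichment, which is why property-matched decoys are emphasized above.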

Scoring methods

The scoring functions used in molecular docking methods

can be divided into three categories: (i) energy-based,

(ii) empirical, and (iii) knowledge-based. While virtual

screening of molecular databases is able to enrich hit

molecules in ranked lists of a relatively small size, in most

cases there is little correlation between the experimentally

determined binding affinity of the hits and the computed

score. Analysis of variance (ANOVA) techniques have

recently been used to quantify the discriminatory power of

scoring functions with respect to ligands and decoys, and

may facilitate the development of more accurate scoring

schemes [22]. In this section we discuss recent developments

in the field of scoring function design and use.

Energy-based scoring functions

Energy-based scoring functions approximate atomic

interactions between protein and ligand by including terms

known to be important in molecular recognition. In general,

they consist of the bonded and non-bonded terms common

to established molecular mechanics (MM) force fields

(eg, AMBER95 [23] and CHARMm22 [24]). The parameters

in these functions are derived from physical measurements,

and do not necessarily include binding affinity

measurements; therefore, the scoring functions are highly

transferable from small test systems to larger biological

systems. However, MM-based scoring functions do not

include solvation or entropic considerations and, as such,

calculate binding enthalpies as opposed to free energies.

Solvation and some entropic factors can be accounted for

implicitly by adding either a continuum solvent Poisson-

Boltzmann (PB) or a generalized Born (GB) term for polar

solvation, and a solvent-accessible surface area (SA) term for

nonpolar solvation. The resulting methodology is referred to

as MM-PBSA or MM-GBSA and approximates the free

energy of ligand binding. The MM-GBSA approach has been

shown to improve early enrichment rates in the virtual

screening of large compound databases [25] and to correctly

rank a series of congeneric kinase inhibitors [26••], while the

MM-PBSA approach was used to examine in detail the

binding affinities of a set of biotin analogs for avidin [27•].
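
The decomposition described in this paragraph is commonly written as below. This is the generic form, given as a sketch: the symbols are introduced here for illustration, the polar term is evaluated with either a PB or a GB model, and individual implementations differ in the details (for example, in whether an explicit conformational entropy estimate is added).

```latex
\begin{aligned}
\Delta G_{\mathrm{bind}} &\approx G_{\mathrm{complex}} - G_{\mathrm{protein}} - G_{\mathrm{ligand}},\\
G_{X} &= E_{\mathrm{MM}}(X) + G_{\mathrm{polar}}^{\mathrm{PB/GB}}(X) + G_{\mathrm{nonpolar}}(X),\\
G_{\mathrm{nonpolar}}(X) &= \gamma\,\mathrm{SA}(X) + b .
\end{aligned}
```

Here X stands for the complex, the protein or the ligand, SA(X) is the solvent-accessible surface area, and γ and b are empirical constants of the nonpolar term.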

The linear interaction energy (LIE) method represents

a compromise between the more rigorous and

computationally demanding free energy perturbation (FEP)

calculations and simple scoring functions. In contrast to FEP,

LIE uses only initial and final states of the binding event of a

single ligand to calculate binding free energies [28]. In a

recent study by Stjernschantz and co-workers, an automated

procedure for the straightforward use of the LIE method for

lead optimization was presented, and for three of the four

systems studied, the LIE method outperformed ten

different scoring functions [29].
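
In its usual form, the LIE estimate is a weighted sum of the changes in average ligand-surroundings van der Waals and electrostatic energies between the free (solvated) and bound simulations. The sketch below illustrates that arithmetic only. The coefficients `alpha`, `beta` and `gamma` are placeholder values (in practice they are fitted or chosen per system), the energy samples are invented, and the function is not the automated procedure of [29].

```python
from statistics import mean


def lie_binding_free_energy(vdw_bound, vdw_free, elec_bound, elec_free,
                            alpha=0.18, beta=0.5, gamma=0.0):
    """dG = alpha * d<V_vdw> + beta * d<V_elec> + gamma (energies in kcal/mol).

    Each argument is a sequence of ligand-surroundings interaction energies
    sampled from a simulation of the bound or the free ligand.
    """
    d_vdw = mean(vdw_bound) - mean(vdw_free)
    d_elec = mean(elec_bound) - mean(elec_free)
    return alpha * d_vdw + beta * d_elec + gamma


# Hypothetical energy samples from the two end-state simulations.
dg = lie_binding_free_energy(
    vdw_bound=[-40.0, -42.0], vdw_free=[-20.0, -22.0],
    elec_bound=[-30.0, -32.0], elec_free=[-25.0, -27.0],
)
print(f"estimated binding free energy = {dg:.1f} kcal/mol")
```

Only the two end states are simulated, which is what makes the approach cheaper than FEP, where a series of intermediate states must be sampled.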

Empirical scoring functions

Empirical scoring functions decompose the protein-ligand

binding affinity into a series of terms believed to be

important for binding free energy, with each term assigned a

weighting coefficient determined by a mathematical fit to

experimental binding data. Because of the reliance on a finite

sample of experimental data, empirical functions may

exhibit non-transferability issues, and it can be difficult to

know exactly what each term accounts for and to assess

where errors arise. As an example, the Glide XP 4.0 scoring

function includes both a water desolvation energy term

(a crude explicit water model) as well as terms that account

for specific protein-ligand structural motifs that provide

'exceptionally large contributions' to enhance binding

affinity (eg, hydrophobic enclosure, where a lipophilic

portion of a ligand is enclosed on opposite sides by

lipophilic protein atoms). The scoring function and

associated docking protocol have been developed to

reproduce experimental binding affinities for a set of 198

complexes and to yield significant database enrichments

against targets of pharmaceutical interest [30].

Knowledge-based scoring functions

Knowledge-based functions are derived from a statistical

analysis of the occurrence of atom-atom interactions in

known structures. The frequency of this occurrence can be

converted into a free energy term using a Boltzmann

distribution. Employing this method, the DrugScore

program was used to calculate potentials derived from

protein-ligand complexes [31]. A new variant of this

program, DrugScoreCSD, has been developed based on data

generated from the crystal packing of small organic

molecules available through the Cambridge Structural

Database [32]. The highly resolved small molecule structures

provide relevant contact data across a better balanced

distribution of atom types found in drug-like molecules, and

produce potentials of superior statistical significance. The

original DrugScore and another commonly used knowledge-

based function, potentials of mean force (PMF), were

compared with a novel atom-atom potential derived from a

database of protein-ligand complexes – the Astex statistical

potential (ASP) [33]. ASP was used to construct a targeted

scoring function for cyclin-dependent kinase-2 that

produced significantly improved enrichment rates

compared with the original ASP function. As might be

expected, it was demonstrated that the more structures used

in the construction of the targeted function the better the

overall results.
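
The conversion of contact frequencies into an energy-like term, described at the start of this section, is an inverse Boltzmann relation: distances seen more often than expected receive a favorable (negative) value. A minimal sketch for one atom-type pair follows. The histogram counts, the uniform reference state and the pseudocount are all assumptions made for illustration, and the function does not reproduce DrugScore, PMF or ASP.

```python
import math

KT = 0.593  # kcal/mol at roughly 298 K


def statistical_potential(observed_counts, reference_counts, pseudocount=1.0):
    """Per-bin potential w = -kT ln(p_obs / p_ref) for one atom-type pair."""
    n_bins = len(observed_counts)
    obs_total = sum(observed_counts) + pseudocount * n_bins
    ref_total = sum(reference_counts) + pseudocount * n_bins
    potential = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = (obs + pseudocount) / obs_total  # pseudocount avoids log(0)
        p_ref = (ref + pseudocount) / ref_total
        potential.append(-KT * math.log(p_obs / p_ref))
    return potential


# Hypothetical counts in four distance bins (short to long range).
observed = [0, 30, 50, 20]
reference = [25, 25, 25, 25]
w = statistical_potential(observed, reference)
print([round(x, 2) for x in w])
```

The dependence on counts also illustrates the point made above about statistical significance: sparsely populated atom-type pairs give noisy potentials, so larger or better-balanced structure sets yield more reliable terms.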

Consensus scoring

Consensus scoring methods can also be used to improve the

overall performance in docking methods [34,35]. Consensus

scoring methods have been applied to docking results by

generating a set of poses for each ligand using a particular

docking program, and then using multiple functions to re-

score the poses. Research using these methods is aimed at

establishing the performance of consensus scoring

techniques in various scenarios and determining the best

approach for combining scores [36,37]. Research on

consensus scoring in ligand-based screening has indicated

that much of the improvement from combining scores comes

from the fact that different methods have different

systematic errors [38]. Therefore, additional improvement

may be obtained by repeating the entire docking process

using multiple methods rather than simply re-scoring. The

GFscore method described by Betzi et al uses a non-linear

neural network to combine five scoring functions [39].

However, such methods require additional training sets or

computational expense, and some form of the original sum

rank method still appears to be the most practical solution

for real-life problems.
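
The sum rank approach referred to above is simple to state: rank the ligands separately under each scoring function, add the ranks, and sort by the total. A minimal sketch follows, assuming that lower scores are better for every function and ignoring tied scores. The function names, ligand identifiers and scores are hypothetical.

```python
def rank_sum_consensus(score_table):
    """score_table: {function_name: {ligand_id: score}}; lower score is better.

    Returns the ligand ids ordered by their summed rank (best first).
    """
    totals = {}
    for scores in score_table.values():
        ordered = sorted(scores, key=scores.get)
        for rank, ligand in enumerate(ordered, start=1):
            totals[ligand] = totals.get(ligand, 0) + rank
    return sorted(totals, key=totals.get)


# Hypothetical scores from three functions on different numerical scales.
scores = {
    "func_a": {"lig1": -9.1, "lig2": -7.0, "lig3": -8.0},
    "func_b": {"lig1": -35.0, "lig2": -50.0, "lig3": -40.0},
    "func_c": {"lig1": -6.0, "lig2": -5.5, "lig3": -6.5},
}
consensus = rank_sum_consensus(scores)
print(consensus)
```

Working with ranks rather than raw scores sidesteps the differing scales of the individual functions and needs no training set, which is consistent with the practicality argument made above.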

Lead optimization versus lead generation

High-throughput methods

High-throughput docking methods are routinely used both

for lead generation and lead optimization. For lead

generation it is necessary to screen large databases of up to

millions of compounds in a matter of days. While such

speeds are not required for lead optimization, it would

certainly be desirable if accurate results could be obtained in

seconds to assist the collaborative decision making between

modelers and chemists in designing molecules. Higher

throughput can be obtained by increasing computational

power, by using heuristics or scoring schemes that reduce

the time required to dock individual compounds, and by

reducing the number of compounds to be docked by

filtering (by, for example, pre-screening based on similarity

[40] or physical properties).

The docking process itself can be made faster through

reduced sampling by limiting the number of initial poses or

optimization steps or by restricting the poses based on

information known about the target. It is crucial to account for

ligand flexibility to screen a molecular database accurately,

and docking methods accomplish this either by flexing the

ligands during the docking stage or by pre-computing

conformers for each compound. Although the former method

requires much less computer disk space, the latter is

particularly useful when the same set of compounds (for

example, a corporate database) is screened against multiple

targets. The PhDock approach [41,42], for example,

significantly improves process speed by docking ensembles of

pre-computed conformers based on the largest 3-D

pharmacophore of each conformer and matching only ligand

pharmacophore points to target-derived pharmacophore

points. Further speedup is obtained by applying a tiered-

scoring approach, whereby the DOCK Contact score is used

as a fast filter of the entire database, followed by re-scoring of

poses for the top ranked molecules with a more physically

realistic function. Lorber et al have recently published a

method of pre-organizing similar conformer databases in a

hierarchical manner in order to consistently represent the

conformers and, furthermore, recognize and omit

incompatible conformations quickly without requiring a full

docking run [43]. In addition, with conformationally

expanded databases, docking speed can be increased by

limiting the number of starting ligand conformations, and a

number of groups have investigated how this affects the

accuracy of the docking results [44-46]. In contrast, DOCK6.1

is just one example of a program that can be used to account

for ligand flexibility ad hoc during the docking process [47].

The Glide program also flexes the ligands as the docking

proceeds, and can be used with a tiered scoring approach in

a triage process [48,49].
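
The tiered-scoring idea described in this section (a fast filter over the whole database, followed by re-scoring of only the top-ranked molecules with a slower function) can be sketched in a few lines. The two scoring functions below are toy stand-ins, not the DOCK Contact score or any real rescoring function, and the 10% cutoff is an arbitrary assumption.

```python
def tiered_screen(ligands, fast_score, slow_score, keep_fraction=0.1):
    """Rank everything with fast_score, then re-rank the best fraction with slow_score.

    Both scoring callables are assumed to return lower values for better ligands.
    """
    prelim = sorted(ligands, key=fast_score)
    n_keep = max(1, int(len(prelim) * keep_fraction))
    return sorted(prelim[:n_keep], key=slow_score)


# Toy example: ligands are integers and the "scores" are arbitrary functions.
expensive_calls = []


def cheap_score(x):
    return abs(x - 50)


def expensive_score(x):
    expensive_calls.append(x)  # track how often the costly function runs
    return (x - 48) ** 2


final = tiered_screen(range(100), cheap_score, expensive_score)
print(final[0], len(expensive_calls))
```

The expensive function is evaluated on only a tenth of the library here. The trade-off is that any ligand discarded by the fast filter can never be rescued by the better function, so the first tier has to be permissive enough to limit false negatives.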

Lower- to medium-throughput methods

Lower- to medium-throughput methods attempt to account

for solvation or protein flexibility in some way. Accounting

for solvation can be accomplished in a post-docking step by

saving multiple poses per ligand from a docking run and

then re-scoring the poses using a MM-GBSA or MM-PBSA

function. Verdonk et al have implemented a novel approach

in the GOLD program that allows explicit water molecules,

in particular those known to be crucial for binding, to switch

on and off and rotate during the docking process [50]. They

have used the approach, which applies a constant penalty to

the score to reward water displacement, to correctly predict

water-mediated protein-ligand complexes and water

displacements.

Protein flexibility is in some ways more difficult than

solvation to address, and many research groups are

attempting to include at least some limited protein flexibility

in the docking process. Erickson et al examined the effect of

docking to the corresponding protein structure of a protein-

ligand complex, an average protein structure, and an apo

protein structure [51]. It was discovered, as predicted, that

docking accuracy falls off dramatically if one uses an

average protein or apo protein structure. Programs such as

Glide dock to a rigid protein but attempt, in the simplest

way, to account for some protein flexibility by using a

reduced van der Waals (vdw) potential. Sherman et al

developed an induced-fit docking (IFD) protocol [52••], in

which multiple receptor conformations are generated using

homology modeling software (PRIME) as starting points for

rigid receptor docking. In the PRIME approach, the protein

backbone is only minimized and not sampled extensively as

the side chain positions are. For the 21 test systems studied,

the IFD protocol significantly improved the binding

predictions compared with those obtained when using a

rigid body model, and for 18 of the 21 cases the RMSD was

less than or equal to 1.8 Å. The induced-fit docking

approach described by Mizutani et al also uses a reduced

vdw potential grid for the initial docking phase, followed by

minimization of the protein-ligand complex [53].

Another approach for addressing protein flexibility that

continues to be extensively explored is docking to ensembles

of protein conformations (either structures or models). Ferrari

et al compared the use of a reduced vdw potential to docking

to an ensemble of protein structures [54]. They reported that

while the reduced vdw potential was always better than the

full vdw potential at identifying known ligands when

docking to a single protein structure, the reduced vdw

potential was worse than the full potential when docking to

multiple protein structures for a given target. The FlexE

approach creates an ensemble by superimposing the

structures for a given target, merging the similar parts, and

explicitly taking into account and allowing combinatorial

recombination of the varying parts of the protein. In the

original paper describing FlexE, the approach was validated

on a set of ten protein ensembles [55]. However, Polgár and

Keserü more recently compared the use of FlexE with FlexX

and FlexX-Pharm. They reported that, when docking ligands

to c-Jun N-terminal kinase-3 and β-secretase, the FlexE

approach was not capable of predicting protein loop

movements. Furthermore, even when using FlexE to predict

side chain flexibility it did not outperform FlexX which

maintains a rigid protein [56]. However, they did report that

FlexE was useful for rapidly docking to proteins with

different side chain protonation states. Huang and Zou

presented an alternative ensemble docking algorithm that

allows the scoring and global optimization procedure of their

docking approach to automatically select an optimal protein

structure from an ensemble for each ligand [57]. Furthermore,

Cavasotto et al presented a novel algorithm for generating

alternative protein conformations via normal mode analysis

by perturbing the starting structure along a combination of

relevant modes. The resulting protein conformations were

then used for ensemble docking with the ICM program [58].
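
The idea of choosing one receptor conformation per ligand from an ensemble can be illustrated with a minimal Python sketch. It assumes that every ligand has already been docked to every conformation and that lower scores are more favorable; the ligand names, conformation names and scores are hypothetical. It is not the algorithm of Huang and Zou [57], which makes the selection part of the docking optimization itself rather than a post-processing step.

```python
from typing import Dict, Tuple


def best_receptor_per_ligand(
    scores: Dict[str, Dict[str, float]]
) -> Dict[str, Tuple[str, float]]:
    """For each ligand, return the (conformation, score) pair with the lowest score.

    scores maps ligand name -> {conformation name -> docking score}.
    Lower scores are assumed to be more favorable.
    """
    best = {}
    for ligand, per_conf in scores.items():
        conf = min(per_conf, key=per_conf.get)
        best[ligand] = (conf, per_conf[conf])
    return best


# Hypothetical scores for two ligands docked to three protein conformations.
example_scores = {
    "ligand_A": {"conf_1": -7.2, "conf_2": -8.9, "conf_3": -6.5},
    "ligand_B": {"conf_1": -9.1, "conf_2": -5.0, "conf_3": -8.7},
}
selected = best_receptor_per_ligand(example_scores)
```

In this toy table the two ligands prefer different conformations, which is the situation ensemble docking is meant to capture.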

Finally, protein flexibility is readily incorporated with

Monte Carlo or molecular dynamics simulation approaches

that allow the protein and ligand to relax simultaneously.

Researchers at Wyeth have used rapid Monte Carlo docking

successfully on a number of lead optimization projects, as

described below in the case studies section of this review.
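
Simultaneous relaxation of protein and ligand can be sketched with a toy Metropolis Monte Carlo loop. In the Python fragment below, a single ligand coordinate and a single side chain angle are perturbed together and the move is accepted or rejected with the standard Metropolis criterion. The energy function, step sizes, temperature and starting values are invented for illustration; this is not the rapid Monte Carlo docking protocol used at Wyeth.

```python
import math
import random


def metropolis_accept(delta_e: float, kT: float, rng: random.Random) -> bool:
    """Accept downhill moves always; uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def toy_energy(ligand_x: float, side_chain_chi: float) -> float:
    """Hypothetical coupled energy with its minimum at ligand_x = 1.0, chi = -0.5."""
    u = ligand_x - 1.0
    v = side_chain_chi + 0.5
    return u * u + v * v + 0.5 * u * v


def relax(steps: int = 5000, kT: float = 0.1, seed: int = 0):
    """Perturb ligand and side chain together; return the lowest-energy state seen."""
    rng = random.Random(seed)
    x, chi = 3.0, 2.0  # arbitrary starting point, far from the minimum
    e = toy_energy(x, chi)
    best = (e, x, chi)
    for _ in range(steps):
        new_x = x + rng.uniform(-0.2, 0.2)
        new_chi = chi + rng.uniform(-0.2, 0.2)
        new_e = toy_energy(new_x, new_chi)
        if metropolis_accept(new_e - e, kT, rng):
            x, chi, e = new_x, new_chi, new_e
            if e < best[0]:
                best = (e, x, chi)
    return best


best_energy, best_x, best_chi = relax()
```

Because both degrees of freedom move in the same trial step, the ligand and the side chain settle into a mutually compatible arrangement rather than one being fitted to a frozen copy of the other.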

Combinatorial library enumeration and docking

One of the first steps towards optimizing a series of lead

compounds is to determine which compounds can be easily

synthesized using an existing synthetic route. This step

involves examining a list of commercially available reagents

of a given class, which could be substituted at a relevant

position on the lead to generate potentially more active

compounds. In some cases, depending on whether the

synthesis is convergent or divergent and what the level of

automation is, it may be possible to substitute more than one

position leading to a huge number of synthetically accessible

compounds [59•].
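
The growth in the number of accessible compounds with the number of substituted positions can be made concrete with a short Python sketch. The scaffold name and reagent lists are hypothetical placeholders; a real enumeration would apply reaction chemistry to molecular structures rather than combine labels.

```python
from itertools import product
from math import prod
from typing import Iterator, List, Sequence, Tuple


def library_size(reagents_per_position: Sequence[Sequence[str]]) -> int:
    """Number of products: the product of the reagent counts at each position."""
    return prod(len(reagents) for reagents in reagents_per_position)


def enumerate_library(
    scaffold: str, reagents_per_position: Sequence[Sequence[str]]
) -> Iterator[Tuple[str, ...]]:
    """Yield every (scaffold, R1, R2, ...) combination."""
    for combo in product(*reagents_per_position):
        yield (scaffold,) + combo


# Hypothetical reagent lists for two substitution positions.
r1: List[str] = ["amine_1", "amine_2", "amine_3"]
r2: List[str] = ["acid_1", "acid_2"]

n_products = library_size([r1, r2])
products = list(enumerate_library("core", [r1, r2]))
```

With a few hundred reagents at each of two positions, the same product already runs to tens of thousands of compounds or more, which is why docking speed matters for this step.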

With improvements in docking speed it is often practical to

enumerate large virtual libraries and score every compound.

CombiGlide, from Schrödinger LLC, is a relatively new tool

for the design of focused combinatorial libraries that can

rapidly screen very large numbers of compounds and

eliminate unpromising ones at an early stage. The authors'

group successfully used CombiGlide to optimize a series,

increasing the in vitro activity of a lead compound by 20- to

30-fold in a single step [Feyfant E, unpublished data]. Given

an experimentally determined or a predicted binding mode

for a scaffold in a target-binding site, CombiGlide allows the

user to attach reagents based on desired chemistry. In this

study, the CombiGlide diverse side chain database, which

consists of only 821 reagents, was screened. Substitutions

occurred at only one position on the lead core, and in an

automated process the combinatorial library was fully

enumerated; absorption, distribution, metabolism and

excretion (ADME) properties were calculated using QikProp

[60], and the library was docked. Compounds that scored

well, had desired ADME properties and passed a visual

inspection were then proposed for synthesis.

When using more complicated combinatorial libraries,

docking scores can be combined with other criteria into a

fitness score to select compounds for synthesis that optimize

the overall fitness of the library. In a number of publications,

reagents have been either filtered and pruned, or adapted

recursively to optimize the final library while maintaining

the combinatorial nature of the synthesis (see, for example,

[61] and [62]).
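
A fitness score of this kind can be sketched in a few lines of Python. The example below combines a docking score with a simple penalty on molecular weight and ranks compounds individually. The weights, the 500 Da limit, the compound data and the convention that lower docking scores are better are all assumptions made for illustration. The published methods cited above work at the level of reagents and whole libraries, which this per-compound sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    name: str
    docking_score: float  # lower is better (assumed convention)
    mol_weight: float     # Da


def fitness(c: Compound, mw_limit: float = 500.0, mw_weight: float = 0.01) -> float:
    """Higher is better: negated docking score minus a penalty for MW above mw_limit."""
    penalty = mw_weight * max(0.0, c.mol_weight - mw_limit)
    return -c.docking_score - penalty


def select_top(compounds: List[Compound], k: int) -> List[Compound]:
    """Return the k compounds with the highest fitness."""
    return sorted(compounds, key=fitness, reverse=True)[:k]


# Hypothetical candidates: cpd_1 docks best but is penalized for its size.
candidates = [
    Compound("cpd_1", -9.0, 620.0),
    Compound("cpd_2", -8.5, 450.0),
    Compound("cpd_3", -7.0, 380.0),
]
chosen = [c.name for c in select_top(candidates, 2)]
```

Here the best-docking compound drops to second place once the weight penalty is applied, showing how a combined score can change the selection relative to docking score alone.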

Fragment or 'small lead' optimization

Fragment screening is another, increasingly common

approach for lead generation and optimization. Fragments

are defined as small molecules typically with molecular

weight (MW) of less than 300 Da. Because the size of

chemical space grows rapidly with the size limit of the

molecules considered, covering chemical space is more

manageable with fragments than with drug-sized molecules.

In addition, as mentioned in the introduction, the ligand

efficiency of a fragment may make it more amenable to

optimization than a drug-like molecule. However, the

challenges of optimizing fragments as leads are somewhat

different to those for complete molecules. Furthermore,

while optimization of a fragment or selection of fragments,

with a view to connecting them, may involve high-

throughput docking, the techniques often need to be

modified or augmented with additional computational

approaches (Figure 3). A number of successful examples of

using fragment-based approaches to develop a drug

candidate or a new lead series in a pharmaceutical project

have been reported (see for example [63]).
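
The ligand efficiency argument can be made concrete with a short calculation. The Python sketch below uses the common definition of ligand efficiency as binding free energy per heavy atom, and approximates the free energy as RT ln(IC50). Treating an IC50 as if it were a dissociation constant is an assumption, as are the temperature and the two example molecules, whose potencies and heavy atom counts are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(
    ic50_molar: float, heavy_atoms: int, temperature: float = 300.0
) -> float:
    """Approximate ligand efficiency in kcal/mol per heavy atom.

    Uses delta_G ~ RT ln(IC50), with IC50 standing in for Kd (an assumption).
    """
    delta_g = R_KCAL * temperature * math.log(ic50_molar)
    return -delta_g / heavy_atoms


# Hypothetical fragment: weak (100 uM) but small (12 heavy atoms).
le_fragment = ligand_efficiency(100e-6, 12)

# Hypothetical drug-sized lead: potent (10 nM) but large (35 heavy atoms).
le_lead = ligand_efficiency(10e-9, 35)
```

Under these assumptions the weakly binding fragment has the higher efficiency per atom, which is the sense in which a fragment can be a better starting point for optimization than a larger, more potent molecule.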

Figure 3. A fragment-based screening paradigm.

[Figure not reproduced. Diagram labels: Target binding site + bound fragments; Fragment-based optimization; Connect fragments; Fragment as new scaffold; Library enumeration; Target binding site + lead; Modify existing leads.]

The diagram demonstrates how optimized ligands can be constructed by (A) connecting bound fragments, (B) optimizing existing leads by adding fragments, or (C) selecting a bound fragment as a new ligand scaffold.

Howard et al at Astex Therapeutics Ltd developed a novel

thrombin inhibitor using a fragment-linking approach

[64••]. A thrombin-focused fragment library was designed

by virtual screening of a subset of an in-house library

against several conformations of thrombin. Based on the

molecular docking results, 80 fragments were selected for

screening by X-ray crystallography following Astex's

pyramid screening paradigm [65]. Among the binders

detected by X-ray crystallography screening, the authors

chose three neutral fragments for optimization – two that

bound to thrombin in the S1 pocket (IC50 = 330 µM and

> 1 mM, respectively) and one that bound in the S2-S4

pocket (IC50 = 100 µM). The overlay of the X-ray crystal

structures revealed a clear opportunity to link the S2-S4

binder with the S1 binders. First, a small library of analogs

of the S2-S4 pocket binder was synthesized, and a more

potent compound was selected (IC50 = 12 µM). Then, each of

the S1 pocket binders was linked in turn to the new S2-S4

binder. The resulting compounds showed 50- to 200-fold

increased potency (with IC50 values of 220 nM and

1.4 nM) and selectivity versus trypsin.

Card et al discovered a new family of phosphodiesterase

(PDE) inhibitors using a similar approach [66]. A library of

20,000 compounds with molecular weight ranging from

125 to 350 Da was screened by biophysical assay against

several PDEs, and 316 binders were identified. After

attempting to co-crystallize all the binders with PDE, one of

the low-affinity hits (PDE4D IC50 = 82 µM; MW 168 Da) was

selected as a possible scaffold (a scaffold, as defined by the

authors, is a compound that forms key interactions with the

receptor and is expected to retain its binding mode upon

minor substitution). A small number of close analogs were

tested and the most potent analog (IC50 = 270 nM) was

co-crystallized with the protein confirming the expected

binding mode. A virtual combinatorial library around this

new scaffold was enumerated, according to a synthetic

schema, docked into the binding site, and scored using an

MM-PBSA method. Using this approach, combining

experimental fragment screening with molecular docking to

grow an initial fragment, the authors were ultimately able

to design a relatively low MW (289 Da), high potency

(IC50 = 30 nM) inhibitor.

Case studies

The literature is full of reports of successful cases of

structure-based lead optimization (see for example [67-70])

and, as such, a comprehensive review is impractical here.

Examples from Wyeth include research on the metabolic

disease target phosphotyrosine phosphatase (PTP)1B [71,72],

the inflammation targets tumor necrosis factor-α-converting

enzyme [73] and cytosolic-phospholipase A2α [74], and the

infectious disease target acyl carrier protein synthase [75]. In

the case of PTP1B, structure-guided lead optimization led to

the rapid optimization of a lead compound with an IC50

value of 230 µM, through a number of iterations, resulting in

a 4-nM compound in approximately 9 months [Joseph-

McCarthy D, unpublished data]. The inclusion of some

limited protein flexibility was critical to the success of the

modeling. Side chains in the binding site were allowed to

undergo constrained movement during rapid Monte Carlo

docking [76], and the binding site model including the

choice of flexible residues was refined throughout the

process as new protein-ligand X-ray crystal structures were

determined.

Conclusions

Future advances in the area of structure-based lead

optimization will likely involve the development of better

scoring functions and the inclusion of protein flexibility into

more high-throughput docking approaches. Rigorous

theoretical work and increases in computer power continue,

for example, to enable ever more accurate approximations of

solvation and entropic effects [77-80]. Furthermore,

increasingly accurate ligand charge states are being

determined through the use of quantum mechanics in the

presence of continuum solvent [81], and by directly

optimizing ligand charges using sensitivity analysis [82].

As another example of research that may impact lead

optimization methods, Hao has presented a quantum

mechanics approach for calculating hydrogen bond

strengths for drug molecules [83]. Finally, over the next five

years, fragment screening, both experimental and in silico,

will probably play a greater role in the drug discovery

process. As a result, methods that are more finely tuned for

the docking and scoring of low MW fragment molecules as

very weak inhibitors will almost certainly emerge. Overall, it

is anticipated that more accurate, higher-throughput

molecular docking will continue to play an increasing role in

the lead optimization process.

References

•• of outstanding interest

• of special interest

1. Wunberg T, Hendrix M, Hillisch A, Lobell M, Meier H, Schmeck C, Wild

H, Hinzen B: Improving the hit-to-lead process: Data-driven

assessment of drug-like and lead-like screening hits. Drug Disc

Today (2006) 11(3-4):175-180.

2. Carr RA, Congreve M, Murray CW, Rees DC: Fragment-based lead

discovery: Leads by design. Drug Disc Today (2005) 10(14):987-992.

3. Ciulli A, Williams G, Smith AG, Blundell TL, Abell C: Probing hot spots

at protein-ligand binding sites: A fragment-based approach using

biophysical methods. J Med Chem (2006) 49(16):4992-5000.

4. Abad-Zapatero C, Metz JT: Ligand efficiency indices as guideposts for drug

discovery. Drug Disc Today (2005) 10(7):464-469.

• Reviews the various ligand efficiency definitions and indices, and discusses

their utility.

5. Topf M, Baker ML, Marti-Renom MA, Chiu W, Sali A: Refinement of

protein structures by iterative comparative modeling and cryoEM

density fitting. J Mol Biol (2006) 357(5):1655-1668.

6. Kairys V, Fernandes MX, Gilson MK: Screening drug-like compounds

by docking to homology models: A systematic study. J Chem Inf

Model (2006) 46(1):365-379.

• While in general, docking to homology models can enrich for known actives,

standard measures of similarity between the template and target protein do

not correlate with enrichment rates obtained using a given homology model.

This paper discusses and examines these effects on carboxypeptidase A,

coagulation Factor Xa, peroxisome proliferator-activated receptor

, cyclin

dependent kinase 2 and acetylcholinesterase.

7. Kellenberger E, Muller P, Schalon C, Bret G, Foata N, Rognan D:

sc-PDB: An annotated database of druggable binding sites from

the protein data bank. J Chem Inf Model (2006) 46(2):717-727.

• Annotations for 6415 binding sites include protein name, function, source,

domain and mutations, ligand name and structure.

8. Pieper U, Eswar N, Davis FP, Braberg H, Madhusudhan MS, Rossi A,

Marti-Renom M, Karchin R, Webb BM, Eramian D, Shen MY et al:

MODBASE: A database of annotated comparative protein structure

models and associated resources. Nucleic Acids Res (2006)

34(Database Issue):D291-D295.

9. Berman HM, Burley SK, Chiu W, Sali A, Adzhubei A, Bourne PE, Bryant

SH, Dunbrack RL Jr, Fidelis K, Frank J, Godzik A et al: Outcome of a

workshop on archiving structural models of biological

macromolecules. Structure (2006) 14(8):1211-1217.

10. Perola E: Minimizing false positives in kinase virtual screens.

Proteins (2006) 64(2):422-435.

11. Cummings MD, DesJarlais RL, Gibbs AC, Mohan V, Jaeger EP:

Comparison of automated docking programs as virtual screening

tools. J Med Chem (2005) 48(4):962-976.

12. Chen H, Lyne PD, Giordanetto F, Lovell T, Li J: On evaluating

molecular-docking methods for pose prediction and enrichment

factors. J Chem Inf Model (2006) 46(1):401-415.

• This paper assesses the virtual screening and ligand docking capabilities of

four docking methods (FlexX, GOLD, Glide and ICM). On a target-by-target

basis, ICM performed best for 8 of the 12 targets, Glide for 4 and ROCS for 2.

It is noted that binding site preparation and control parameter settings may

affect the results.

13. Warren GL, Andrews CW, Capelli AM, Clarke B, LaLonde J, Lambert

MH, Lindvall M, Nevins N, Semus SF, Senger S, Tedesco G et al:

A critical assessment of docking programs and scoring functions.

J Med Chem (2006) 49(20):5912-5931.

14. Thomsen R, Christensen MH: MolDock: A new technique for high-

accuracy molecular docking. J Med Chem (2006) 49(11):3315-3321.

15. Park H, Lee J, Lee S: Critical assessment of the automated

AutoDock as a new docking tool for virtual screening. Proteins

(2006) 65(3):549-554.

16. Triballeau N, Acher F, Brabet I, Pin JP, Bertrand HO: Virtual screening

workflow development guided by the "receiver operating

characteristic" curve approach. Application to high-throughput

docking on metabotropic glutamate receptor subtype 4. J Med

Chem (2005) 48(7):2534-2547.

• A good description of the use of ROC curves to guide high-throughput

docking.

17. Jain AN: Virtual screening in lead discovery and optimization. Curr

Opin Drug Discovery Dev (2004) 7(4):396-403.

18. Wang R, Lu Y, Fang X, Wang S: An extensive test of 14 scoring

functions using the PDBbind refined set of 800 protein-ligand

complexes. J Chem Inf Comput Sci (2004) 44(6):2114-2125.

19. Kroemer RT, Vulpetti A, McDonald JJ, Rohrer DC, Trosset JY,

Giordanetto F, Cotesta S, McMartin C, Kihlen M, Stouten PF:

Assessment of docking poses: Interactions-based accuracy

classification (IBAC) versus crystal structure deviations. J Chem Inf

Comput Sci (2004) 44(3):871-881.

20. Verdonk ML, Berdini V, Hartshorn MJ, Mooij WT, Murray CW, Taylor

RD, Watson P: Virtual screening using protein-ligand docking:

Avoiding artificial enrichment. J Chem Inf Comput Sci (2004)

44(3):793-806.

21. Huang N, Shoichet BK, Irwin JJ: Benchmarking sets for molecular

docking. J Med Chem (2006) 49(23):6789-6801.

•• A standard benchmarking set of compounds called the directory of useful

decoys (DUD). In this set, the decoy compounds (those that are either known

or assumed to be inactive against a particular target) are physically similar yet

topologically distinct to the known actives.

22. Seifert MH: Assessing the discriminatory power of scoring

functions for virtual screening. J Chem Inf Model (2006) 46(3):1456-1465.

23. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Ferguson DM,

Spellmeyer DC, Fox T, Caldwell JW, Kollman PA: A 2nd generation

force-field for the simulation of proteins, nucleic-acids, and

organic-molecules. J Am Chem Soc (1995) 117(19):5179-5197.

24. MacKerell AD Jr, Bashford D, Bellott M, Dunbrack RL Jr, Evanseck JD,

Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D et al:

All-atom empirical potential for molecular modeling and dynamics

studies of proteins. J Phys Chem B (1998) 102(18):3586-3616.

25. Huang N, Kalyanaraman C, Irwin JJ, Jacobson MP: Physics-based

scoring of protein-ligand complexes: Enrichment of known

inhibitors in large-scale virtual screening. J Chem Inf Model (2006)

46(1):243-253.

26. Lyne PD, Lamb ML, Saeh JC: Accurate prediction of the relative

potencies of members of a series of kinase inhibitors using

molecular docking and MM-GBSA scoring. J Med Chem (2006)

49(16):4805-4808.

•• An MM-GBSA function is used in a post-docking step to 're-score' docked

poses of each ligand.

27. Weis A, Katebzadeh K, Soderhjelm P, Nilsson I, Ryde U: Ligand

affinities predicted with the MM/PBSA method: Dependence on the

simulation method and the force field. J Med Chem (2006)

49(22):6596-6606.

• This investigation showed there was little dependence on choice of the force

field when using the MM-PBSA method to predict ligand affinities. However,

the mixing of force fields using, for example, one force field for molecular

dynamics simulations and another for the MM-PBSA energy calculations is

not recommended.

28. Aqvist J, Medina C, Samuelsson JE: A new method for predicting

binding affinity in computer-aided drug design. Protein Eng (1994)

7(3):385-391.

29. Stjernschantz E, Marelius J, Medina C, Jacobsson M, Vermeulen NP,

Oostenbrink C: Are automated molecular dynamics simulations and

binding free energy calculations realistic tools in lead

optimization? An evaluation of the linear interaction energy (LIE)

method. J Chem Inf Model (2006) 46(5):1972-1983.

30. Friesner RA, Murphy RB, Repasky MP, Frye LL, Greenwood JR,

Halgren TA, Sanschagrin PC, Mainz DT: Extra precision glide:

Docking and scoring incorporating a model of hydrophobic

enclosure for protein-ligand complexes. J Med Chem (2006)

49(21):6177-6196.

31. Gohlke H, Hendlich M, Klebe G: Knowledge-based scoring function to

predict protein-ligand interactions. J Mol Biol (2000) 295(2):337-356.

32. Velec HF, Gohlke H, Klebe G: DrugScore(CSD)-knowledge-based

scoring function derived from small molecule crystal data with

superior recognition rate of near-native ligand poses and better

affinity prediction. J Med Chem (2005) 48(20):6296-6303.

33. Mooij WT, Verdonk ML: General and targeted statistical potentials

for protein-ligand interactions. Proteins (2005) 61(2):272-287.

34. Charifson PS, Corkery JJ, Murcko MA, Walters WP: Consensus

scoring: A method for obtaining improved hit rates from docking

databases of three-dimensional structures into proteins. J Med

Chem (1999) 42(25):5100-5109.

35. Stahl M, Rarey M: Detailed analysis of scoring functions for virtual

screening. J Med Chem (2001) 44(7):1035-1042.

36. Oda A, Tsuchida K, Takakura T, Yamaotsu N, Hirono S: Comparison

of consensus scoring strategies for evaluating computational

models of protein-ligand complexes. J Chem Inf Model (2006)

46(1):380-391.

37. Yang JM, Chen YF, Shen TW, Kristal BS, Hsu DF: Consensus scoring

criteria for improving enrichment in virtual screening. J Chem Inf

Model (2005) 45(4):1134-1146.

38. Baber JC, Shirley WA, Gao Y, Feher M: The use of consensus

scoring in ligand-based virtual screening. J Chem Inf Model (2006)

46(1):277-288.

39. Betzi S, Suhre K, Chetrit B, Guerlesquin F, Morelli X: GFscore:

A general nonlinear consensus scoring function for high-

throughput docking. J Chem Inf Model (2006) 46(4):1704-1712.

40. Vidal D, Thormann M, Pons M: A novel search engine for virtual

screening of very large databases. J Chem Inf Model (2006)

46(2):836-843.

41. Joseph-McCarthy D, McFadyen IJ, Zou J, Walker G, Alvarez JC:

Pharmacophore-based molecular docking: A practical guide. In:

Virtual Screening in Drug Discovery. Alvarez JC, Shoichet B (Eds), CRC

Press, Boca Raton, FL, USA (2004).

42. Joseph-McCarthy D, Thomas BE 4th, Belmarsh M, Moustakas D,

Alvarez JC: Pharmacophore-based molecular docking to account

for ligand flexibility. Proteins (2003) 51(2):172-188.

43. Lorber DM, Shoichet BK: Hierarchical docking of databases of

multiple ligand conformations. Curr Top Med Chem (2005) 5(8):739-

749.

44. Kirchmair J, Laggner C, Wolber G, Langer T: Comparative analysis of

protein-bound ligand conformations with respect to Catalyst's

conformational space subsampling algorithms. J Chem Inf Model

(2005) 45(2):422-430.

45. Knox AJ, Meegan MJ, Carta G, Lloyd DG: Considerations in

compound database preparation-"hidden" impact on virtual

screening results. J Chem Inf Model (2005) 45(6):1908-1919.

46. Yoon S, Welsh WJ: Identification of a minimal subset of receptor

conformations for improved multiple conformation docking and

two-step scoring. J Chem Inf Comput Sci (2004) 44(1):88-96.

47. Moustakas DT, Lang PT, Pegg S, Pettersen E, Kuntz ID, Brooijmans N,

Rizzo RC: Development and validation of a modular, extensible

docking program: DOCK 5. J Comput Aided Mol Des (2006) 20(10-11):601-619.

48. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT,

Repasky MP, Knoll EH, Shelley M, Perry JK, Shaw DE et al: Glide:

A new approach for rapid, accurate docking and scoring. 1. Method

and assessment of docking accuracy. J Med Chem (2004)

47(7):1739-1749.

49. Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT,

Banks JL: Glide: A new approach for rapid, accurate docking and

scoring. 2. Enrichment factors in database screening. J Med Chem

(2004) 47(7):1750-1759.

50. Verdonk ML, Chessari G, Cole JC, Hartshorn MJ, Murray CW, Nissink

JW, Taylor RD, Taylor R: Modeling water molecules in protein-ligand

docking using GOLD. J Med Chem (2005) 48(20):6504-6515.

51. Erickson JA, Jalaie M, Robertson DH, Lewis RA, Vieth M: Lessons in

molecular recognition: The effects of ligand and protein flexibility

on molecular docking accuracy. J Med Chem (2004) 47(1):45-55.

52. Sherman W, Day T, Jacobson MP, Friesner RA, Farid R: Novel

procedure for modeling ligand/receptor induced fit effects. J Med

Chem (2006) 49(2):534-553.

•• The paper describes a three-step process in which (i) the ligand is first

docked into the original target structure using Glide, and highly flexible

residues mutated to Ala; (ii) the 20 best poses are imported into PRIME, and

with wild-type side chains present the system is minimized; (iii) the 20 new

receptor conformations are used as rigid receptors for a new set of docking

runs.

53. Mizutani MY, Takamatsu Y, Ichinose T, Nakamura K, Itai A: Effective

handling of induced-fit motion in flexible docking. Proteins (2006)

63(4):878-891.

54. Ferrari AM, Wei BQ, Costantino L, Shoichet BK: Soft docking and

multiple receptor conformations in virtual screening. J Med Chem

(2004) 47(12):5076-5084.

55. Claussen H, Buning C, Rarey M, Lengauer T: FlexE: Efficient

molecular docking considering protein structure variations. J Mol

Biol (2001) 308(2):377-395.

56. Polgar T, Keseru GM: Ensemble docking into flexible active sites.

Critical evaluation of FlexE against JNK-3 and β-secretase. J Chem

Inf Model (2006) 46(4):1795-1805.

57. Huang S-Y, Zou X: Ensemble docking of multiple protein structures:

Considering protein structural variations in molecular docking.

Proteins (2007) 66(2):399-421.

58. Cavasotto CN, Kovacs JA, Abagyan RA: Representing receptor

flexibility in ligand docking through relevant normal modes. J Am

Chem Soc (2005) 127(26):9632-9640.

59. Edwards PJ, Allart B, Andrews MJ, Clase JA, Menet C: Expediting

drug discovery: Recent advances in fast medicinal chemistry

optimization of hits and leads. Curr Opin Drug Discovery Dev (2006)

9(4):425-444.

• A review of recent advances in high-throughput synthesis techniques and

their use in optimization.

60. Duffy EM, Jorgensen WL: Prediction of properties from simulations:

Free energies of solvation in hexadecane, octanol, and water. J Am

Chem Soc (2000) 122(12):2878-2888.

61. Le Bailly de Tilleghem C, Beck B, Boulanger B, Govaerts B: A fast

exchange algorithm for designing focused libraries in lead

optimization. J Chem Inf Model (2005) 45(3):758-767.

62. Truchon JF, Bayly CI: GLARE: A new approach for filtering large

reagent lists in combinatorial library design using product

properties. J Chem Inf Model (2006) 46(4):1536-1548.

63. Schneider G, Fechner U: Computer-based de novo design of drug-

like molecules. Nat Rev Drug Disc (2005) 4(8):649-663.

64. Howard N, Abell C, Blakemore W, Chessari G, Congreve M, Howard S,

Jhoti H, Murray CW, Seavers LCA, van Montfort RL: Application of

fragment screening and fragment linking to the discovery of novel

thrombin inhibitors. J Med Chem (2006) 49(4):1346-1355.

•• Although an inhibitor with an IC50 value of 1.4 nM was designed, the

authors note that they did not achieve the optimal ligand efficiency

theoretically possible, and they attribute this to the fact that the optimized

compounds did not fully retain the interactions of the initial fragment.

65. Hartshorn MJ, Murray CW, Cleasby A, Frederickson M, Tickle IJ, Jhoti

H: Fragment-based lead discovery using X-ray crystallography.

J Med Chem (2005) 48(2):403-413.

66. Card GL, Blasdel L, England BP, Zhang C, Suzuki Y, Gillette S, Fong D,

Ibrahim PN, Artis DR, Bollag G, Milburn MV et al: A family of

phosphodiesterase inhibitors discovered by cocrystallography and

scaffold-based drug design. Nat Biotechnol (2005) 23(2):201-207.

67. Ghosh AK, Sridhar PR, Leshchenko S, Hussain AK, Li J, Kovalevsky

AY, Walters DE, Wedekind JE, Grum-Tokars V, Das D, Koh Y et al:

Structure-based design of novel HIV-1 protease inhibitors to

combat drug resistance. J Med Chem (2006) 49(17):5252-5261.

68. Krovat EM, Fruhwirth KH, Langer T: Pharmacophore identification,

in silico screening, and virtual library design for inhibitors of the

human factor Xa. J Chem Inf Model (2005) 45(1):146-159.

69. Lu IL, Huang CF, Peng YH, Lin YT, Hsieh HP, Chen CT, Lien TW, Lee

HJ, Mahindroo N, Prakash E, Yueh A et al: Structure-based drug

design of a novel family of PPARγ partial agonists: Virtual

screening, X-ray crystallography, and in vitro/in vivo biological

activities. J Med Chem (2006) 49(9):2703-2712.

70. Trosset JY, Dalvit C, Knapp S, Fasolini M, Veronesi M, Mantegani S,

Gianellini LM, Catana C, Sundstrom M, Stouten PF, Moll JK: Inhibition

of protein-protein interactions: The discovery of druglike β-catenin

inhibitors by combining virtual and biophysical screening. Proteins

(2006) 64(1):60-67.

71. Moretto AF, Kirincich SJ, Xu WX, Smith MJ, Wan ZK, Wilson DP,

Follows BC, Binnun E, Joseph-McCarthy D, Foreman K, Erbe DV et al:

Bicyclic and tricyclic thiophenes as protein tyrosine phosphatase

1B inhibitors. Bioorg Med Chem (2006) 14(7):2162-2177.

72. Wan ZK, Lee J, Xu W, Erbe DV, Joseph-McCarthy D, Follows BC,

Zhang YL: Monocyclic thiophenes as protein tyrosine phosphatase

1B inhibitors: Capturing interactions with Asp48. Bioorg Med Chem

Lett (2006) 16(18):4941-4945.

73. Condon JS, Joseph-McCarthy D, Levin JI, Lombart H-G, Lovering FE,

Sun L, Wang W, Xu W, Zhang Y: Identification of potent and selective

TACE inhibitors via the S1 pocket. Bioorg Med Chem Lett (2007)

17(1):34-39.

74. Gopalsamy A, Yang H, Ellingboe JW, McKew JC, Tam S, Joseph-

McCarthy D, Zhang W, Shen M, Clark JD: 1,2,4-Oxadiazolidin-3,5-

diones and 1,3,5-triazin-2,4,6-triones as cytosolic phospholipase

A(2)α inhibitors. Bioorg Med Chem Lett (2006) 16(11):2978-2981.

75. Joseph-McCarthy D, Parris K, Huang A, Failli A, Quagliato D, Dushin

EG, Novikova E, Severina E, Tuckman M, Petersen PJ, Dean C et al:

Use of structure-based drug design approaches to obtain novel

anthranilic acid acyl carrier protein synthase inhibitors. J Med

Chem (2005) 48(25):7960-7969.

76. McMartin C, Bohacek RS: QXP: Powerful, rapid computer algorithms

for structure-based drug design. J Comput Aided Mol Des (1997)

11(4):333-344.

77. Michel J, Taylor RD, Essex JW: Efficient generalized born models for

Monte Carlo simulations. J Chem Theory Comput (2006) 2(3):732-739.

78. Lu B, Cheng X, Huang J, McCammon JA: Order N algorithm for

computation of electrostatic interactions in biomolecular systems.

Proc Natl Acad Sci (2006) 103(51):19314-19319.

79. Carlsson J, Aqvist J: Calculations of solute and solvent entropies

from molecular dynamics simulations. Phys Chem Chem Phys

(2006) 8(46):5385-5395.

80. Salaniwal S, Manas ES, Alvarez JC, Unwalla RJ: Critical evaluation of

methods to incorporate entropy loss upon binding in high-

throughput docking. Proteins (2007) 66(2):422-435.

81. Grater F, Schwarzl SM, Dejaegere A, Fischer S, Smith JC:

Protein/ligand binding free energies calculated with quantum

mechanics/molecular mechanics. J Phys Chem B (2005)

109(20):10474-10483.

82. Gilson MK: Sensitivity analysis and charge-optimization for flexible

ligands: Applicability to lead optimization. J Chem Theory Comput

(2006) 2(2):259-270.

83. Hao MH: Theoretical calculation of hydrogen-bonding strength for

drug molecules. J Chem Theory Comput (2006) 2(3):863-872.