Using Recursive Partitioning Analysis to Evaluate Compound Selection Methods

Young, S. Stanley; Hawkins, Douglas M.

doi:10.1385/1-59259-802-1:317


Part of the book series: Methods in Molecular Biology™ (MIMB, volume 275)


Abstract

Designing and analyzing a screening set for high-throughput screening is complex. We examine three statistical strategies for compound selection: random, clustering, and space-filling. We also examine two types of chemical descriptors: BCUTs and principal components of Dragon Constitutional descriptors. Based on the predictive power of multiple-tree recursive partitioning, we reach the following tentative conclusions. Random designs appear to be as good as clustering and space-filling designs. For analysis, BCUTs appear to be better than principal component scores based on Constitutional descriptors. We also confirm previous results that model-based selection of compounds can lead to improved screening hit rates.
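As a rough illustration of two of the selection strategies named in the abstract, the sketch below draws a random design and a greedy maximin (farthest-point) space-filling design from a synthetic descriptor table and compares how spread out each selection is. The data, the function names, and the minimum-pairwise-distance summary are all assumptions made for illustration. The chapter itself compares designs by the predictive power of multiple-tree recursive partitioning, which this sketch does not attempt.

```python
import math
import random


def random_design(points, k, rng):
    """Pick k compounds uniformly at random (the 'random' strategy)."""
    return rng.sample(range(len(points)), k)


def maximin_design(points, k, start=0):
    """Greedy space-filling pick: repeatedly add the point farthest from the chosen set."""
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return chosen


def min_pairwise_distance(points, idx):
    """Smallest distance between any two selected compounds (larger = more spread out)."""
    return min(math.dist(points[a], points[b])
               for n, a in enumerate(idx) for b in idx[n + 1:])


rng = random.Random(0)
# Hypothetical descriptor table: 500 compounds, 2 made-up descriptor values each.
descriptors = [(rng.random(), rng.random()) for _ in range(500)]

rand_idx = random_design(descriptors, 20, rng)
fill_idx = maximin_design(descriptors, 20)

print("random  :", round(min_pairwise_distance(descriptors, rand_idx), 3))
print("maximin :", round(min_pairwise_distance(descriptors, fill_idx), 3))
```

A larger minimum distance only indicates more even coverage of descriptor space. It says nothing about screening hit rates, which is the question the chapter addresses.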



Author information

Authors and Affiliations

National Institute of Statistical Sciences, Research Triangle Park, North Carolina, USA
S. Stanley Young
School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
Douglas M. Hawkins


Editor information

Editors and Affiliations

Albany Molecular Research Inc., Bothell Research Center, Bothell, WA
Jürgen Bajorath
University of Washington, Seattle, WA
Jürgen Bajorath

Cite this protocol

Young, S.S., Hawkins, D.M. (2004). Using Recursive Partitioning Analysis to Evaluate Compound Selection Methods. In: Bajorath, J. (ed.) Chemoinformatics. Methods in Molecular Biology™, vol 275. Humana Press. https://doi.org/10.1385/1-59259-802-1:317


DOI: https://doi.org/10.1385/1-59259-802-1:317
Publisher Name: Humana Press
Print ISBN: 978-1-58829-261-2
Online ISBN: 978-1-59259-802-1
eBook Packages: Springer Protocols
