Abstract
This paper reviews local and global methods for estimating intrinsic dimensionality. The aim is to show how well these methods can produce a lower-dimensional chemical space with minimal loss of information. We experimented with five estimation techniques, covering both local and global approaches. Extensive experiments indicate that chemical compound datasets can be represented in three-dimensional space. We verified this result by selecting representative molecules and projecting them into 3D space using principal component analysis. The resulting 3D projection preserves the spatial relationships among the molecules. The methodology has potential implications for chemoinformatics problems such as diversity analysis, coverage, and lead compound selection.
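The verification step described above, projecting molecules to 3D with principal component analysis, can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptor matrix is synthetic (a 3-dimensional latent structure embedded in 50 descriptors with small noise), the function names are hypothetical, and the 95% explained-variance threshold is an assumed convention for a simple global estimate, not a value taken from the paper.

```python
import numpy as np


def pca_fit(X):
    """Return the mean, principal axes (as rows), and explained-variance ratios of X."""
    mean = X.mean(axis=0)
    centered = X - mean
    # The right singular vectors of the centered data are the principal axes.
    _, singular_values, axes = np.linalg.svd(centered, full_matrices=False)
    variances = singular_values**2 / (X.shape[0] - 1)
    return mean, axes, variances / variances.sum()


def global_dimension_estimate(ratios, threshold=0.95):
    """Smallest number of components whose cumulative variance reaches threshold.

    The threshold is an assumption made for this sketch.
    """
    cumulative = np.cumsum(ratios)
    return int(np.searchsorted(cumulative, threshold) + 1)


def project(X, mean, axes, n_components=3):
    """Project X onto its first n_components principal axes."""
    return (X - mean) @ axes[:n_components].T


def pairwise_distances(X):
    """Euclidean distances between all pairs of rows (upper triangle only)."""
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs**2).sum(axis=-1))
    return dist[np.triu_indices(X.shape[0], k=1)]


# Synthetic stand-in for a molecular descriptor matrix (assumption):
# 500 "molecules", 50 descriptors, true intrinsic dimension 3.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 50))
descriptors = latent @ mixing + 0.01 * rng.normal(size=(500, 50))

mean, axes, ratios = pca_fit(descriptors)
estimated_dim = global_dimension_estimate(ratios)
coords_3d = project(descriptors, mean, axes)

# Check how well the projection preserves pairwise distances on a subset.
subset = slice(0, 50)
distance_correlation = np.corrcoef(
    pairwise_distances(descriptors[subset]),
    pairwise_distances(coords_3d[subset]),
)[0, 1]

print("estimated dimension:", estimated_dim)
print("distance correlation:", distance_correlation)
```

A distance correlation close to 1 suggests that the 3D projection keeps molecules that are near each other in descriptor space near each other in the reduced space. Real descriptor data would normally need scaling before PCA, which this sketch omits.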
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shukur, M.H., Rani, T.S., Bhavani, S.D., Sastry, G.N., Raju, S.B. (2011). Local and Global Intrinsic Dimensionality Estimation for Better Chemical Space Representation. In: Sombattheera, C., Agarwal, A., Udgata, S.K., Lavangnananda, K. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2011. Lecture Notes in Computer Science(), vol 7080. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25725-4_29
DOI: https://doi.org/10.1007/978-3-642-25725-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25724-7
Online ISBN: 978-3-642-25725-4
eBook Packages: Computer Science, Computer Science (R0)