Selection of Negative Examples for Node Label Prediction Through Fuzzy Clustering Techniques

  • Conference paper
Advances in Neural Networks (WIRN 2015)

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 54)

Abstract

Negative examples, which most machine learning methods need in order to infer new predictions, are rarely recorded directly in several real-world databases for classification problems. A variety of heuristics for choosing negative examples have been proposed, ranging from simple under-sampling of non-positive instances to the analysis of class taxonomy structures. Here we propose an efficient strategy for selecting negative examples, designed for Hopfield networks, that exploits the clustering properties of positive instances. The method has been validated on the prediction of protein functions in a model organism.
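As a rough illustration of the general idea of using the clustering of positive instances to choose negatives, the sketch below runs a plain fuzzy c-means on the positive points and then keeps the non-positive points that lie farthest from every positive cluster centre. This is not the procedure of the paper, which is not described in the abstract: the feature-vector representation, the distance-based score, and the names `fuzzy_c_means` and `select_negatives` are all assumptions made for this example, and the Hopfield-network setting is not modelled at all.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centres, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centres are membership-weighted means of the points.
        centres = (W.T @ X) / W.sum(axis=0)[:, None]
        # Point-to-centre distances (epsilon avoids division by zero).
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centres, U

def select_negatives(X_pos, X_unl, c=2, n_neg=3, m=2.0):
    """Hypothetical heuristic (an assumption, not the paper's method):
    pick the non-positive points farthest from all positive cluster centres."""
    centres, _ = fuzzy_c_means(X_pos, c, m=m)
    d = np.linalg.norm(X_unl[:, None, :] - centres[None, :, :], axis=2)
    score = d.min(axis=1)  # distance to the nearest positive centre
    return np.argsort(score)[::-1][:n_neg]

# Toy data: two tight groups of positives, and five non-positive points,
# two of which are far from both groups.
X_pos = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.3],
                  [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
X_unl = np.array([[0.1, 0.1], [5.1, 4.9], [20.0, 20.0],
                  [-15.0, 10.0], [0.2, -0.1]])

print(select_negatives(X_pos, X_unl, c=2, n_neg=2))
```

On this toy data the two distant points should be selected as negatives, while the points sitting inside the positive clusters are left out. A membership-based score, or a graph-based notion of distance for node label prediction, would be equally plausible choices under different assumptions.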

Notes

  1. http://www.genemania.org.

Author information

Corresponding author

Correspondence to Marco Frasca.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Frasca, M., Malchiodi, D. (2016). Selection of Negative Examples for Node Label Prediction Through Fuzzy Clustering Techniques. In: Bassis, S., Esposito, A., Morabito, F., Pasero, E. (eds) Advances in Neural Networks. WIRN 2015. Smart Innovation, Systems and Technologies, vol 54. Springer, Cham. https://doi.org/10.1007/978-3-319-33747-0_7

  • DOI: https://doi.org/10.1007/978-3-319-33747-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-33746-3

  • Online ISBN: 978-3-319-33747-0

  • eBook Packages: Engineering, Engineering (R0)
