Markov Blanket Discovery in Positive-Unlabelled and Semi-supervised Data

Abstract
The importance of Markov blanket discovery algorithms is twofold: they serve as the main building block of constraint-based algorithms for Bayesian network structure learning, and as a technique for deriving the optimal set of features in filter feature selection. Equally, learning from partially labelled data is a crucial and demanding area of machine learning, and extending techniques from fully to partially supervised scenarios is a challenging problem. While many algorithms exist to derive the Markov blanket of a fully supervised target, the partially labelled problem is far more challenging, and principled approaches are lacking in the literature. Our work derives a generalisation of the conditional tests of independence for partially labelled binary target variables, covering the two main partially labelled scenarios: positive-unlabelled and semi-supervised. The result is a significantly deeper understanding of how to control false-negative errors in Markov blanket discovery procedures, and of how unlabelled data can help.
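To make the idea concrete, the following is a minimal, hedged sketch of the kind of test the abstract alludes to, not the authors' exact generalised procedure. In positive-unlabelled data one only observes a surrogate label S (1 = labelled positive, 0 = unlabelled); under a selected-completely-at-random labelling assumption, a dependence detected between a feature X and S also certifies dependence on the true class Y, so a standard G-test on the observed 2x2 table can serve as a conservative independence test. The function name and table layout here are illustrative choices:

```python
import math

def g_test_2x2(table):
    """G-test (log-likelihood ratio test) of independence for a 2x2 table.

    table[i][j] = count of observations with X = i and surrogate label S = j.
    Returns (G, p), where p is the upper-tail probability of a chi-square
    distribution with 1 degree of freedom, computed via erfc so that only
    the standard library is needed.
    """
    row = [sum(r) for r in table]              # marginal counts of X
    col = [sum(c) for c in zip(*table)]        # marginal counts of S
    n = sum(row)                               # total sample size
    g = 0.0
    for i in range(2):
        for j in range(2):
            obs = table[i][j]
            if obs > 0:                        # convention: 0 * ln 0 = 0
                exp = row[i] * col[j] / n      # expected count under independence
                g += 2.0 * obs * math.log(obs / exp)
    p = math.erfc(math.sqrt(g / 2.0))          # chi-square(1) survival function
    return g, p

# Rows: feature X in {0, 1}; columns: S in {unlabelled, labelled positive}.
# A strongly associated table gives a large G and a tiny p-value.
g, p = g_test_2x2([[40, 10], [10, 40]])
```

Because the unlabelled pool mixes positives and negatives, such surrogate-label tests keep their false-positive (type-I) rate but lose statistical power, which is why the paper's focus on controlling false-negative errors matters.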
© 2015 Springer International Publishing Switzerland
Cite this paper
Sechidis, K., Brown, G. (2015). Markov Blanket Discovery in Positive-Unlabelled and Semi-supervised Data. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science(), vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_22
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8