Discriminative Feature Adaptation via Conditional Mean Discrepancy for Cross-Domain Text Classification

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12682)


Abstract

This paper addresses Unsupervised Domain Adaptation (UDA) in text classification, which aims to transfer knowledge from a source domain to a different but related target domain. Previous methods learn discriminative features of the target domain from noisy pseudo labels, which inevitably harms the training of a robust model. In this paper, we propose a novel criterion, Conditional Mean Discrepancy (CMD), to learn discriminative features by matching the conditional distributions across domains. CMD embeds the conditional distributions of both the source and target domains into a tensor-product Hilbert space and computes the Hilbert-Schmidt norm of their difference. We shed new light on discriminative feature adaptation: the collective knowledge of discriminative features across domains is naturally discovered by minimizing CMD. We propose Aligned Adaptation Networks (AAN), which learn domain-invariant and discriminative features simultaneously based on Maximum Mean Discrepancy (MMD) and CMD. Meanwhile, to trade off between the marginal and conditional distributions, we further maximize both the MMD and CMD criteria using an adversarial strategy, making the features of AAN more discrepancy-invariant. To the best of our knowledge, this is the first work to explicitly evaluate the shifts in conditional distributions across domains. Experiments on cross-domain text classification demonstrate that AAN achieves better classification accuracy with less convergence time than state-of-the-art deep methods.
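The two criteria in the abstract can be illustrated with a minimal NumPy sketch. This is our illustration, not the authors' implementation: MMD compares kernel mean embeddings of the marginal feature distributions via the standard two-sample estimate, while the CMD-style term compares estimated conditional mean operators C_{Y|X} = C_{YX}(C_{XX} + λI)^{-1}. For clarity we use linear (finite-dimensional) features, so the tensor-product Hilbert-Schmidt norm reduces to a Frobenius norm; the regularizer `lam`, the RBF bandwidth `gamma`, and the use of one-hot (or, for unlabeled target data, soft predicted) labels are all our assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix between rows of A and B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(Xs, Xt, gamma=1.0):
    """Biased empirical estimate of squared MMD between two samples
    (kernel two-sample statistic): matches marginal distributions."""
    return (rbf_kernel(Xs, Xs, gamma).mean()
            + rbf_kernel(Xt, Xt, gamma).mean()
            - 2.0 * rbf_kernel(Xs, Xt, gamma).mean())

def cond_mean_op(X, Y, lam=1e-3):
    """Regularized finite-dimensional estimate of the conditional mean
    operator C_{Y|X} = C_{YX} (C_{XX} + lam I)^{-1}, with linear features."""
    n, d = X.shape
    Cxx = X.T @ X / n
    Cyx = Y.T @ X / n
    return Cyx @ np.linalg.inv(Cxx + lam * np.eye(d))

def cmd(Xs, Ys, Xt, Yt, lam=1e-3):
    """CMD-style discrepancy: Hilbert-Schmidt (here Frobenius) norm of the
    difference of the two conditional mean operators. In UDA, Yt would be
    soft classifier predictions, since true target labels are unseen."""
    diff = cond_mean_op(Xs, Ys, lam) - cond_mean_op(Xt, Yt, lam)
    return np.linalg.norm(diff, ord="fro")
```

Minimizing `mmd2` aligns the marginals, while minimizing `cmd` additionally aligns class-conditional structure; both reduce to zero when source and target samples coincide.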


Notes

  1. http://jmcauley.ucsd.edu/data/amazon/.

  2. https://github.com/huggingface/transformers.


Acknowledgements

This work was supported in part by Fund of the State Key Laboratory of Software Development Environment and in part by the Open Research Fund from Shenzhen Research Institute of Big Data (No. 2019ORF01012).

Corresponding author

Correspondence to Xiaoming Zhang.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, B., Zhang, X., Liu, Y., Chen, L. (2021). Discriminative Feature Adaptation via Conditional Mean Discrepancy for Cross-Domain Text Classification. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol. 12682. Springer, Cham. https://doi.org/10.1007/978-3-030-73197-7_7


  • DOI: https://doi.org/10.1007/978-3-030-73197-7_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73196-0

  • Online ISBN: 978-3-030-73197-7

  • eBook Packages: Computer Science, Computer Science (R0)
