FABERT: A Feature Aggregation BERT-Based Model for Document Reranking

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13029)

Abstract

In document reranking, pre-trained language models such as BERT have been applied successfully owing to their powerful capability of extracting informative features from queries and candidate answers. However, these models tend to produce discriminative features while paying less attention to generalized features, which carry the information shared across query-answer pairs and can also assist question answering. In this paper, we propose a BERT-based model named FABERT, which uses an attention mechanism to integrate discriminative features and the generalized features produced by a gradient reversal layer into a single answer vector for document reranking. Extensive experiments on the MS MARCO passage ranking task and the TREC Robust dataset show that FABERT outperforms baseline methods, including a feature projection method that projects feature vectors onto the space orthogonal to the generalized feature vector in order to eliminate the information they share with it.
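
The abstract names two mechanisms: a gradient reversal layer that pushes one branch of the network toward generalized (task-shared) features, and an attention step that fuses discriminative and generalized features into a single answer vector. The PyTorch sketch below illustrates both, together with the orthogonal feature projection used by the baseline it compares against. The wiring (both branches fed from one pooled BERT vector), the layer sizes, and the single-score attention head are illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; negates and scales gradients in the
    backward pass (the gradient reversal layer of Ganin et al.)."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient trains the upstream branch adversarially,
        # steering it toward features shared across query-answer pairs.
        return -ctx.lambd * grad_output, None


def project_out(f, g, eps=1e-8):
    """Feature projection baseline (Qin et al., 2020): subtract from f its
    component along the generalized vector g, keeping only the part of f
    orthogonal to g."""
    coeff = (f * g).sum(-1, keepdim=True) / ((g * g).sum(-1, keepdim=True) + eps)
    return f - coeff * g


class FeatureAggregator(nn.Module):
    """Hypothetical aggregation head: weight a discriminative and a
    generalized feature vector with attention and score the result."""

    def __init__(self, hidden=768, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.disc_proj = nn.Linear(hidden, hidden)  # discriminative branch
        self.gen_proj = nn.Linear(hidden, hidden)   # generalized branch
        self.attn = nn.Linear(hidden, 1)            # one weight per feature vector
        self.scorer = nn.Linear(hidden, 1)          # final relevance score

    def forward(self, cls_vec):
        # cls_vec: [batch, hidden] pooled BERT output for a query-answer pair.
        disc = self.disc_proj(cls_vec)
        gen = self.gen_proj(GradientReversal.apply(cls_vec, self.lambd))
        feats = torch.stack([disc, gen], dim=1)            # [batch, 2, hidden]
        weights = torch.softmax(self.attn(feats), dim=1)   # [batch, 2, 1]
        answer_vec = (weights * feats).sum(dim=1)          # [batch, hidden]
        return self.scorer(answer_vec).squeeze(-1)         # [batch]


# Toy usage: relevance scores for a batch of eight pooled BERT vectors.
scores = FeatureAggregator()(torch.randn(8, 768))
```

Where the feature projection baseline discards the generalized component entirely (project_out), the aggregation above retains it and lets the attention weights decide how much shared information enters the answer vector, which is the contrast the abstract draws.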

Author information

Correspondence to Tianyong Hao.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhu, X., Wong, L.P., Lee, L.K., Liu, H., Hao, T. (2021). FABERT: A Feature Aggregation BERT-Based Model for Document Reranking. In: Wang, L., Feng, Y., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2021. Lecture Notes in Computer Science, vol 13029. Springer, Cham. https://doi.org/10.1007/978-3-030-88483-3_11

  • DOI: https://doi.org/10.1007/978-3-030-88483-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88482-6

  • Online ISBN: 978-3-030-88483-3

  • eBook Packages: Computer Science, Computer Science (R0)
