Improving completeness and consistency of co-reference annotation standard

Xu, Yang; Farha, Fadi; Wan, Yueliang; Xu, Jiabo; Liu, Hong; Ning, Huansheng

doi:10.1007/s11276-022-03077-8

Published: 09 August 2022

Wireless Networks

Abstract

As the processing power of mobile terminals increases, wireless network applications such as voice assistants can run more context-sensitive tasks on the terminals themselves, reducing both the wireless network bandwidth required and the cost of data storage in the cloud. Co-reference annotation, that is, identifying expressions with the same semantics in context, is one of the critical techniques in these tasks. However, existing co-reference annotation standards have three problems. First, the annotation is incomplete. Second, the types of annotated mentions are inconsistent. Third, there are currently no metrics for measuring completeness or consistency. After analyzing these issues, this paper proposes a new co-reference annotation standard. The new standard annotates more semantics and co-reference relations and adopts only two types of mentions. This paper also presents a performance evaluation corpus and designs three metrics for evaluating the new standard: the completeness of semantic annotation, the completeness of co-reference annotation, and the consistency of mention types. Experiments show that the new standard outperforms all the baseline methods, achieving 0.95 in the completeness of semantic annotation, 0.68 in the completeness of co-reference annotation, and 0.57 in the consistency of mention types.

Data availability

All data generated or analyzed during this study are included in this published article and its supplementary information file.

Acknowledgements

The authors would like to thank the editors and the reviewers who made valuable comments that helped us improve this paper.

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Yang Xu & Huansheng Ning
Faculty of Informatics Engineering, Aleppo University, Aleppo, Syria
Fadi Farha
Research Institute with Run Technologies Company, Ltd., Beijing, China
Yueliang Wan
Beijing Engineering Research Center for Cyberspace Data Analysis and Applications, Beijing, China
Yueliang Wan & Huansheng Ning
School of Information Engineering, Xinjiang Institute of Engineering, Urumqi, China
Jiabo Xu
School of Software Engineering, East China Normal University, Shanghai, China
Hong Liu

Corresponding author

Correspondence to Huansheng Ning.

Ethics declarations

Conflict of interest

The authors declare they have no financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 303 kb)

Supplementary file2 (PDF 287 kb)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, Y., Farha, F., Wan, Y. et al. Improving completeness and consistency of co-reference annotation standard. Wireless Netw (2022). https://doi.org/10.1007/s11276-022-03077-8

Accepted: 11 July 2022
Published: 09 August 2022
DOI: https://doi.org/10.1007/s11276-022-03077-8
