Distant Supervision for Relation Extraction via Sparse Representation

Zeng, Daojian; Lai, Siwei; Wang, Xuepeng; Liu, Kang; Zhao, Jun; Lv, Xueqiang

doi:10.1007/978-3-319-12277-9_14


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8801)


Abstract

In relation extraction, distant supervision has been proposed to automatically generate large amounts of labeled data. It heuristically aligns a given knowledge base to free text and treats the alignments as labeled data. This procedure is an effective way to obtain training data. However, the heuristic labeling procedure is prone to wrong labels, so the extracted features are noisy and lead to poor extraction performance. In this paper, we exploit sparse representation to address the noisy-feature problem. Given a new test feature vector, we first compute its sparse representation as a linear combination of all the training features. To reduce the influence of noisy features, a noise term is included when finding the sparse solution. Then, the residual with respect to each class is computed. Finally, we classify the test sample by assigning it to the class with the minimal residual. Experimental results demonstrate that the noise term is effective against noisy features and that our approach significantly outperforms the state-of-the-art methods.
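
The classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact objective or solver, so this sketch assumes a lasso-style objective in which the noise term is modeled by appending an identity matrix to the training dictionary, and it solves that objective with plain iterative soft-thresholding (ISTA). The function names, the regularization weight `lam`, and the toy data are all hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_code_with_noise(A, y, lam=0.01, n_iter=2000):
    """Approximately solve min_w 0.5 * ||y - [A, I] w||^2 + lam * ||w||_1 with ISTA.

    A is a (d, n) matrix whose columns are training feature vectors.
    Returns (x, e): coefficients over the training columns and the noise term.
    Assumption: the identity block is one common way to model a noise term.
    """
    d, n = A.shape
    B = np.hstack([A, np.eye(d)])            # extended dictionary [A, I]
    step = 1.0 / np.linalg.norm(B, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(n + d)
    for _ in range(n_iter):
        grad = B.T @ (B @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w[:n], w[n:]


def classify(A, labels, y, lam=0.01):
    """Assign y to the class whose training columns give the smallest residual."""
    x, e = sparse_code_with_noise(A, y, lam)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only coefficients of class c
        residuals[int(c)] = float(np.linalg.norm(y - e - A @ x_c))
    return min(residuals, key=residuals.get), residuals


# Toy data (hypothetical): two classes in a 5-dimensional feature space.
A = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0],
])
A = A / np.linalg.norm(A, axis=0)            # unit-length training columns
labels = np.array([0, 0, 1, 1])

# Test vector: resembles class 0, plus a spurious (noisy) feature in the last dimension.
y = np.array([1.0, 0.05, 0.0, 0.0, 0.3])

predicted, residuals = classify(A, labels, y)
print(predicted, residuals)
```

In this toy setup no training column has a nonzero entry in the last dimension, so only the noise term can account for the spurious feature there; the class residuals are then compared after that noise estimate has been subtracted from the test vector.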



Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China
Daojian Zeng, Siwei Lai, Xuepeng Wang, Kang Liu & Jun Zhao
Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, China
Xueqiang Lv


Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Haidian District, 100084, Beijing, China
Maosong Sun & Yang Liu
Chinese Academy of Sciences, Institute of Automation, 100190, Beijing, China
Jun Zhao


About this paper

Cite this paper

Zeng, D., Lai, S., Wang, X., Liu, K., Zhao, J., Lv, X. (2014). Distant Supervision for Relation Extraction via Sparse Representation. In: Sun, M., Liu, Y., Zhao, J. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2014, CCL 2014. Lecture Notes in Computer Science, vol 8801. Springer, Cham. https://doi.org/10.1007/978-3-319-12277-9_14


DOI: https://doi.org/10.1007/978-3-319-12277-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12276-2
Online ISBN: 978-3-319-12277-9
eBook Packages: Computer Science, Computer Science (R0)
