Abstract

The training phase is time-consuming for structured learning, especially for supertagging tasks. In this paper, we propose an online distributed Passive-Aggressive (PA) algorithm that averages parameters across workers for parallel training, which reduces the training time significantly. We also give a theoretical analysis of its convergence. Experimental results show that our method accelerates the training process significantly with comparable or even better accuracy.
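The core idea described above can be sketched as iterative parameter mixing: each worker runs standard PA updates on its own data shard, and the resulting weight vectors are averaged after every epoch. The sketch below is a minimal illustration, not the paper's implementation: it uses binary classification with a PA-I update rather than the structured-learning setting of the paper, runs the "workers" sequentially, and the function names (`pa_update`, `distributed_pa`) are our own.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One passive-aggressive (PA-I) update for a binary example (x, y)."""
    loss = max(0.0, 1.0 - y * w.dot(x))      # hinge loss on this example
    if loss > 0.0:
        tau = min(C, loss / x.dot(x))        # PA-I step size, capped by C
        w = w + tau * y * x
    return w

def distributed_pa(shards, dim, epochs=5, C=1.0):
    """Parameter-mixing training: each shard trains a local PA model
    starting from the current average, then weights are re-averaged."""
    w = np.zeros(dim)
    for _ in range(epochs):
        local = []
        for shard in shards:                 # in practice: parallel workers
            w_k = w.copy()
            for x, y in shard:
                w_k = pa_update(w_k, x, y, C)
            local.append(w_k)
        w = np.mean(local, axis=0)           # average the parameters
    return w
```

Averaging after each epoch (rather than only once at the end) keeps the workers' models from drifting apart, which is what makes a convergence analysis of this scheme tractable.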




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, J., Qiu, X., Liu, Z., Huang, X. (2013). Online Distributed Passive-Aggressive Algorithm for Structured Learning. In: Sun, M., Zhang, M., Lin, D., Wang, H. (eds.) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL/NLP-NABD 2013. Lecture Notes in Computer Science, vol. 8202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41491-6_12

  • DOI: https://doi.org/10.1007/978-3-642-41491-6_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41490-9

  • Online ISBN: 978-3-642-41491-6

  • eBook Packages: Computer Science (R0)
