Skip to main content

Race Against the Teens – Benchmarking Mechanized Math on Pre-university Problems

  • Conference paper
  • First Online:
Automated Reasoning (IJCAR 2016)

Abstract

This paper introduces a benchmark problem library for mechanized math technologies including computer algebra and automated theorem proving. The library consists of pre-university math problems taken from exercise problem books, university entrance exams, and the International Mathematical Olympiads. It thus includes problems in various areas of pre-university math and with a variety of difficulty. Unlike other existing benchmark libraries, this one contains problems that are formalized so that they are obtainable as the result of mechanical translation of the original problems expressed in natural language. In other words, the library is designed to support the integration of the technologies of mechanized math and natural language processing towards the goal of end-to-end automatic math problem solving. The paper also presents preliminary experimental results of our prototype reasoning component of an end-to-end system on the library. The library is publicly available through the Internet.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 69.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 89.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The statistics were taken from the official IMO website: https://www.imo-official.org/results_year.aspx.

References

  1. Barrett, C., Stump, A., Tinelli, C.: The Satisfiability Modulo Theories Library (SMT-LIB) (2010). www.SMT-LIB.org

  2. Bos, J.: Wide-coverage semantic analysis with boxer. In: Bos, J., Delmonte, R. (eds.) Semantics in Text Processing, STEP 2008 Conference Proceedings, pp. 277–286. Research in Computational Semantics, College Publications (2008)

    Google Scholar 

  3. Clark, S., Curran, J.R.: Wide-coverage efficient statistical parsing with CCG and log-linear models. Comput. Linguist. 33, 493–552 (2007)

    Article  MATH  Google Scholar 

  4. Dennis, L.A., Gow, J., Schürmann, C.: Challenge problems for inductive theorem provers v1.0. Technical report ULCS-07-004, University of Liverpool, Department of Computer Science (2007)

    Google Scholar 

  5. Grabowski, A., Korni lowicz, A., Naumowicz, A.: Mizar in a nutshell. J. Formalized Reasoning 3(2), 153–245 (2010)

    MathSciNet  MATH  Google Scholar 

  6. Hoos, H.H., Stützle, T.: SATLIB: An Online Resource for Research on SAT. In: Sat2000: Highlights of Satisfiability Research in the Year 2000, pp. 283–292. IOS Press, Amsterdam (2000)

    Google Scholar 

  7. Iwane, H., Matsuzaki, T., Arai, N., Anai, H.: Automated natural language geometry math problem solving by real quantier elimination. In: Proceedings of the 10th International Workshop on Automated Deduction (ADG2014), pp. 75–84 (2014)

    Google Scholar 

  8. Iwane, H., Yanami, H., Anai, H., Yokoyama, K.: An effective implementation of symbolic-numeric cylindrical algebraic decomposition for quantifier elimination. Theor. Comput. Sci. 479, 43–69 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  9. Kwiatkowksi, T., Zettlemoyer, L., Goldwater, S., Steedman, M.: Inducing probabilistic CCG grammars from logical form with higher-order unification. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 1223–1233. Association for Computational Linguistics (2010)

    Google Scholar 

  10. Matsuzaki, T., Iwane, H., Anai, H., Arai, N.H.: The most uncreative examinee: A first step toward wide coverage natural language math problem solving. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, pp. 1098–1104 (2014)

    Google Scholar 

  11. Quaresma, P.: Thousands of geometric problems for geometric theorem provers (TGTP). In: Schreck, P., Narboux, J., Richter-Gebert, J. (eds.) ADG 2010. LNCS, vol. 6877, pp. 169–181. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  12. Sutcliffe, G.: The TPTP problem library and associated infrastructure: the FOF and CNF Parts, v3.5.0. J. Autom. Reasoning 43(4), 337–362 (2009)

    Article  MATH  Google Scholar 

  13. Sutcliffe, G., Benzmüller, C.: Automated reasoning in higher-order logic using the TPTP THF infrastructure. J. Formalized Reasoning 3(1), 1–27 (2010)

    MathSciNet  MATH  Google Scholar 

  14. Sutcliffe, G., Stickel, M., Schulz, S., Urban, J.: Answer extraction for TPTP. http://www.cs.miami.edu/~tptp/TPTP/Proposals/AnswerExtraction.html

  15. Tarski, A.: A Decision Method for Elementary Algebra and Geometry. University of California Press, Berkeley (1951)

    MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Takuya Matsuzaki .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Matsuzaki, T. et al. (2016). Race Against the Teens – Benchmarking Mechanized Math on Pre-university Problems. In: Olivetti, N., Tiwari, A. (eds) Automated Reasoning. IJCAR 2016. Lecture Notes in Computer Science(), vol 9706. Springer, Cham. https://doi.org/10.1007/978-3-319-40229-1_15

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-40229-1_15

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40228-4

  • Online ISBN: 978-3-319-40229-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics