Introduction

Handbook of Floating-Point Arithmetic

Abstract

Representing and manipulating real numbers efficiently is required in many fields of science, engineering, finance, and more. Since the early years of electronic computing, many different ways of approximating real numbers on computers have been introduced. One can cite (this list is far from being exhaustive): fixed-point arithmetic, logarithmic [337, 585] and semi-logarithmic [444] number systems, intervals [428], continued fractions [349, 622], rational numbers [348] and possibly infinite strings of rational numbers [418], level-index number systems [100, 475], fixed-slash and floating-slash number systems [412], tapered floating-point arithmetic [432, 22], 2-adic numbers [623], and most recently unums and posits [228, 229].

Notes

  1. For legal reasons, financial calculations frequently require special rounding rules that are very tricky to implement if the underlying arithmetic is binary: this is illustrated in [320, Section 2].

  2. If p and β were real numbers, the value of β that would minimize β × p while keeping β^p constant would be e = 2.7182818⋯.

  3. Even if sometimes you need to dive into the compiler documentation to find the right options; see Chapter 6.

  4. http://www.cs.berkeley.edu/~wkahan/.

  5. That conjecture asserts that there are infinitely many pairs of prime numbers that differ by 2.

  6. A detailed analysis of this bug can be found at http://www.lomont.org/Math/Papers/2007/Excel2007/Excel2007Bug.pdf.

  7. See http://www.science20.com/news_articles/what_happens_bridge_when_one_side_uses_mediterranean_sea_level_and_another_north_sea-121600.

  8. Interestingly enough, the decimal value a_0 = 1.71828182845904523536028747135 given in Program 1.2 is less than e − 1, but when rounded to the nearest binary64/double precision floating-point number, it becomes larger than e − 1.
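
The numerical remarks in Notes 2 and 8 can be checked with a short script. The Python sketch below is only an illustration and is not taken from the book: the names `radix_cost`, `A0_TEXT`, `E_MINUS_1`, and `a0_comparison` are hypothetical, and the constant used for e − 1 is a truncated decimal expansion supplied here as an assumption. For Note 2, holding β^p equal to a constant C gives p = ln(C)/ln(β), so β × p = ln(C) · β/ln(β), and only the factor β/ln(β) depends on the radix. For Note 8, the exact decimal value and the exact value of its nearest binary64 number are both compared with e − 1.

```python
import math
from decimal import Decimal

# Note 2 (illustration): with beta**p fixed at C, p = ln(C) / ln(beta),
# so beta * p = ln(C) * beta / ln(beta). Only this factor depends on beta.
def radix_cost(beta):
    """Return beta / ln(beta), the radix-dependent factor of beta * p."""
    return beta / math.log(beta)

# Note 8 (illustration): the decimal literal quoted in the note.
A0_TEXT = "1.71828182845904523536028747135"

# Assumption: e - 1 truncated after 39 decimal places. The gaps being
# tested are far larger than the truncation error of this constant.
E_MINUS_1 = Decimal("1.718281828459045235360287471352662497757")

def a0_comparison():
    """Return (decimal value < e - 1, binary64 rounding > e - 1)."""
    exact = Decimal(A0_TEXT)           # the decimal value itself, no rounding
    rounded = Decimal(float(A0_TEXT))  # exact value of the nearest binary64
    return exact < E_MINUS_1, rounded > E_MINUS_1

if __name__ == "__main__":
    for beta in (2, math.e, 3, 4, 10):
        print(f"beta = {beta:<8.5g} beta/ln(beta) = {radix_cost(beta):.6f}")
    print("a_0 below e - 1, rounded a_0 above e - 1:", a0_comparison())
```

Both comparisons rely on `Decimal` construction from a string or a float being exact, so no additional rounding is introduced by the check itself.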

References

  1. E. Allen, D. Chase, V. Luchangco, J.-W. Maessen, and G. L. Steele, Jr. Object-oriented units of measurement. In 19th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 384–403, 2004.

  2. American National Standards Institute and Institute of Electrical and Electronic Engineers. IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Standard 754–1985, 1985.

  3. American National Standards Institute and Institute of Electrical and Electronic Engineers. IEEE Standard for Radix Independent Floating-Point Arithmetic. ANSI/IEEE Standard 854–1987, 1987.

  4. W. Aspray, A. G. Bromley, M. Campbell-Kelly, P. E. Ceruzzi, and M. R. Williams. Computing Before Computers. Iowa State University Press, Ames, Iowa, 1990. Available at http://ed-thelen.org/comp-hist/CBC.html.

  5. A. Azmi and F. Lombardi. On a tapered floating point system. In 9th IEEE Symposium on Computer Arithmetic (ARITH-9), pages 2–9, September 1989.

  6. D. H. Bailey. Some background on Kanada’s recent pi calculation. Technical report, Lawrence Berkeley National Laboratory, 2003. Available at http://crd.lbl.gov/~dhbailey/dhbpapers/dhb-kanada.pdf.

  7. R. P. Brent. On the precision attainable with various floating-point number systems. IEEE Transactions on Computers, C-22(6):601–607, 1973.

  8. F. Y. Busaba, C. A. Krygowski, W. H. Li, E. M. Schwarz, and S. R. Carlough. The IBM z900 decimal arithmetic unit. In 35th Asilomar Conference on Signals, Systems, and Computers, volume 2, pages 1335–1339, November 2001.

  9. S. Carlough, A. Collura, S. Mueller, and M. Kroener. The IBM zEnterprise-196 decimal floating-point accelerator. In 20th IEEE Symposium on Computer Arithmetic (ARITH-20), pages 139–146, July 2011.

  10. P. E. Ceruzzi. The early computers of Konrad Zuse, 1935 to 1945. Annals of the History of Computing, 3(3):241–262, 1981.

  11. P. E. Ceruzzi. A History of Modern Computing. MIT Press, 2nd edition, 2003.

  12. C. W. Chou, D. B. Hume, T. Rosenband, and D. J. Wineland. Optical clocks and relativity. Science, 329(5999):1630–1633, 2010.

  13. C. W. Clenshaw and F. W. J. Olver. Beyond floating point. Journal of the ACM, 31:319–328, 1985.

  14. W. J. Cody. Static and dynamic numerical characteristics of floating-point arithmetic. IEEE Transactions on Computers, C-22(6):598–601, 1973.

  15. T. Coe and P. T. P. Tang. It takes six ones to reach a flaw. In 12th IEEE Symposium on Computer Arithmetic (ARITH-12), pages 140–146, July 1995.

  16. M. Cornea, C. Anderson, J. Harrison, P. T. P. Tang, E. Schneider, and C. Tsen. A software implementation of the IEEE 754R decimal floating-point arithmetic using the binary encoding format. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 29–37, June 2007.

  17. M. A. Cornea-Hasegan, R. A. Golliver, and P. Markstein. Correctness proofs outline for Newton–Raphson based floating-point divide and square root algorithms. In 14th IEEE Symposium on Computer Arithmetic (ARITH-14), pages 96–105, April 1999.

  18. M. F. Cowlishaw. Decimal floating-point: algorism for computers. In 16th IEEE Symposium on Computer Arithmetic (ARITH-16), pages 104–111, June 2003.

  19. M. F. Cowlishaw, E. M. Schwarz, R. M. Smith, and C. F. Webb. A decimal floating-point specification. In 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pages 147–154, June 2001.

  20. X. Cui, W. Liu, D. Wenwen, and F. Lombardi. A parallel decimal multiplier using hybrid binary coded decimal (BCD) codes. In 23rd IEEE Symposium on Computer Arithmetic (ARITH-23), pages 150–155, July 2016.

  21. A. Cuyt, B. Verdonk, S. Becuwe, and P. Kuterna. A remarkable example of catastrophic cancellation unraveled. Computing, 66:309–320, 2001.

  22. R. Descartes. La Géométrie. Paris, 1637.

  23. A. Edelman. The mathematics of the Pentium division bug. SIAM Review, 39(1):54–67, 1997.

  24. M. A. Erle, M. J. Schulte, and B. J. Hickmann. Decimal floating-point multiplication via carry-save addition. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 46–55, June 2007.

  25. D. Fowler and E. Robson. Square root approximations in old Babylonian mathematics: YBC 7289 in context. Historia Mathematica, 25:366–378, 1998.

  26. G. Gerwig, H. Wetter, E. M. Schwarz, J. Haess, C. A. Krygowski, B. M. Fleischer, and M. Kroener. The IBM eServer z990 floating-point unit. IBM Journal of Research and Development, 48(3.4):311–322, 2004.

  27. J. L. Gustafson. The End of Error: Unum Computing. Chapman & Hall/CRC Computational Science. Taylor & Francis, 2015.

  28. J. L. Gustafson and I. Yonemoto. Beating floating point at its own game: Posit arithmetic. In Supercomputing Frontiers and Innovations, pages 71–86, July 2017.

  29. J. Harrison. A machine-checked theory of floating point arithmetic. In 12th International Conference in Theorem Proving in Higher Order Logics (TPHOLs), volume 1690 of Lecture Notes in Computer Science, pages 113–130, Nice, France, September 1999.

  30. J. Harrison. Formal verification of IA-64 division algorithms. In 13th International Conference on Theorem Proving in Higher Order Logics (TPHOLs), volume 1869 of Lecture Notes in Computer Science, pages 233–251, 2000.

  31. J. Harrison. Floating-point verification using theorem proving. In Formal Methods for Hardware Verification, 6th International School on Formal Methods for the Design of Computer, Communication, and Software Systems, SFM 2006, volume 3965 of Lecture Notes in Computer Science, pages 211–242, Bertinoro, Italy, 2006.

  32. B. Hayes. Third base. American Scientist, 89(6):490–494, 2001.

  33. A. Hirshfeld. Eureka Man, The life and legacy of Archimedes. Walker & Company, 2009.

  34. IEEE Computer Society. IEEE Standard for Floating-Point Arithmetic. IEEE Standard 754-2008, August 2008. Available at http://ieeexplore.ieee.org/servlet/opac?punumber=4610933.

  35. P. Johnstone and F. E. Petry. Rational number approximation in higher radix floating-point systems. Computers & Mathematics with Applications, 25(6):103–108, 1993.

  36. W. Kahan. How futile are mindless assessments of roundoff in floating-point computation? Available at http://http.cs.berkeley.edu/~wkahan/Mindless.pdf, 2004.

  37. N. G. Kingsbury and P. J. W. Rayner. Digital filtering using logarithmic arithmetic. Electronic Letters, 7:56–58, 1971. Reprinted in [583].

  38. D. E. Knuth. The Art of Computer Programming, volume 2. Addison-Wesley, Reading, MA, 3rd edition, 1998.

  39. P. Kornerup and D. W. Matula. Finite-precision rational arithmetic: an arithmetic unit. IEEE Transactions on Computers, C-32:378–388, 1983.

  40. P. Kornerup and D. W. Matula. Finite precision lexicographic continued fraction number systems. In 7th IEEE Symposium on Computer Arithmetic (ARITH-7), 1985. Reprinted in [584].

  41. H. Kuki and W. J. Cody. A statistical study of the accuracy of floating point number systems. Communications of the ACM, 16(4):223–230, 1973.

  42. J. Laskar, P. Robutel, F. Joutel, M. Gastineau, A. C. M. Correia, and B. Levrard. A long term numerical solution for the insolation quantities of the Earth. Astronomy & Astrophysics, 428:261–285, 2004.

  43. C. Lichtenau, S. Carlough, and S. M. Mueller. Quad precision floating point on the IBM z13™. In 23rd IEEE Symposium on Computer Arithmetic (ARITH-23), pages 87–94, 2016.

  44. E. Loh and G. W. Walster. Rump’s example revisited. Reliable Computing, 8(3):245–248, 2002.

  45. D. W. Matula and P. Kornerup. Finite precision rational arithmetic: Slash number systems. IEEE Transactions on Computers, 34(1):3–18, 1985.

  46. V. Ménissier. Arithmétique Exacte. Ph.D. thesis, Université Pierre et Marie Curie, Paris, December 1994. In French.

  47. R. E. Moore. Interval analysis. Prentice Hall, 1966.

  48. R. Morris. Tapered floating point: A new floating-point representation. IEEE Transactions on Computers, 20(12):1578–1579, 1971.

  49. J.-M. Muller. Arithmétique des Ordinateurs. Masson, Paris, 1989. In French.

  50. J.-M. Muller. Algorithmes de division pour microprocesseurs: illustration à l’aide du “bug” du pentium. Technique et Science Informatiques, 14(8), 1995.

  51. J.-M. Muller, A. Scherbyna, and A. Tisserand. Semi-logarithmic number systems. IEEE Transactions on Computers, 47(2):145–151, 1998.

  52. J. Oberg. Why the Mars probe went off course. IEEE Spectrum, 36(12):34–39, 1999.

  53. F. W. J. Olver and P. R. Turner. Implementation of level-index arithmetic using partial table look-up. In 8th IEEE Symposium on Computer Arithmetic (ARITH-8), May 1987.

  54. C. Proust. Masters’ writings and students’ writings: School material in Mesopotamia. In Gueudet, Pepin, and Trouche, editors, Mathematics curriculum material and teacher documentation: from textbooks to shared living resources, pages 161–180. Springer, 2011.

  55. B. Randell. From analytical engine to electronic digital computer: the contributions of Ludgate, Torres, and Bush. IEEE Annals of the History of Computing, 4(4):327–341, 1982.

  56. R. Rojas. The Z1: Architecture and algorithms of Konrad Zuse’s first computer. Technical report, Freie Universität Berlin, June 2014. Available at https://arxiv.org/abs/1406.1886.

  57. R. Rojas, F. Darius, C. Göktekin, and G. Heyne. The reconstruction of Konrad Zuse’s Z3. IEEE Annals of the History of Computing, 27(3):23–32, 2005.

  58. S. M. Rump. Algorithms for verified inclusions: theory and practice. In Reliability in Computing, Perspectives in Computing, pages 109–126, 1988.

  59. C. Severance. IEEE 754: An interview with William Kahan. Computer, 31(3):114–115, 1998.

  60. E. E. Swartzlander and A. G. Alexopoulos. The sign-logarithm number system. IEEE Transactions on Computers, 1975. Reprinted in [583].

  61. A. Vázquez. High-Performance Decimal Floating-Point Units. Ph.D. thesis, Universidade de Santiago de Compostela, 2009.

  62. A. Vázquez, E. Antelo, and P. Montuschi. A new family of high performance parallel decimal multipliers. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 195–204, 2007.

  63. J. E. Vuillemin. Exact real computer arithmetic with continued fractions. IEEE Transactions on Computers, 39(8), 1990.

  64. J. E. Vuillemin. On circuits and numbers. IEEE Transactions on Computers, 43(8):868–879, 1994.

  65. L.-K. Wang and M. J. Schulte. Decimal floating-point division using Newton–Raphson iteration. In Application-Specific Systems, Architectures and Processors, pages 84–95, 2004.

  66. L.-K. Wang, M. J. Schulte, J. D. Thompson, and N. Jairam. Hardware designs for decimal floating-point addition and related operations. IEEE Transactions on Computers, 58(2):322–335, 2009.

  67. W. H. Ware, editor. Soviet computer technology—1959. Communications of the ACM, 3(3):131–166, 1960.

  68. Wikipedia. Slide rule — Wikipedia, The Free Encyclopedia, 2017. [Online; accessed 20-November-2017].

  69. Wikipedia. Square root of 2 — Wikipedia, The Free Encyclopedia, 2017. [Online; accessed 20-November-2017].

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Muller, JM. et al. (2018). Introduction. In: Handbook of Floating-Point Arithmetic. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-76526-6_1
