
Exascale Radio Astronomy: Can We Ride the Technology Wave?

  • Conference paper
Supercomputing (ISC 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8488)


Abstract

The Square Kilometre Array (SKA) will be the most sensitive radio telescope in the world. This unprecedented sensitivity will be achieved by combining and analyzing signals from 262,144 antennas and 350 dishes at a raw data rate of petabits per second. The processing pipeline that turns these signals into useful astronomical data will require exa-operations per second within a very limited power budget. We analyze the compute, memory and bandwidth requirements of the key algorithms used in the SKA. By studying their implementation on existing platforms, we show that most algorithms have properties that map inefficiently onto current hardware, such as a low compute-bandwidth ratio and complex arithmetic. In addition, we estimate the power breakdown on CPUs and GPUs, analyze the cache behavior on CPUs, and discuss possible improvements. This work is complemented by an analysis of supercomputer trends, which demonstrates that current efforts to use commercial off-the-shelf accelerators result in a two to three times smaller improvement in compute capabilities and power efficiency than custom-built machines. We conclude that waiting for new technology to arrive will not give us the instruments currently planned in 2018: one or two orders of magnitude better power efficiency and compute capabilities are required. Novel hardware and system architectures that match the needs and features of this unique project must be developed.
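To make the phrase "low compute-bandwidth ratio and complex arithmetic" concrete, the following is a minimal sketch of how such a ratio can be estimated for a single complex multiply-accumulate, the basic operation in correlation-style kernels. The 8-byte single-precision complex sample, the assumption that both operands are fetched from memory on every operation, and the function names are all illustrative assumptions and are not taken from the paper.

```python
# Illustrative sketch only: the sample width and the naive "no data reuse"
# access pattern are assumptions, not figures from the paper.

# acc += a * conj(b) on complex numbers costs
# 4 real multiplies + 2 real adds (the product) + 2 real adds (the accumulate).
FLOPS_PER_COMPLEX_MAC = 8


def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Return floating-point operations per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def naive_complex_mac_intensity(bytes_per_complex_sample: int = 8) -> float:
    """Intensity of one complex MAC when both operands are loaded from
    memory each time and the accumulator stays in a register."""
    bytes_moved = 2 * bytes_per_complex_sample  # operands a and b
    return arithmetic_intensity(FLOPS_PER_COMPLEX_MAC, bytes_moved)


if __name__ == "__main__":
    # Hypothetical usage: single-precision complex samples (2 x 4 bytes).
    print(naive_complex_mac_intensity())
```

Under these assumptions the kernel performs half a floating-point operation per byte moved, so any real implementation has to rely on reusing operands in registers or caches to raise that ratio.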

This work is conducted in the context of the joint ASTRON and IBM DOME project and is funded by the Netherlands Organization for Scientific Research (NWO), the Dutch Ministry of EL&I, and the Province of Drenthe, The Netherlands.




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Vermij, E., Fiorin, L., Hagleitner, C., Bertels, K. (2014). Exascale Radio Astronomy: Can We Ride the Technology Wave?. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2014. Lecture Notes in Computer Science, vol 8488. Springer, Cham. https://doi.org/10.1007/978-3-319-07518-1_3


  • DOI: https://doi.org/10.1007/978-3-319-07518-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07517-4

  • Online ISBN: 978-3-319-07518-1

  • eBook Packages: Computer Science, Computer Science (R0)
