Distributed In-Memory Computing on Binary Memristor-Crossbar for Machine Learning

Advances in Memristors, Memristive Devices and Systems

Part of the book series: Studies in Computational Intelligence (SCI, volume 701)

Abstract

The recently emerged memristor provides not only non-volatile memory storage but also intrinsic computing for matrix-vector multiplication, which makes it well suited to low-power, high-throughput data analytics accelerators that compute in memory. However, existing memristor-crossbar based computing mostly assumes multi-level analog operation, whose results are sensitive to process non-uniformity and which incurs additional overhead from AD-conversion and I/O. In this chapter, we explore a matrix-vector multiplication accelerator on a binary memristor-crossbar with adaptive 1-bit-comparator based parallel conversion. A distributed in-memory computing architecture is also developed, together with its control protocol. Both the memory array and the logic accelerator are implemented on the binary memristor-crossbar, and logic-memory pairs are distributed and coordinated through a control-bus protocol. Experimental results show that, compared to the analog memristor-crossbar, the proposed binary memristor-crossbar achieves significant area saving with better calculation accuracy. Moreover, matrix-vector multiplication in neural-network based machine learning is significantly accelerated, so that both the overall training time and the testing time are reduced. In addition, large energy savings are achieved compared to the traditional CMOS-based out-of-memory computing architecture.
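
The abstract does not spell out the circuit, so the following is only a minimal behavioral sketch of one possible reading of "binary crossbar with 1-bit-comparator based parallel conversion". It assumes that each cell stores a single bit, that a column's output level equals the number of rows where both the input bit and the stored bit are 1, and that a bank of 1-bit comparators with fixed thresholds (0.5, 1.5, ...) turns that level into a thermometer code. All function names and the threshold choice are hypothetical and are not taken from the chapter; the adaptive part of the conversion is not modeled.

```python
def binary_crossbar_column(x_bits, w_bits):
    """Idealized column read-out: count rows where input bit and cell bit are both 1."""
    return sum(x & w for x, w in zip(x_bits, w_bits))


def comparator_bank(level, n_comparators):
    """Assumed conversion: comparator k outputs 1 when the level exceeds k + 0.5.

    The outputs form a thermometer code; the thresholds are illustrative only.
    """
    return [1 if level > k + 0.5 else 0 for k in range(n_comparators)]


def binary_matvec(x_bits, w_columns):
    """Binary matrix-vector product, one crossbar column per output element."""
    results = []
    for column in w_columns:
        level = binary_crossbar_column(x_bits, column)
        thermometer = comparator_bank(level, len(x_bits))
        # Summing the thermometer code recovers the integer dot product.
        results.append(sum(thermometer))
    return results


x = [1, 0, 1, 1]
w_columns = [[1, 1, 0, 1], [0, 1, 1, 0], [1, 1, 1, 1]]
result = binary_matvec(x, w_columns)
print(result)
```

In this idealized model every comparator decides a single bit in parallel, so no multi-level AD-conversion appears anywhere in the path; a real design would also have to account for device variation and wire resistance, which the sketch ignores.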

Author information

Correspondence to Hao Yu.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Yu, H., Ni, L., Huang, H. (2017). Distributed In-Memory Computing on Binary Memristor-Crossbar for Machine Learning. In: Vaidyanathan, S., Volos, C. (eds) Advances in Memristors, Memristive Devices and Systems. Studies in Computational Intelligence, vol 701. Springer, Cham. https://doi.org/10.1007/978-3-319-51724-7_12

  • DOI: https://doi.org/10.1007/978-3-319-51724-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51723-0

  • Online ISBN: 978-3-319-51724-7

  • eBook Packages: Engineering, Engineering (R0)
