Abstract
Digit serial data transmission can be used to advantage in the design of special purpose processors where communication issues dominate and where digit pipelining can maintain high data rates. VLSI signal processing is one such problem domain. We have developed a family of VLSI components that transmit data digit serially and that can be pipelined at the digit level. These components can be used to construct VLSI processors that are especially suited to signal processing applications. One particularly attractive processor of this kind is a structure we call the arithmetic cube. The arithmetic cube can be programmed to compute linear transformations such as convolutions and DFTs, and it has nearest neighbor interconnects, a regular layout, simple control, and a limited number of interconnections. The regular layout and simple control derive naturally from the algorithms on which the processor is based. The nearest neighbor interconnect eliminates long wires. High throughput can be achieved by pipelining the processor at the digit level. The arithmetic cube is programmable in the problem size n: once it is implemented for a certain size N, smaller problems can be solved on the same implementation without a loss in performance. In addition, the architecture extends to larger N in a regular and automatic fashion.
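The idea of digit serial transmission with digit level pipelining can be illustrated by a small software model. The sketch below is not the authors' hardware design. It assumes radix-4 digits sent least significant digit first and an unsigned, non-redundant representation, and the names `to_digits`, `from_digits`, and `serial_add` are hypothetical. The actual components may use a different digit order or number representation. Each adder consumes one digit pair per step and emits one result digit, so a second adder chained behind it can start on the first digit before the first adder has seen the whole operand.

```python
from typing import Iterable, Iterator, List

RADIX = 4  # assumed digit size (2 bits per digit); chosen only for illustration


def to_digits(value: int, width: int, radix: int = RADIX) -> List[int]:
    """Split a non-negative integer into `width` digits, least significant first."""
    digits = []
    for _ in range(width):
        digits.append(value % radix)
        value //= radix
    return digits


def from_digits(digits: Iterable[int], radix: int = RADIX) -> int:
    """Reassemble an integer from a least-significant-first digit stream."""
    total = 0
    for position, digit in enumerate(digits):
        total += digit * radix ** position
    return total


def serial_add(a: Iterable[int], b: Iterable[int], radix: int = RADIX) -> Iterator[int]:
    """Digit serial adder: one digit pair in, one sum digit out per step.

    Only the carry is stored between steps. The carry out of the last digit
    is dropped, so the caller must choose a width large enough for the sum.
    """
    carry = 0
    for digit_a, digit_b in zip(a, b):
        partial = digit_a + digit_b + carry
        yield partial % radix
        carry = partial // radix


# Two adders chained as a pipeline: the second stage pulls each digit from
# the first stage as soon as it is produced (generators are evaluated lazily).
width = 5
x, y, z = 27, 45, 100
stage1 = serial_add(to_digits(x, width), to_digits(y, width))
stage2 = serial_add(stage1, to_digits(z, width))
result = from_digits(stage2)
```

In this model each connection between stages carries one digit per step rather than a full word, which is the communication saving the abstract refers to. The chaining of generators stands in for pipeline registers between components.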
Additional information
This work has been supported in part by the Army Research Office under Contract DAAG29-83-K-0126.
Cite this article
Irwin, M.J., Owens, R.M. Digit pipelined processors. J Supercomput 1, 61–86 (1987). https://doi.org/10.1007/BF00138606