OALib Journal期刊
ISSN: 2333-9721
费用：99美元

投递稿件

查看量	下载量

相关文章
更多...

International Journal of Reconfigurable Computing 2012

A Convolve-And-MErge Approach for Exact Computations on High-Performance Reconfigurable Computers

DOI: 10.1155/2012/925864

Esam El-Araby,Ivan Gonzalez,Sergio Lopez-Buedo,Tarek El-Ghazawi

Full-Text Cite this paper Add to My Lib

Abstract:

This work presents an approach for accelerating arbitrary-precision arithmetic on high-performance reconfigurable computers (HPRCs). Although faster and smaller, fixed-precision arithmetic has inherent rounding and overflow problems that can cause errors in scientific or engineering applications. This recurring phenomenon is usually referred to as numerical nonrobustness. Therefore, there is an increasing interest in the paradigm of exact computation, based on arbitrary-precision arithmetic. There are a number of libraries and/or languages supporting this paradigm, for example, the GNU multiprecision (GMP) library. However, the performance of computations is significantly reduced in comparison to that of fixed-precision arithmetic. In order to reduce this performance gap, this paper investigates the acceleration of arbitrary-precision arithmetic on HPRCs. A Convolve-And-MErge approach is proposed, that implements virtual convolution schedules derived from the formal representation of the arbitrary-precision multiplication problem. Additionally, dynamic (nonlinear) pipeline techniques are also exploited in order to achieve speedups ranging from 5x (addition) to 9x (multiplication), while keeping resource usage of the reconfigurable device low, ranging from 11% to 19%. 1. Introduction Present-day computers built around fixed-precision components perform integer and/or floating point arithmetic operations using fixed-width operands, typically 32 and/or 64 bits wide. However, some applications require larger precision arithmetic. For example, operands in public-key cryptography algorithms are typically thousands of bits long. Arbitrary-precision arithmetic is also important for scientific and engineering computations where the roundoff errors arising from fixed-precision arithmetic cause convergence and stability problems. Although many applications can tolerate fixed-precision problems, there is a significant number of other applications, such as finance and banking, in which numerical overflow is intolerable. This recurring phenomenon is usually referred to as numerical nonrobustness [1]. In response to this problem, exact computation, based on exact/arbitrary-precision arithmetic, was first introduced in 1995 by Yap and Dube [2] as an emerging numerical computation paradigm. In arbitrary-precision arithmetic, also known as bignum arithmetic, the size of operands is only limited by the available memory of the host system [3, 4]. Among other fields, arbitrary-precision arithmetic is used, for example, in computational metrology and coordinate measuring

References

[1]	V. Sharma, Complexity analysis of algorithms in algebraic computation, Ph.D. dissertation, Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 2007.
[2]	C. K. Yap and T. Dube, “The exact computation paradigm,” in Computing in Euclidean Geometry, D. Z. Du and F. K. Hwang, Eds., vol. 4 of Lecture Notes Series on Computing, pp. 452–492, World Scientific Press, Singapore, 2nd edition, 1995.
[3]	D. E. Knuth, “The art of computer programming,” in Seminumerical Algorithms, vol. 2, Addison-Wesley, 3rd edition, 1998.
[4]	http://en.wikipedia.org/wiki/Arbitrary_precision_arithmetic.
[5]	C. Li, Exact geometric computation: theory and applications, Ph.D. dissertation, Department of Computer Science, Institute of Mathematical Sciences, New York University, 2001.
[6]	http://bitsavers.org/pdf/ibm/140x/A24-1401-1_1401_System_Summary_Sep64.pdf.
[7]	http://ibm-1401.info/1401-Competition.html#IntroHoneywell200.
[8]	J. Langou, J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra, “Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (Revisiting Iterative Refinement for Linear Systems),” in Proceedings of the ACM/IEEE SC Conference, Tampa, Fla, USA, November 2006.
[9]	J. Hormigo, J. Villalba, and E. L. Zapata, “CORDIC processor for variable-precision interval arithmetic,” Journal of VLSI Signal Processing Systems, vol. 37, no. 1, pp. 21–39, 2004.
[10]	S. Balakrishnan and S. K. Nandy, “Arbitrary precision arithmetic—SIMD style,” in Proceedings of the 11th International Conference on VLSI Design: VLSI for Signal Processing, p. 128, 1998.
[11]	A. Saha and R. Krishnamurthy, “Design and FPGA implementation of efficient integer arithmetic algorithms,” in Proceedings of the IEEE Southeastcon '93, vol. 4, no. 7, April 1993.
[12]	D. M. Chiarulli, W. G. Rudd, and D. A. Buell, “DRAFT—a dynamically reconfigurable processor for integer arithmetic,” in Proceedings of the 7th Symposium on Computer Arithmetic, pp. 309–321, IEEE Computer Society Press, 1989.
[13]	T. El-Ghazawi, E. El-Araby, M. Huang, K. Gaj, V. Kindratenko, and D. Buell, “The promise of high-performance reconfigurable computing,” IEEE Computer, vol. 41, no. 2, pp. 69–76, 2008.
[14]	A. Michalski, D. Buell, and K. Gaj, “High-throughput reconfigurable computing: design and implementation of an idea encryption cryptosystem on the SRC-6e reconfigurable computer,” in Proceedings of the International Conference on Field Programmable Logic and Applications (FPL '05), pp. 681–686, August 2005.
[15]	E. El-Araby, T. El-Ghazawi, J. Le Moigne, and K. Gaj, “Wavelet spectral dimension reduction of hyperspectral imagery on a reconfigurable computer,” in Proceedings of the IEEE International Conference on Field-Programmable Technology (FPT '04), pp. 399–402, Brisbane, Australia, December 2004.
[16]	E. El-Araby, M. Taher, T. El-Ghazawi, and J. Le Moigne, “Prototyping Automatic Cloud Cover Assessment (ACCA) algorithm for remote sensing on-board processing on a reconfigurable computer,” in Proceedings of the IEEE International Conference on Field Programmable Technology (FPT '05), pp. 207–214, Singapore, December 2005.
[17]	SRC Computers, SRC Carte C Programming Environment v2.2 Guide (SRC-007-18), 2006.
[18]	http://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations.
[19]	T. Granlund and P. L. Montgomery, “Division by invariant integers using multiplication,” in Proceedings of the ACM SIGPLAN'94 Conference on Programming Language Design and Implementation (PLDI '94), pp. 61–72, June 1994.
[20]	GMP Manual, GNU MP The GNU Multiple Precision Arithmetic Library, 4.2.1 edition, 2006.
[21]	http://gmplib.org/.
[22]	H. L. Tredennick and T. A. Welch, “High-speed buffering for variable length operands,” Proceedings of the 4th Annual Symposium on Computer Architecture (ISCA '77), vol. 5, no. 7, pp. 205–210, March 1977.
[23]	S. Olariu, M. C. Pinotti, and S. Q. Zheng, “How to sort N items using a sorting network of fixed I/O size,” IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 5, pp. 487–499, 1999.
[24]	H. ElGindy and G. Ferizis, “Mapping basic recursive structures to runtime reconfigurable hardware,” in Proceedings of the FPL, August 2004.
[25]	L. Zhou, G. R. Morris, and V. K. Prasanna, “High-performance reduction circuits using deeply pipelined operators on FPGAs,” IEEE Transactions on Parallel and Distributed Systems, vol. 18, no. 10, pp. 1377–1392, 2007.
[26]	L. Zhuo and V. K. Prasanna, “High-performance and area-efficient reduction circuits on FPGAs,” in Proceedings of the 17th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD '05), pp. 52–59, Rio de Janeiro, Brazil, October 2005.
[27]	K. Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability, McGrawHill, 1993.
[28]	K. Hwang and Z. Xu, Scalable Parallel Computing: Technology, Architecture, Programming, McGrawHill, 1998.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133