全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

Throughput Analysis for a High-Performance FPGA-Accelerated Real-Time Search Application

DOI: 10.1155/2012/507173

Full-Text   Cite this paper   Add to My Lib

Abstract:

We propose an FPGA design for the relevancy computation part of a high-throughput real-time search application. The application matches terms in a stream of documents against a static profile, held in off-chip memory. We present a mathematical analysis of the throughput of the application and apply it to the problem of scaling the Bloom filter used to discard nonmatches. 1. Introduction The focus on real-time search is growing with the increasing adoption and spread and of social networking applications. Real-time search is equally important in other areas such as analysing emails for spam or search web traffic for particular patterns. FPGAs have great potential for speeding up many types of applications and algorithms. By performing a task in a fraction of the time of a conventional processor, large energy savings can be achieved. Therefore, there is a growing interest in the use of FPGA platforms for data centres. Because of the dramatic reduction in the required energy per query, data centres with FPGA search solutions could operate at a fraction of the power of current data centres, eliminating the need for cooling infrastructure altogether. As the cost of cooling is actually the dominant cost in today’s data centres [1], the savings would be considerable. In [2, 3] we presented our initial work on applying FPGAs for acceleration or search algorithms. In this paper, we present a novel design for the scoring part of an FPGA-based high-throughput real-time search application. We present a mathematical analysis of the throughput of the system. This novel analysis is applicable to a much wider class of applications than the one discussed in the paper; any algorithm that performs nondeterministic concurrent accesses to a shared resource can be analysed using the model we present. In particular, the technology presented in this paper can also be used for “traditional,” that is, inverted index based, web search. 2. Design of the Real-Time Search Application Real-time search, in information retrieval parlance called “document filtering,” consists of matching a stream of documents against a fixed set of terms, called the “profile.” Typically, the profile is large and must therefore be stored in external memory. The algorithm implemented on the FPGA can be expressed as follows.(i)A document is modelled as a “bag of words,” that is, a set of pairs , where is the term frequency, that is, number of occurrences of the term in the document ; is the term identifier.(ii)The profile is a set of pairs where the term weight is determined using the “Relevance Based

References

[1]  C. L. Belady, “In the data center, power and cooling costs more than the IT equipment it supports,” Electronics Cooling, vol. 13, no. 1, 2007.
[2]  W. Vanderbauwhede, L. Azzopardi, and M. Moadeli, “FPGA-accelerated information retrieval: High-efficiency document filtering,” in the 19th International Conference on Field Programmable Logic and Applications (FPL '09), pp. 417–422, September 2009.
[3]  L. Azzopardi, W. Vanderbauwhede, and M. Moadeli, “Developing energy efficient filtering systems,” in the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '09), pp. 664–665, July 2009.
[4]  V. Lavrenko and W. Bruce Croft, “Relevance-based language models,” in the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 120–127, September 2001.
[5]  Lemur, “The Lemur toolkit for language modeling and information retrieval,” 2005, http://www.lemurproject.org/.
[6]  V. Kindratenko, R. Wilhelmson, R. Brunner, T. J. Martíez, and W. M. Hwu, “High-performance computing with accelerators,” Computing in Science and Engineering, vol. 12, no. 4, Article ID 5492949, pp. 12–16, 2010.
[7]  GiDEL Ltd, “PROCStar III, Data Book,” September 2009.
[8]  B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.
[9]  Altera Corp, “Stratix III, Device Handbook,” July 2010.
[10]  G. Andrews and K. Eriksson, Integer Partitions, Cambridge University Press, 2004.
[11]  C. Chen and K. Koh, Principles and Techniques in Combinatorics, World Scientific, 1992.
[12]  R. M. Losee, “Term dependence: a basis for Luhn and Zipf models,” Journal of the American Society for Information Science and Technology, vol. 52, no. 12, pp. 1019–1025, 2001.
[13]  M. A. Montemurro, “Beyond the Zipf-Mandelbrot law in quantitative linguistics,” Physica A, vol. 300, no. 3-4, pp. 567–578, 2001.
[14]  W. Press, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, 2007.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133