People use search engines to find information that meets their information needs. Information retrieval (IR) is the field concerned primarily with searching for and retrieving information from documents, as well as from search engines, online databases, and the Internet. Genetic algorithms (GAs) are robust and efficient optimization methods, motivated by Darwin’s principles of natural selection and survival of the fittest, that apply to a wide range of search problems. This paper describes the components of information retrieval systems (IRS) and examines how GAs can be applied in the field of IR, specifically their relevance to Internet web search. The proposals surveyed show that GAs have been applied to diverse problems in Internet web search.

1. Introduction

There has been a virtual explosion in the availability of electronic information. The advent of the Internet and the World Wide Web (WWW) has brought far more information than any human being can absorb. The goal of IR systems is to help users organize and store such information and to retrieve useful information when a user submits a query. To address this problem, research communities have implemented diverse techniques such as full-text search, inverted indexes, keyword querying, Boolean querying, knowledge-based methods, neural networks, probabilistic retrieval, genetic algorithms, and machine learning. Increasing numbers of people now use web search engines, which give them access to many kinds of information on the Internet, in order to make better-informed decisions. However, the ability of search engines to return useful and relevant documents is not always satisfactory. Users often need to refine the search query several times and search through large document collections to find relevant information.
According to [1], however, the results returned by a search engine may not be relevant to users’ information needs, and hence users need to modify and reformulate their queries. The focus of IR is the ability to search a document collection for information relevant to an individual user’s needs as expressed in the query. The authors of [2] state that the user is in need of information. Agbele et al. [3] describe access to information as an important benefit that can be achieved in many areas, including socio-economic development, education, and healthcare. In healthcare, for example, access to appropriate information can minimize visits to physicians and period
References
[1]
F. G. Erba, Z. Yu, and L. Ting, “Using explicit measures to quantify the potential for personalizing search,” Research Journal of Information Technology, vol. 3, no. 1, pp. 24–34, 2011.
[2]
R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison Wesley, New York, NY, USA, 1999.
[3]
K. Agbele, H. Nyongesa, and A. Adesina, “ICT and information security perspectives in E-health systems,” Journal of Mobile Communication, vol. 4, pp. 17–22, 2010.
[4]
J. H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, Mich, USA, 1975.
[5]
K. A. DeJong, An Analysis of the Behaviour of a Class of Genetic Adaptive Systems, University of Michigan, 1975.
[6]
D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison Wesley, 1989.
[7]
G. Salton and C. Buckley, “Improving retrieval performance by relevance feedback,” Journal of the American Society for Information Science, vol. 41, no. 4, pp. 288–297, 1990.
[8]
L. M. Schmitt, “Fundamental study, theory of genetic algorithms,” Theoretical Computer Science, vol. 259, no. 1-2, pp. 1–61, 2001.
[9]
K. Milena, “Solving timetabling problems using genetic algorithms,” in Proceedings of the IEEE 27th International Spring Seminar Electronics Technology: Meeting the Challenges of Electronics Technology Progress, vol. 1, pp. 96–98, 2004.
[10]
L. Lin, L. Cao, J. Wang, and C. Zhang, “The applications of genetic algorithms in stock market data mining optimization,” in Proceedings of the Capital Market, CRC, Sydney, Australia, 2000.
[11]
W. Ying and L. Bin, “Job-shop scheduling using genetic algorithm,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pp. 1994–1999, October 1996.
[12]
J. F. Frenzel, “Genetic algorithms, a new breed of optimization,” IEEE Potentials, vol. 12, pp. 21–24, 1993.
[13]
L. Tamine, C. Chrisment, and M. Boughanem, “Multiple query evaluation based on an enhanced genetic algorithm,” Information Processing and Management, vol. 39, no. 2, pp. 215–231, 2003.
[14]
M. Koorangi and K. Zamanifar, “A distributed agent based web search using a genetic algorithm,” International Journal of Computer Science and Network Security, vol. 7, no. 1, pp. 65–76, 2007.
[15]
R. Varadarajan, V. Hristidis, and T. Li, “Beyond single-page web search results,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 3, pp. 411–424, 2008.
[16]
S. Maleki-Dizaji, Evolutionary learning multi-agent based information retrieval systems [Ph.D. thesis], Sheffield Hallam University, 2003.
[17]
J. Cheng, W. Chen, L. Chen, and Y. Ma, “The improvement of genetic algorithm searching performance,” in Proceedings of the 1st International Conference on Machine Learning and Cybernetics, pp. 947–951, Beijing, China, November 2002.
[18]
M. Sinha and S. V. Chande, “Query optimization using genetic algorithms,” Research Journal of Information Technology, vol. 2, no. 3, pp. 139–144, 2010.
[19]
M. H. Marghny and A. F. Ali, “Web mining based on genetic algorithm,” in Proceedings of the AIML 05 Conference, CICC, Cairo, Egypt, December 2005.
[20]
S. H. Lin, M. C. Chen, J. M. Ho, and Y. M. Huang, “ACIRD: intelligent Internet document organization and retrieval,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 3, pp. 599–614, 2002.
[21]
L. C. Chen, C. J. Luh, and C. Jou, “Generating page clippings from web search results using a dynamically terminated genetic algorithm,” Information Systems, vol. 30, no. 4, pp. 299–316, 2005.
[22]
H. Chen, Y.-M. Chung, M. Ramsey, and C. C. Yang, “An intelligent personal spider (agent) for dynamic Internet/Intranet searching,” Decision Support Systems, vol. 23, no. 1, pp. 41–58, 1998.
[23]
T. P. C. Silva, E. S. de Moura, J. M. B. Cavalcanti, A. S. da Silva, M. G. de Carvalho, and M. A. Gonçalves, “An evolutionary approach for combining different sources of evidence in search engines,” Information Systems, vol. 34, no. 2, pp. 276–289, 2009.
[24]
T. Mitchell, Machine Learning, McGraw-Hill, 1997.
[25]
A. M. Robertson and P. Willett, “Generation of equifrequent groups of words using a genetic algorithm,” Journal of Documentation, vol. 50, no. 3, pp. 213–232, 1994.
[26]
M. Gordon, “Probabilistic and genetic algorithms for document retrieval,” Communications of the ACM, vol. 31, no. 10, pp. 1208–1218, 1988.
[27]
W. Fan, M. D. Gordon, and P. Pathak, “Discovery of context-specific ranking functions for effective information retrieval using genetic programming,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 4, pp. 523–527, 2004.
[28]
P. Pathak, M. Gordon, and W. Fan, “Effective information retrieval using genetic algorithms based matching functions adaptation,” in Proceedings of the 33rd Annual Hawaii International Conference on System Sciences (HICSS '00), January 2000.
[29]
W. Fan, M. D. Gordon, and P. Pathak, “Personalization of search engine services for effective retrieval and knowledge management,” in Proceedings of the International Conference on Information Systems (ICIS '00), Brisbane, Australia, 2000.
[30]
F. Eissa and H. Alghamdi, “Agent based information retrieval system,” in Proceedings of the International Conference Proceedings, pp. 265–279, 2005.
[31]
M. S. Vallim and J. M. A. Coello, “An agent for web information dissemination based on a genetic algorithm,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 4, no. 5–8, pp. 3834–3836, 2003.
[32]
W. Li, B. Xu, H. Yang, W. C. Chung, and C.-W. Lu, “Application of genetic algorithm in search engine,” in Proceedings of the International Conference on Microelectronic Systems Education (MSE '00), pp. 366–371, IEEE, 2000.
[33]
M. Caramia, G. Felici, and A. Pezzoli, “Improving search results with data mining in a thematic search engine,” Computers and Operations Research, vol. 31, no. 14, pp. 2387–2404, 2004.
[34]
R. L. Cecchini, C. M. Lorenzetti, A. G. Maguitman, and N. B. Brignole, “Using genetic algorithms to evolve a population of topical queries,” Information Processing and Management, vol. 44, no. 6, pp. 1863–1878, 2008.
[35]
K. Abe, T. Taketa, and H. Nunokawa, “An efficient information retrieval method in WWW using genetic algorithms,” ICPP Workshops, pp. 522–527, 1999.
[36]
M. J. Martin-Bautista, H. Larsen, and M. A. Vila, “A fuzzy genetic algorithm approach to an adaptive information retrieval agent,” Journal of the American Society for Information Science, vol. 50, no. 9, pp. 760–771, 1999.
[37]
W. Fan, M. D. Gordon, P. Pathak, W. Xi, and E. A. Fox, “Ranking function optimization for efficient web search by genetic programming, an empirical study,” Department of Computer Science of Virginia Tech, Florida Universities, 2003.
[38]
V. Milutinovic, D. Cvetkovic, and J. Mirkovic, “Genetic search based on multiple mutations,” IEEE Computer, vol. 33, no. 11, pp. 118–119, 2000.
[39]
C. J. van Rijsbergen, Information Retrieval, Butterworths, 2nd edition, 1979.
[40]
M. P. Smith and M. Smith, “The use of genetic programming to build Boolean queries for text retrieval through relevance feedback,” Journal of Information Science, vol. 23, no. 6, pp. 423–431, 1997.
[41]
J. J. Yang and R. R. Korfhage, “Query modification using genetic algorithms in vector space models,” International Journal of Expert Systems, vol. 7, no. 2, pp. 165–191, 1994.